
In a Training Loop 🔄

aayush garg

garg-aayush

https://aayushgarg.dev/

Aayush_ander
garg-aayush
aayush-garg-8b26a734

AI & ML interests

None yet

garg-aayush's collections (4)

LLM Tech Reports

Qwen3 Technical Report

Paper • 2505.09388 • Published May 14, 2025 • 341
Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22, 2025 • 131
Training language models to follow instructions with human feedback

Paper • 2203.02155 • Published Mar 4, 2022 • 24

The Smol Training Playbook 📚

Running on CPU Upgrade • Featured • 3.2k
The secrets to building world-class LLMs

The Ultra-Scale Playbook 🌌

Running • 3.86k
The ultimate guide to training LLM on large GPU Clusters

FineWeb: decanting the web for the finest text data at scale 🍷

Running • Featured • 1.35k
Explore and download the FineWeb web‑scale text dataset

FineVision: Open Data is All You Need 📝

Running • 224
A new open-source dataset for training VLMs

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 11
Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 66
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 145
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 452

Llama papers and reports

List of papers and reports related to llama models

Llama 2: Open Foundation and Fine-Tuned Chat Models

Paper • 2307.09288 • Published Jul 18, 2023 • 252
