What is Training?
The process of teaching an AI model by showing it millions or billions of examples and adjusting its internal parameters.
Training is the process of teaching an AI model to extract patterns from data — you show it millions or billions of examples, and every time it guesses wrong, you nudge its internal parameters slightly, repeating until it gets most things right.
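The “nudge” is gradient descent. A minimal sketch in plain Python (one weight, made-up data following y = 3x; real models run this same loop over billions of parameters):

```python
# Toy training loop: guess, measure the error, nudge the parameter
# against the gradient, repeat. Made-up data following y = 3x.
data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # (input, target) pairs

w = 0.0    # the model's single "parameter"
lr = 0.01  # learning rate: how big each nudge is

for epoch in range(200):
    for x, y in data:
        pred = w * x          # the model's guess
        error = pred - y      # how wrong the guess was
        grad = 2 * error * x  # gradient of squared error w.r.t. w
        w -= lr * grad        # nudge the parameter slightly

print(w)  # ~3.0: the pattern has been learned
```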
The 3 stages of training an LLM
1. Pre-training
The model reads TRILLIONS of tokens (essentially the entire high-quality public web + books + code).
- Goal: predict the next token (sketched below)
- The most expensive part: a few months × thousands of GPUs = $10-100M
- Result: a model that “knows the language” and has broad general knowledge, but isn’t useful as an assistant yet
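What “predict the next token” looks like as code, sketched with PyTorch (the two-layer `model` is a stand-in for a real transformer):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for a transformer: embedding + linear head over a toy vocabulary.
vocab = 1000
model = nn.Sequential(nn.Embedding(vocab, 32), nn.Linear(32, vocab))

tokens = torch.randint(0, vocab, (1, 64))        # one toy sequence of token IDs
inputs, targets = tokens[:, :-1], tokens[:, 1:]  # labels = input shifted by one

logits = model(inputs)  # (batch, seq_len, vocab)
loss = F.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))
loss.backward()         # gradients nudge every parameter
```

Pre-training is essentially this loop, run over trillions of tokens on thousands of GPUs.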
2. Supervised Fine-Tuning (SFT)
The model is shown high-quality (prompt → sample answer) pairs; one is sketched below.
- Goal: teach the model to respond like an “assistant”
- Data: tens to hundreds of thousands of human-written pairs
- Cost: $100k-$1M
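A sketch of a single SFT example as it might be rendered for training (the chat-template tokens below are hypothetical; every lab defines its own format):

```python
# One human-written (prompt -> answer) pair, rendered into a chat template.
# The <|user|>/<|assistant|>/<|end|> tokens here are illustrative only.
example = {
    "prompt": "Explain what a GPU is in one sentence.",
    "answer": "A GPU is a chip with thousands of small cores built for parallel math.",
}

text = (
    "<|user|>\n" + example["prompt"] + "\n"
    "<|assistant|>\n" + example["answer"] + "<|end|>"
)

# Typical SFT detail: the loss is computed only on the answer tokens,
# so the model learns to write answers rather than to repeat prompts.
```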
3. RLHF (or DPO)
Further refinement using human feedback on which of two answers is better (the DPO variant is sketched below).
- Goal: teach the model to better match human preferences — helpful, safe, not sycophantic
- See RLHF for details
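For intuition, the heart of the DPO variant fits in a few lines (the sequence log-probabilities are assumed precomputed; a real pipeline also keeps a frozen reference model):

```python
import torch
import torch.nn.functional as F

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO: reward the model for preferring the human-chosen answer over the
    rejected one MORE than a frozen reference model does."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -F.logsigmoid(beta * margin).mean()

# Toy log-probs: the policy already leans toward the chosen answer.
loss = dpo_loss(torch.tensor([-4.0]), torch.tensor([-6.0]),
                torch.tensor([-5.0]), torch.tensor([-5.5]))
print(loss)  # ~0.62; driving this down shifts probability toward preferred answers
```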
Parameters
During training, the model learns by adjusting its “weights” — these are the parameters.
- GPT-2: 1.5 billion parameters
- GPT-4: ~1.7 trillion (rumored; OpenAI has never confirmed the figure)
- Llama 3.3 70B: 70 billion
More parameters → “smarter,” but more memory and compute required.
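A quick back-of-envelope for the memory side (assuming fp16/bf16 weights, i.e., 2 bytes per parameter; serving also needs room for activations and KV cache):

```python
# Memory just to HOLD the weights = parameter count x bytes per parameter.
# Training needs several times more (gradients + optimizer state).
models = [("GPT-2", 1.5e9), ("Llama 3.3 70B", 70e9), ("GPT-4 (rumored)", 1.7e12)]

for name, params in models:
    gb = params * 2 / 1e9  # fp16/bf16 weights, 2 bytes each
    print(f"{name}: {gb:,.0f} GB of weights")

# GPT-2:           3 GB     -> fits on a laptop GPU
# Llama 3.3 70B:   140 GB   -> needs multiple datacenter GPUs (e.g., 2x 80 GB H100)
# GPT-4 (rumored): 3,400 GB -> weights alone need dozens of GPUs
```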
What training costs
Resources
- GPU/TPU cluster: thousands to tens of thousands of chips
- Data: TB to PB of text
- Electricity: training GPT-4 is estimated at ~50 GWh (≈ a year’s consumption for ~5,000 homes)
- Money: $10M-$1B+ for a frontier model (see the back-of-envelope estimate below)
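The money line can be sanity-checked with the standard rule of thumb that training compute ≈ 6 × N parameters × D tokens (the GPU throughput, utilization, and hourly price below are assumptions):

```python
# Back-of-envelope pre-training cost via the ~6*N*D FLOPs rule of thumb.
N = 70e9   # parameters (a Llama-70B-class model)
D = 15e12  # training tokens
flops = 6 * N * D  # ~6.3e24 FLOPs total

peak = 1e15  # ~1e15 FLOP/s per H100 in bf16 (rounded)
mfu = 0.35   # realistic hardware utilization (assumption)
gpu_hours = flops / (peak * mfu) / 3600  # ~5 million GPU-hours
cost = gpu_hours * 2.0                   # ~$2 per GPU-hour (assumed rate)

print(f"~{gpu_hours/1e6:.0f}M GPU-hours, ~${cost/1e6:.0f}M")  # ~5M GPU-hours, ~$10M
```

On a 10,000-GPU cluster, 5M GPU-hours is roughly three weeks of wall-clock time; frontier-scale models push every one of these numbers up.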
Time
- Pre-training: 2-6 months
- Fine-tuning: 1-4 weeks
- RLHF: 2-8 weeks
→ This is why only a handful of companies (OpenAI, Anthropic, Google, Meta, xAI) can train frontier models.
Do you need to train your own model?
99% of the time, NO. Reasons:
- Far too expensive
- Requires deep expertise
- Most use cases can be solved with prompting + RAG
- When customization is needed → fine-tune an existing model (see Fine-tuning)
Only train from scratch if:
- You’re a big lab with the budget
- You need a model for a niche industry/language that doesn’t exist yet
- You require full ownership and control of the model (e.g., military applications)
Related
- Inference — running a trained model
- Fine-tuning
- GPU