TopDev
Technical · Intermediate

What is Training?

The process of teaching an AI model by showing it millions or billions of examples and adjusting its internal parameters.

Updated: May 5, 2026 · 2 min read

Training is the process of teaching an AI model to extract patterns from data — you show it millions or billions of examples, and every time it guesses wrong, you nudge its internal parameters slightly, repeating until it gets most things right.
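That guess-and-nudge loop can be sketched in a few lines. This is a minimal illustration with a single made-up parameter `w` learning the rule y = 2x — real LLM training does the same thing with billions of parameters and far fancier optimizers:

```python
# Minimal sketch of the guess-and-nudge loop: learn w so that w * x ≈ y.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, correct answer) examples
w = 0.0    # the single "parameter", starting from a random-ish guess
lr = 0.01  # learning rate: how hard each nudge is

for step in range(1000):
    for x, y in data:
        guess = w * x
        error = guess - y    # how wrong the guess was
        w -= lr * error * x  # nudge the parameter to shrink the error

print(round(w, 3))  # converges to 2.0, the rule hidden in the data
```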

The 3 stages of training an LLM

1. Pre-training

The model reads TRILLIONS of tokens (essentially the entire quality web + books + code).

  • Goal: predict the next token
  • The most expensive part: a few months × thousands of GPUs = $10M-$100M
  • Result: a model that “knows the language” and “general knowledge” but isn’t useful yet
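A toy illustration of the "predict the next token" objective — here "tokens" are words and the "model" is just bigram counts, nothing like a real neural network, but the objective is the same:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# Count which token follows which in the training text
following = defaultdict(Counter)
for cur, nxt in zip(corpus, corpus[1:]):
    following[cur][nxt] += 1

# The "prediction" is the most frequent continuation seen in training
print(following["the"].most_common(1)[0][0])  # → "cat" (follows "the" twice)
```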

2. Supervised Fine-Tuning (SFT)

The model is shown high-quality (prompt → sample answer) pairs.

  • Goal: teach the model to respond like an “assistant”
  • Data: tens to hundreds of thousands of human-written pairs
  • Cost: $100k-$1M
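What one such pair might look like once flattened into the single text string the model actually trains on. The `<|user|>`/`<|assistant|>` tags below are made up for illustration; real chat templates (ChatML, Llama's format, etc.) differ in the details:

```python
# One hypothetical SFT example: a (prompt → sample answer) pair
pair = {
    "prompt": "How do I reverse a list in Python?",
    "answer": "Use my_list[::-1] for a reversed copy, or my_list.reverse() in place.",
}

# Flatten it with a (made-up) chat template into raw training text
training_text = (
    f"<|user|>\n{pair['prompt']}\n"
    f"<|assistant|>\n{pair['answer']}"
)
print(training_text)
```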

3. RLHF (or DPO)

Further refinement using human feedback on which answer is better.

  • Goal: teach the model to better match human preferences — helpful, safe, not sycophantic
  • See RLHF for details
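For the DPO variant, the core idea fits in one function. This is a sketch of the loss on a single preference pair, assuming we already have log-probabilities of the chosen and rejected answers under both the model being trained and a frozen reference model (the numbers below are illustrative):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    # Margin: how much more the trained model prefers the human-chosen answer
    # over the rejected one, relative to the frozen reference model.
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1 / (1 + math.exp(-margin)))  # -log(sigmoid(margin))

# Low loss when the model favors the answer humans preferred...
good = dpo_loss(-1.0, -3.0, -2.0, -2.0)
# ...higher loss when it favors the rejected one.
bad = dpo_loss(-3.0, -1.0, -2.0, -2.0)
print(good < bad)  # → True
```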

Parameters

During training, the model learns by adjusting its “weights” — these are the parameters.

  • GPT-2: 1.5 billion parameters
  • GPT-4: ~1.7 trillion (estimated; never officially confirmed)
  • Llama 3.3 70B: 70 billion

More parameters → “smarter,” but more memory and compute required.
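The memory side is easy to estimate: at 16-bit precision each parameter takes 2 bytes, so just holding the weights requires the following (a rough sketch — training needs several times more for gradients, optimizer state, and activations):

```python
# Rough memory needed just to store the weights, assuming 2 bytes per
# parameter (fp16/bf16). Inference quantization can roughly halve this.
def weight_memory_gb(n_params, bytes_per_param=2):
    return n_params * bytes_per_param / 1e9

print(round(weight_memory_gb(1.5e9), 1))  # GPT-2:          3.0 GB
print(round(weight_memory_gb(70e9), 1))   # Llama 3.3 70B: 140.0 GB
```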

What training costs

Resources

  • GPU/TPU cluster: thousands to tens of thousands of chips
  • Data: TB to PB of text
  • Electricity: training GPT-4 is estimated at ~50 GWh (≈ a year's consumption for 5,000 homes)
  • Money: $10M - $1B+ for a frontier model
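A back-of-the-envelope check on the money figure. The cluster size, duration, and GPU-hour price below are all illustrative assumptions, not real figures from any lab:

```python
# Rough pre-training cost: GPUs × wall-clock hours × price per GPU-hour
gpus = 10_000             # assumed cluster size
months = 3
hours = months * 30 * 24  # ≈ 2,160 hours of wall-clock training
price_per_gpu_hour = 2.0  # USD, a rough cloud-rental figure

cost = gpus * hours * price_per_gpu_hour
print(f"${cost / 1e6:.0f}M")  # → $43M, inside the $10M-$1B+ range above
```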

Time

  • Pre-training: 2-6 months
  • Fine-tuning: 1-4 weeks
  • RLHF: 2-8 weeks

→ This is why only a handful of companies (OpenAI, Anthropic, Google, Meta, xAI) can train frontier models.

Do you need to train your own model?

99% of the time, NO. Reasons:

  • Far too expensive
  • Requires deep expertise
  • Most use cases can be solved with prompting + RAG
  • When customization is needed → fine-tune an existing model (see Fine-tuning)

Only train from scratch if:

  • You’re a big lab with the budget
  • You need a model for a niche industry/language that doesn’t exist yet
  • You require absolute ownership of the model (e.g., military)

Tags

#training #llm