Technical · Beginner
What is an AI API? How to use LLM APIs
How developers call AI models from code — how the Claude API, OpenAI API, and Gemini API work.
Updated: May 5, 2026 · 3 min read
In the AI context, an API (Application Programming Interface) is how developers call AI models (Claude, GPT, Gemini…) directly from code instead of through a chat UI. It’s the way you build AI features into your own app or website.
Why use the API instead of the web app?
| Web (claude.ai, chatgpt.com) | API |
|---|---|
| For end users | For developers |
| Pay per subscription plan | Pay per token used |
| One user at a time | Thousands of parallel requests |
| Can’t be embedded into an app | Easy to integrate |
If you’re building a chatbot, automation, or analysis tool, you’ll need the API.
API call examples (Python)
Claude
```python
from anthropic import Anthropic

client = Anthropic(api_key="sk-ant-...")
response = client.messages.create(
    model="claude-sonnet-4-7",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Explain RAG to a beginner"}],
)
print(response.content[0].text)
```
OpenAI
```python
from openai import OpenAI

client = OpenAI(api_key="sk-...")
response = client.chat.completions.create(
    model="gpt-5",
    messages=[{"role": "user", "content": "Explain RAG to a beginner"}],
)
print(response.choices[0].message.content)
```
Gemini
```python
from google import genai

client = genai.Client(api_key="...")
response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Explain RAG to a beginner",
)
print(response.text)
```
The syntax differs but the concept is the same: send a message, receive a response.
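To see that shared concept concretely: each SDK call above reduces to a small JSON payload naming a model and carrying the user's content. The sketch below builds those payloads as plain dicts (no request is sent; the layout mirrors the examples above, not the exact wire format of each API):

```python
# Each provider call boils down to: model name + user content.
# These dicts mirror the three SDK examples above; nothing is sent anywhere.
prompt = "Explain RAG to a beginner"

payloads = {
    "anthropic": {
        "model": "claude-sonnet-4-7",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    },
    "openai": {
        "model": "gpt-5",
        "messages": [{"role": "user", "content": prompt}],
    },
    "gemini": {
        "model": "gemini-2.5-pro",
        "contents": prompt,
    },
}

for provider, payload in payloads.items():
    print(provider, "->", payload["model"])
```

Swapping providers is mostly a matter of renaming keys, which is exactly what wrapper libraries like LiteLLM automate.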
API pricing for major providers (2026)
| Provider | Model | Input / 1M tokens | Output / 1M tokens |
|---|---|---|---|
| Anthropic | Claude Sonnet 4.7 | $3 | $15 |
| Anthropic | Claude Haiku 4.5 | $0.80 | $4 |
| Anthropic | Claude Opus 4.5 | $15 | $75 |
| OpenAI | GPT-5 | $2.50 | $10 |
| OpenAI | GPT-5 mini | $0.15 | $0.60 |
| Google | Gemini 2.5 Pro | $1.25 | $5 |
| Google | Gemini 2.5 Flash | $0.10 | $0.40 |
The “small/flash” tier is 10-30× cheaper than the flagship tier — golden rule: use the smallest model that’s good enough.
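A quick way to reason about these prices is a per-call cost estimate: tokens divided by one million, times the price per million, input and output summed. The sketch below hardcodes two rows from the table (prices drift, so check current pricing before relying on them):

```python
# (input $/1M tokens, output $/1M tokens) — figures from the table above.
PRICES = {
    "claude-sonnet-4-7": (3.00, 15.00),
    "claude-haiku-4-5": (0.80, 4.00),
}

def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one call: (tokens / 1M) * price-per-1M, input plus output."""
    inp, out = PRICES[model]
    return input_tokens / 1e6 * inp + output_tokens / 1e6 * out

# Example: a 2,000-token prompt with a 500-token answer.
print(f"{cost_usd('claude-sonnet-4-7', 2000, 500):.4f}")  # 0.0135
print(f"{cost_usd('claude-haiku-4-5', 2000, 500):.4f}")   # 0.0036
```

At these rates a single call costs fractions of a cent; costs only become significant at volume, which is why picking the right tier matters.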
Advanced API features
- Streaming: receive tokens piece by piece (great for chatbot UX)
- Function calling / Tool use: let the LLM call your functions — see Function Calling
- Structured output: force the LLM to return JSON matching a schema
- Vision: send images alongside text
- Caching: cache fixed prompts to cut costs by up to 90%
- Batch API: send 1000 requests at once for a 50% discount
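To make the structured-output idea concrete, here is a minimal validation sketch. The `fake_reply` string stands in for a real API response, and the schema keys are invented for illustration; real providers also offer server-side schema enforcement:

```python
import json

# Ask the model for JSON matching a schema, then validate before using it.
# REQUIRED_KEYS is a hypothetical schema; fake_reply simulates a model reply.
REQUIRED_KEYS = {"title", "summary", "tags"}

def parse_structured(reply: str) -> dict:
    data = json.loads(reply)  # raises ValueError on malformed JSON
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data

fake_reply = '{"title": "RAG", "summary": "Retrieval-augmented generation", "tags": ["llm"]}'
doc = parse_structured(fake_reply)
print(doc["title"])  # RAG
```

Validating (and re-prompting on failure) is the client-side half of structured output even when the provider enforces a schema.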
Practical notes for international developers
- Payment: most providers require an international Visa/Mastercard. In some countries (for example, Vietnam) you may need a virtual card from your bank.
- Rate limits: new accounts start with low limits. Verify your phone number and add credit to move up tiers.
- Latency: an API call from Asia to us-east takes ~150-200ms. Anthropic and OpenAI offer Asia endpoints (Singapore, Tokyo) for higher tiers.
- Compliance: if you handle sensitive customer data, read the provider’s data policy carefully.
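Rate limits in particular are worth handling in code. The standard pattern is retry with exponential backoff; the sketch below simulates a provider that rejects the first two attempts (real SDKs raise their own 429 exception classes, so you would catch those instead of this placeholder):

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for a provider's 429 / rate-limit exception."""

attempts = {"n": 0}

def flaky_call() -> str:
    """Simulated API call that fails twice, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return "ok"

def with_backoff(fn, max_retries=5, base_delay=0.01):
    for attempt in range(max_retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with jitter: 0.01s, 0.02s, 0.04s, ...
            time.sleep(base_delay * 2 ** attempt * (1 + random.random() * 0.1))

print(with_backoff(flaky_call))  # ok
```

In production, use a longer `base_delay` and respect any `Retry-After` header the provider returns.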
Wrappers / SDKs worth using
- LangChain — generalist framework supporting all providers (overkill for simple tasks)
- LlamaIndex — great for RAG
- Vercel AI SDK — best fit for TypeScript web apps
- LiteLLM — proxy across many providers behind one interface
Tags
#api #developer #production