What Are Large Language Models?
Part of LLM Fundamentals in the AI Engineering Foundations learning path.

Why LLMs Matter
Large language models are prediction engines trained on enormous text corpora. Their job at inference time is simple to state: given a sequence of tokens, predict which token is most likely to come next.
That sounds narrow, but the emergent behavior is powerful. Summarization, code generation, question answering, extraction, translation, and reasoning-like output all come from repeating that next-token prediction loop under different constraints.
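
To make the prediction loop concrete, here is a minimal TypeScript sketch. The `toyModel` table, its probabilities, and the `<start>` and `<end>` markers are invented for illustration. A real LLM computes next-token probabilities from the entire preceding sequence with a neural network, not from a lookup on the previous token alone.

```typescript
// Toy stand-in for a model: maps the previous token to candidate next tokens
// with made-up probabilities. A real LLM conditions on the whole sequence.
type NextTokenTable = Record<string, Record<string, number>>;

const toyModel: NextTokenTable = {
  "<start>": { The: 0.9, A: 0.1 },
  The: { " model": 0.7, " prompt": 0.3 },
  " model": { " predicts": 0.8, " samples": 0.2 },
  " predicts": { " tokens": 0.6, " text": 0.4 },
  " tokens": { "<end>": 1.0 },
};

// Greedy decoding: always pick the highest-probability candidate.
// If there are no candidates, fall back to the end marker.
function pickMostLikely(candidates: Record<string, number>): string {
  let best = "<end>";
  let bestProb = -Infinity;
  for (const [token, prob] of Object.entries(candidates)) {
    if (prob > bestProb) {
      best = token;
      bestProb = prob;
    }
  }
  return best;
}

// Repeat the prediction step until the end marker or the length cap.
function generate(model: NextTokenTable, maxTokens: number): string[] {
  const output: string[] = [];
  let previous = "<start>";
  for (let i = 0; i < maxTokens; i++) {
    const next = pickMostLikely(model[previous] ?? {});
    if (next === "<end>") break;
    output.push(next);
    previous = next;
  }
  return output;
}

const generated = generate(toyModel, 10);
console.log(generated.join("")); // "The model predicts tokens"
```

The `maxTokens` cap plays the role of an output-length limit: generation stops either when the model emits an end marker or when the budget runs out.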
The Three Core Ideas
1. Tokens, Not Words
LLMs do not read text the way humans do. They operate on tokens, which are chunks of text created by a tokenizer.
```typescript
const input = "Playwright makes browser automation reliable."

const tokens = [
  "Play",
  "wright",
  " makes",
  " browser",
  " automation",
  " reliable",
  ".",
]
```

Tokenization matters because cost, latency, and context limits are usually measured in tokens rather than characters or sentences.
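
As a rough illustration of why token counts drive cost, the sketch below estimates tokens and input cost for a string. Both constants are assumptions: the four-characters-per-token figure is a common rule of thumb for English text, and the price is a made-up placeholder. Real counts come from the specific model's tokenizer, and real prices come from your provider.

```typescript
// Rough heuristic (assumption): about 4 characters per token for English text.
// Real counts come from the model's own tokenizer and can differ noticeably.
const ASSUMED_CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / ASSUMED_CHARS_PER_TOKEN);
}

// Hypothetical price for illustration only; check your provider's pricing.
const HYPOTHETICAL_USD_PER_MILLION_INPUT_TOKENS = 1.0;

function estimateInputCostUsd(text: string): number {
  return (estimateTokens(text) / 1_000_000) * HYPOTHETICAL_USD_PER_MILLION_INPUT_TOKENS;
}

const sample = "Playwright makes browser automation reliable.";
console.log(estimateTokens(sample)); // 12 (45 characters / 4, rounded up)
```

The heuristic yields 12 tokens for a sentence that the hand-written split above breaks into 7, which is a useful reminder to treat such estimates as budgeting aids rather than exact counts.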
2. Context Windows
At inference time, the model can draw on only two things: patterns learned during training and whatever fits inside its context window, which holds the current prompt, system instructions, chat history, retrieved documents, and the output generated so far.
If important information does not fit in context, the model cannot reliably use it. That is why chunking, retrieval, and prompt structure become architectural concerns in real systems.
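
One common way to handle a limited context window is to keep the system instructions and drop the oldest chat turns first. The sketch below shows that idea under stated assumptions: `Message`, `countTokens`, and `fitToContext` are hypothetical names, and the token counter is a crude four-characters-per-token placeholder rather than a real tokenizer.

```typescript
type Message = { role: "system" | "user" | "assistant"; content: string };

// Placeholder counter (assumption): about 4 characters per token.
// Swap in the model's real tokenizer for production use.
function countTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Always keep the system message, then add the newest messages that still fit.
function fitToContext(system: Message, history: Message[], budget: number): Message[] {
  let used = countTokens(system.content);
  const kept: Message[] = [];
  // Walk from newest to oldest so recent turns win when space runs out.
  for (let i = history.length - 1; i >= 0; i--) {
    const message = history[i];
    if (message === undefined) continue;
    const cost = countTokens(message.content);
    if (used + cost > budget) break;
    kept.unshift(message); // restore chronological order
    used += cost;
  }
  return [system, ...kept];
}

const trimmed = fitToContext(
  { role: "system", content: "You are a support assistant." },
  [
    { role: "user", content: "My export job failed last night." },
    { role: "assistant", content: "Can you share the job ID?" },
    { role: "user", content: "It is job 1842." },
  ],
  20,
);
console.log(trimmed.map((m) => m.role));
```

Dropping the oldest turns is only one policy. Summarizing older history or retrieving only the relevant chunks are alternatives when early context still matters.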
3. Probability Over Certainty
An LLM does not fetch a hard-coded answer from a database. It samples from a probability distribution shaped by its training and your prompt.
The model is not "searching its memory" in the human sense. It is producing a statistically plausible continuation based on the tokens it sees, which is why the same prompt can yield different outputs across runs.
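
The sampling step can be sketched as a softmax over raw scores followed by a weighted random draw. The candidate tokens and scores below are invented for illustration, and `temperature` here follows the usual convention of dividing the scores before the softmax; individual providers may expose additional sampling controls that this sketch does not cover.

```typescript
// Convert raw scores (logits) into probabilities. Lower temperature sharpens
// the distribution; higher temperature flattens it.
function softmax(logits: number[], temperature = 1.0): number[] {
  const scaled = logits.map((x) => x / temperature);
  const max = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((x) => Math.exp(x - max));
  const total = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / total);
}

// Draw one index according to the probabilities. `random` is injectable so
// the behavior can be checked deterministically.
function sampleIndex(probs: number[], random: () => number = Math.random): number {
  const r = random();
  let cumulative = 0;
  for (let i = 0; i < probs.length; i++) {
    cumulative += probs[i] ?? 0;
    if (r < cumulative) return i;
  }
  return probs.length - 1; // guard against floating-point rounding
}

// Made-up candidates and scores for the prompt "The capital of France is".
const candidates = [" Paris", " Lyon", " a"];
const logits = [5.0, 2.0, 1.0];

const probs = softmax(logits, 0.7);
const choice = candidates[sampleIndex(probs)];
console.log(choice); // usually " Paris", but not always
```

Because the draw is random, a lower-probability token is occasionally selected. That is the mechanism behind run-to-run variation in model output.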
Training vs. Inference
Training is the expensive phase where the model learns patterns from large datasets. Inference is the runtime phase where your application sends prompts and receives outputs.
For most engineers building products, inference is the operational concern:
- prompt design controls behavior
- context design controls relevance
- model choice controls cost, latency, and quality
- evaluation controls trust
What Makes LLMs Useful in Products
| Capability | Typical Product Use |
|---|---|
| Generation | Drafting documentation, email, UI copy |
| Transformation | Summarization, rewriting, translation |
| Extraction | Pulling entities, dates, actions, and metadata |
| Classification | Routing tickets, tagging content, moderation |
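
For the classification row, a product usually constrains the model to a fixed label set and validates whatever comes back. The sketch below is one hedged way to do that. The label names, `buildRoutingPrompt`, and `parseLabel` are hypothetical, and no real model API is called.

```typescript
const TICKET_LABELS = ["billing", "bug", "feature_request", "other"] as const;
type TicketLabel = (typeof TICKET_LABELS)[number];

// Build a prompt that constrains the model to a fixed label set.
function buildRoutingPrompt(ticket: string): string {
  return [
    `Classify the support ticket into one of: ${TICKET_LABELS.join(", ")}.`,
    "Reply with the label only.",
    "",
    `Ticket: ${ticket}`,
  ].join("\n");
}

// Model output is free text, so normalize it and fall back to "other"
// when it is not exactly one of the allowed labels.
function parseLabel(rawOutput: string): TicketLabel {
  const cleaned = rawOutput.trim().toLowerCase();
  const match = TICKET_LABELS.find((label) => label === cleaned);
  return match ?? "other";
}

console.log(parseLabel("  Billing\n")); // "billing"
console.log(parseLabel("I think this is a bug")); // "other"
```

Falling back to a safe default keeps unexpected output from reaching downstream routing logic, at the cost of sending some tickets to a catch-all queue.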
Practical Engineering Implications
When you design with LLMs, think in systems terms instead of prompt-only terms.
- The prompt is just one layer of control.
- Context assembly is part of product architecture.
- Evaluation is mandatory because outputs are probabilistic.
- Guardrails matter when the system can take action or affect users.
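
The evaluation point can be sketched as a small harness that runs fixed cases through a model function and reports a pass rate. Everything named here (`EvalCase`, `Model`, `passRate`, `stubModel`) is hypothetical. The model is a synchronous stub so the example runs offline, whereas a real model call would be asynchronous and would need error handling.

```typescript
type EvalCase = { input: string; expected: string };

// Synchronous for brevity (assumption); a real model call would be async.
type Model = (prompt: string) => string;

// Run every case through the model and return the fraction that match.
function passRate(model: Model, cases: EvalCase[]): number {
  if (cases.length === 0) return 0;
  let passed = 0;
  for (const testCase of cases) {
    const output = model(testCase.input);
    if (output.trim().toLowerCase() === testCase.expected.toLowerCase()) passed++;
  }
  return passed / cases.length;
}

// Stub standing in for a real model so the sketch runs without a network call.
const stubModel: Model = (prompt) => (prompt.includes("refund") ? "billing" : "other");

const evalCases: EvalCase[] = [
  { input: "I want a refund for last month.", expected: "billing" },
  { input: "The app crashes on login.", expected: "bug" },
  { input: "Hello there.", expected: "other" },
];

console.log(passRate(stubModel, evalCases).toFixed(2)); // "0.67"
```

Exact-match scoring suits classification-style outputs. Free-form generation typically needs looser checks, such as rubric scoring or human review.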
Key Takeaways
- LLMs generate output token by token.
- Context window limits directly shape product behavior.
- Prompting works best when paired with retrieval, evaluation, and system design.
- Understanding how the model works at the token level helps you make better engineering decisions about cost, context, and evaluation.