Glossary

What is a large language model?

A large language model (LLM) is an AI system trained on vast amounts of text to predict the next piece of text, which lets it generate, summarise, classify, and reason over language.

A large language model, or LLM, is a neural network trained on enormous quantities of text to predict the next token (a word or word fragment) given the tokens that come before it. From that single objective, scaled across billions of parameters and trillions of tokens of training text, comes a system that can write, summarise, translate, classify, extract structure, answer questions, and follow instructions. The "large" is doing real work: some capabilities that are weak or absent in small models appear as model size, training data, and compute increase.

Under the hood, an LLM splits text into tokens, represents each token as a vector, and passes those vectors through many layers of a transformer architecture, whose attention mechanism lets the model weigh how each token relates to the others in its context. The result is a probability distribution over the next token, which is sampled to produce output one token at a time, with each sampled token appended to the input for the next step. After pretraining, most production LLMs are further shaped by instruction tuning and human-feedback training so they follow directions and behave helpfully rather than merely continuing text.
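
The generation loop can be illustrated with a minimal Python sketch. It is not how any particular model is implemented: the four-word vocabulary and the toy_logits function are hypothetical stand-ins for a real tokenizer and transformer, and the scores are hand-written rather than learned. What it does show is the shape of the process: scores become probabilities through a softmax, one token is sampled, and that token is fed back in as context.

```python
import math
import random

# Hypothetical toy vocabulary; real models use tens of thousands of tokens.
VOCAB = ["the", "cat", "sat", "<eos>"]


def softmax(logits, temperature=1.0):
    """Turn raw scores into a probability distribution that sums to 1."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def toy_logits(context):
    """Stand-in for a transformer: hand-written scores based on the last token only."""
    table = {
        None: [5.0, 0.0, 0.0, 0.0],
        "the": [0.0, 5.0, 0.0, 0.0],
        "cat": [0.0, 0.0, 5.0, 0.0],
        "sat": [0.0, 0.0, 0.0, 5.0],
    }
    last = context[-1] if context else None
    return table[last]


def generate(max_tokens=10, seed=0, temperature=1.0):
    """Sample one token at a time, feeding each choice back in as context."""
    rng = random.Random(seed)
    context = []
    for _ in range(max_tokens):
        probs = softmax(toy_logits(context), temperature)
        token = rng.choices(VOCAB, weights=probs, k=1)[0]
        if token == "<eos>":  # end-of-sequence token stops generation
            break
        context.append(token)
    return context


if __name__ == "__main__":
    # A low temperature sharpens the distribution toward the top-scoring token.
    print(generate(temperature=0.01))
```

Lowering the temperature makes the output more predictable, while raising it gives lower-probability tokens more of a chance. That sampling step is one reason the same prompt can produce different outputs on different runs.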

In production, LLMs are the engine behind chat assistants, document and email drafting, classification and extraction at scale, code generation, and the reasoning core of agents and retrieval-augmented generation (RAG) systems. They are accessed either through hosted APIs or by running open-weight models on your own infrastructure, a choice that trades off control, cost, latency, and data residency. Crucially, an LLM is a component, not a product: the value comes from wrapping it with retrieval, tools, guardrails, and evaluation that turn raw capability into something dependable.

LLMs matter because they generalise — one model handles a huge range of language tasks that previously needed bespoke systems, which collapses the cost of building language-aware features. But the same generality is the catch: an LLM is a probabilistic system that can be confidently wrong, is sensitive to how it's prompted, and has no inherent knowledge of your private data or of events after its training cutoff. Getting durable value out of one is an engineering discipline — grounding it in your data, constraining its behaviour, and measuring it — not a matter of access alone.
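
The three practices named above (grounding, constraining, measuring) can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than a recommended design: DOCS is a made-up set of private documents, fake_llm is a stand-in for a real model call that guesses confidently when the fact is missing from its prompt, retrieval is naive keyword overlap, and the guardrail is a simple substring check.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical private documents the model was never trained on.
DOCS = [
    "Refunds are accepted within 30 days of purchase.",
    "Support is available Monday to Friday, 9am to 5pm.",
]


def words(text: str) -> set:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, docs: List[str], k: int = 1) -> List[str]:
    """Grounding: pick the k documents sharing the most words with the question."""
    q = words(question)
    ranked = sorted(docs, key=lambda d: len(q & words(d)), reverse=True)
    return ranked[:k]


def fake_llm(prompt: str) -> str:
    """Stand-in for a model call: right when the fact is in the prompt, a confident guess otherwise."""
    return "30 days" if "30 days" in prompt else "90 days"


def answer(question: str, llm: Callable[[str], str], docs: List[str]) -> str:
    passages = retrieve(question, docs)
    prompt = (
        "Answer using only this context:\n"
        + "\n".join(passages)
        + "\nQuestion: " + question
    )
    reply = llm(prompt)
    # Constraining: refuse unless the reply appears in the retrieved text.
    if not any(reply.lower() in p.lower() for p in passages):
        return "I don't know"
    return reply


def accuracy(cases: List[Tuple[str, str]], llm: Callable[[str], str], docs: List[str]) -> float:
    """Measuring: fraction of test questions answered exactly as expected."""
    hits = sum(answer(q, llm, docs) == expected for q, expected in cases)
    return hits / len(cases)


if __name__ == "__main__":
    question = "Within how many days are refunds accepted?"
    print(answer(question, fake_llm, DOCS))  # grounded in the refund policy
    print(answer(question, fake_llm, []))    # no context, so the guardrail refuses
    print(accuracy([(question, "30 days")], fake_llm, DOCS))
```

With the documents available, the stand-in model answers from the retrieved policy. Without them it would have guessed, and the guardrail turns that guess into a refusal. Production systems typically swap each piece for something stronger, such as embedding-based retrieval, schema or policy checks, and a larger evaluation set, but the division of labour is the same.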
