A vector database is a store built to hold embeddings — the numerical representations of text, images, or other data produced by an AI model — and to find, very quickly, the stored vectors most similar to a given query vector. Where a traditional database answers "find rows where this column equals that value," a vector database answers "find the items whose meaning is closest to this," which is the core operation behind semantic search and retrieval-augmented generation.
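To make "closest in meaning" concrete, here is a minimal sketch of the comparison at the heart of that operation. It assumes cosine similarity as the measure (a common choice, though not the only one) and uses made-up three-dimensional vectors; the names `cosine_similarity`, `top_k`, and `stored` are illustrative and do not belong to any particular product.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)

def top_k(query, stored, k=2):
    """Exact (brute-force) search: score every stored vector, keep the k best."""
    scored = [(cosine_similarity(query, vec), item_id)
              for item_id, vec in stored.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:k]]

# Hypothetical toy "embeddings"; real ones usually have hundreds of dimensions.
stored = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "invoice": [0.0, 0.1, 0.9],
}
nearest = top_k([1.0, 0.0, 0.0], stored, k=2)
print(nearest)  # ['cat', 'kitten']
```

This version scores every stored vector, so it is exact but its cost grows with the size of the collection.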
The key capability is approximate nearest-neighbour search. An exact search scores the query against every stored vector, which becomes too slow once a collection reaches millions of items, so vector databases use specialised indexes (such as HNSW graphs) that find the closest matches in milliseconds with a tunable trade-off between speed and accuracy. They also handle the practical surroundings of retrieval: storing the original text alongside each vector, filtering results by metadata (date, source, permissions), and keeping the index fresh as documents are added, updated, or removed.
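The surrounding features can be sketched with a toy in-memory store. This is an illustration under simplifying assumptions, not how any real product is built: `TinyVectorStore` and its methods are hypothetical names, distance is plain Euclidean, and search is an exact linear scan rather than an approximate index such as HNSW.

```python
import math

class TinyVectorStore:
    """Toy in-memory store: exact search, metadata filtering, upsert, delete."""

    def __init__(self):
        self._records = {}  # doc_id -> (vector, text, metadata)

    def upsert(self, doc_id, vector, text, metadata):
        # Insert a new record or overwrite an existing one with the same id.
        self._records[doc_id] = (vector, text, metadata)

    def delete(self, doc_id):
        self._records.pop(doc_id, None)

    def search(self, query, k=3, where=None):
        """Return up to k (doc_id, text) pairs, nearest first.

        `where` is an optional predicate applied to each record's metadata
        before its distance is computed.
        """
        hits = []
        for doc_id, (vector, text, metadata) in self._records.items():
            if where is not None and not where(metadata):
                continue
            hits.append((math.dist(query, vector), doc_id, text))
        hits.sort()  # smallest distance first
        return [(doc_id, text) for _, doc_id, text in hits[:k]]

store = TinyVectorStore()
store.upsert("a", [0.0, 1.0], "Q3 sales report", {"source": "finance", "year": 2023})
store.upsert("b", [0.1, 0.9], "Q4 sales report", {"source": "finance", "year": 2024})
store.upsert("c", [1.0, 0.0], "Onboarding guide", {"source": "hr", "year": 2024})

# Only records from 2024 are eligible, even though "a" is the closest vector.
recent = store.search([0.0, 1.0], k=2, where=lambda m: m["year"] == 2024)

# Removing a document takes it out of later results.
store.delete("b")
after_delete = store.search([0.0, 1.0], k=2, where=lambda m: m["year"] == 2024)
```

Here the filter runs before scoring. Production systems differ in whether they filter before, during, or after the index lookup, and that choice affects both speed and how many results come back.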
In production, a vector database is the retrieval layer of most RAG systems and semantic-search features. The pipeline runs: embed your documents, store the vectors, then at query time embed the question and ask the database for the most similar chunks to feed the model. The same engine powers recommendation, deduplication, and clustering. Options range from dedicated services to vector extensions of existing databases (so you can keep vectors next to your relational data) — and the right choice depends on scale, latency targets, and how much of your data already lives in one place.
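The three pipeline steps can be sketched end to end as follows. The `embed` function is a deliberate stand-in: it builds unit-length bag-of-words vectors, which capture word overlap only and not meaning, whereas a real system would call an embedding model. All names here (`tokenize`, `build_vocab`, `embed`, `retrieve`) and the sample chunks are assumptions made for illustration.

```python
import math
import re

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def build_vocab(chunks):
    # Fixed, sorted vocabulary so every vector has the same dimensions.
    return sorted({word for chunk in chunks for word in tokenize(chunk)})

def embed(text, vocab):
    """Stand-in for an embedding model: unit-length bag-of-words counts."""
    words = tokenize(text)
    counts = [float(words.count(term)) for term in vocab]
    norm = math.sqrt(sum(c * c for c in counts))
    return [c / norm for c in counts] if norm else counts

def retrieve(question, index, vocab, k=1):
    """Embed the question and return the k chunks with the highest dot product."""
    q = embed(question, vocab)
    scored = sorted(index, key=lambda item: -sum(a * b for a, b in zip(q, item[0])))
    return [chunk for _, chunk in scored[:k]]

chunks = [
    "Refunds are issued within 14 days of a return.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is available by email on weekdays.",
]

# Steps 1 and 2: embed the documents and store the vectors with their text.
vocab = build_vocab(chunks)
index = [(embed(chunk, vocab), chunk) for chunk in chunks]

# Step 3: embed the question, fetch the most similar chunk, build the prompt.
question = "How long does shipping take?"
context = retrieve(question, index, vocab, k=1)
prompt = (
    "Answer using only this context:\n"
    + "\n".join(context)
    + "\n\nQuestion: "
    + question
)
```

With this toy embedding the question matches the shipping chunk only because both contain the word "shipping". A learned embedding model is what lets a real pipeline match paraphrases that share no words.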
A vector database matters because retrieval quality determines answer quality: if the database returns the wrong chunks, the model produces confident, wrong answers no matter how good it is. It also matters operationally — keeping the index in sync with source data, enforcing access controls so users only retrieve what they're allowed to see, and tuning the speed-accuracy trade-off are all real engineering concerns. The database is not a commodity bolt-on; it is the part of a RAG system that most often decides whether the whole thing works.
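One way to put a number on retrieval quality is recall at k: of the chunks a person judged relevant to a question, what fraction appear in the top k results. The sketch below assumes a small hand-labelled evaluation set; the document ids and labels are invented for illustration, and `recall_at_k` is a hypothetical helper rather than a library function.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant items that appear in the first k retrieved results."""
    if not relevant_ids:
        return 0.0  # convention chosen here when nothing is labelled relevant
    top = set(retrieved_ids[:k])
    return len(top & set(relevant_ids)) / len(relevant_ids)

# Hypothetical evaluation set: what the database returned for each question,
# and what a human labelled as relevant.
eval_set = [
    {"retrieved": ["d1", "d7", "d3"], "relevant": ["d1", "d3"]},
    {"retrieved": ["d4", "d9", "d2"], "relevant": ["d2", "d5"]},
]

scores = [recall_at_k(q["retrieved"], q["relevant"], k=3) for q in eval_set]
mean_recall = sum(scores) / len(scores)
print(scores, mean_recall)  # [1.0, 0.5] 0.75
```

The same function can compare an approximate index against exact search by treating the exact top k as the relevant set, which gives a way to see what a faster index setting costs in accuracy.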