AI Tool Radar

What is Embedding?

A numerical vector representation of text, images, or other data that captures semantic meaning, enabling similarity comparisons in high-dimensional space.

Full Definition

An embedding is a dense numerical vector — typically hundreds to thousands of floating-point numbers — that encodes the semantic meaning of an input (text, image, audio, or code) as a point in a continuous high-dimensional space. The key property is that semantically similar inputs are mapped to geometrically close vectors, so that cosine similarity or dot product scores reflect similarity of meaning. Embeddings are produced by encoder models (e.g., OpenAI's text-embedding-3-large, Cohere Embed, or open-source models like BGE) trained on large corpora with contrastive or masked-language-modeling objectives. Embeddings are foundational to virtually every modern AI information retrieval system: they power RAG pipelines, semantic search, recommendation systems, clustering, and the retrieval step in vector databases. The dimensionality of an embedding model trades off quality against storage and compute cost.
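The geometric property described above can be sketched in a few lines of NumPy. The tiny 4-dimensional vectors below are made-up stand-ins for real embeddings (which have hundreds to thousands of dimensions), chosen only to show how cosine similarity separates related from unrelated inputs:

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: dot product of the two vectors after L2 normalization."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy "embeddings" — real models emit far higher-dimensional vectors.
cat = np.array([0.9, 0.1, 0.0, 0.2])
kitten = np.array([0.85, 0.15, 0.05, 0.25])
invoice = np.array([0.0, 0.1, 0.95, 0.1])

print(cosine_similarity(cat, kitten))   # close to 1.0: semantically similar
print(cosine_similarity(cat, invoice))  # near 0.0: unrelated
```

In a semantic-search or RAG pipeline, the same computation ranks stored document vectors against a query vector; vector databases simply do this comparison at scale over millions of embeddings.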
