The Latent Space — How Vectors & Embeddings Work

Vector Arithmetic

The most famous embedding discovery: you can do math with meaning. king - man + woman ≈ queen. The vector that points from “man” to “king” is roughly the same as the vector from “woman” to “queen” — both encode “royalty.”

How RAG retrieval works

Retrieval-Augmented Generation converts your query into a vector, then finds the closest document vectors. The top-k nearest results become context for the LLM.
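The retrieval step described above can be sketched as a brute-force top-k search. This is a minimal illustration, not the demo's actual code: the document ids, the 2-d vectors, and the `top_k` helper are all made up for the example.

```python
import heapq
import math

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec, doc_vecs, k):
    """Return the k (score, doc_id) pairs with the highest cosine similarity."""
    scored = ((cosine(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items())
    return heapq.nlargest(k, scored)

# Hypothetical, hand-made document vectors (not produced by a real model).
DOC_VECS = {
    "doc_cats": [0.9, 0.1],
    "doc_dogs": [0.8, 0.3],
    "doc_tax": [0.0, 1.0],
}

results = top_k([1.0, 0.0], DOC_VECS, k=2)
for score, doc_id in results:
    print(f"{doc_id}: {score:.3f}")
```

Scoring every document is fine for a handful of vectors. Vector databases generally replace this linear scan with an approximate nearest-neighbor index so that search stays fast on large collections.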

How It Works

Text → Vector

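An embedding model converts text into a fixed-length array of numbers (a vector). Meaning is spread across the dimensions: individual numbers are rarely interpretable on their own, but the vector's overall direction reflects what the text is about.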

"king" → [0.21, -0.45, 0.89, ...]

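Real embeddings range from tens to a few thousand dimensions: the models in this demo produce 50 (GloVe) and 384 (all-MiniLM-L6-v2), while larger models commonly use 768 to 3072. This demo uses a simplified hash-based projection into 2D so you can see the geometry.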
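To make "hash-based projection into 2D" concrete, here is one way such a mapping could work. This is an assumption for illustration only: the demo's actual hashing scheme is not described here, and `hash_to_2d` is a hypothetical helper.

```python
import hashlib

def hash_to_2d(text):
    """Map text to a deterministic (x, y) point in [-1, 1] x [-1, 1]."""
    # Normalize so that "King" and " king " land on the same point.
    digest = hashlib.sha256(text.strip().lower().encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as two unsigned 32-bit integers.
    x_raw = int.from_bytes(digest[0:4], "big")
    y_raw = int.from_bytes(digest[4:8], "big")
    scale = 2**32 - 1
    return (2 * x_raw / scale - 1, 2 * y_raw / scale - 1)

point = hash_to_2d("king")
print(point)
```

In this sketch the position is deterministic but carries no meaning: related words do not end up near each other, which is why a plain hash is only useful for showing the mechanics of the math.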

Vector Math

king - man + woman ≈ queen

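Embedding spaces encode relationships as directions. The “gender” direction (man → woman) can be added to “king” to land near “queen.” The result is approximate: the answer is the nearest stored word to the computed point, not an exact match. This only works with real pre-trained vectors (the GloVe or Transformers.js modes), not with the hash-based projection.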
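The analogy computation can be sketched with a few hand-made vectors. The numbers below are invented so the geometry is easy to follow (they are not real GloVe values), and `analogy` is a hypothetical helper, not the demo's implementation.

```python
import math

# Toy 3-d vectors; the axes are roughly [royalty, maleness, fruitness].
VECTORS = {
    "king": [0.9, 0.8, 0.0],
    "queen": [0.9, -0.8, 0.0],
    "man": [0.1, 0.8, 0.0],
    "woman": [0.1, -0.8, 0.0],
    "apple": [0.0, 0.0, 1.0],
}

def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def analogy(positive, negative, vectors):
    """Add the positive words, subtract the negative ones, return the nearest other word."""
    dim = len(next(iter(vectors.values())))
    target = [0.0] * dim
    for word in positive:
        target = [t + v for t, v in zip(target, vectors[word])]
    for word in negative:
        target = [t - v for t, v in zip(target, vectors[word])]
    # Exclude the input words; otherwise one of them is often the nearest neighbor.
    exclude = set(positive) | set(negative)
    candidates = [(cosine(target, vec), word)
                  for word, vec in vectors.items() if word not in exclude]
    return max(candidates)[1]

result = analogy(positive=["king", "woman"], negative=["man"], vectors=VECTORS)
print(result)
```

With real pre-trained vectors the computed point rarely coincides with any word, so the nearest-neighbor lookup at the end is what turns the arithmetic into an answer.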

Cosine Similarity

Two vectors are “similar” if they point in roughly the same direction. Cosine similarity measures this: 1.0 = identical direction, 0 = orthogonal, -1.0 = opposite.

cos(θ) = (A · B) / (|A| × |B|)
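The formula translates directly into code. The sketch below uses plain Python lists; the function name and the choice to raise an error on zero vectors are assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Return cos(theta) between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))       # A · B
    norm_a = math.sqrt(sum(x * x for x in a))    # |A|
    norm_b = math.sqrt(sum(y * y for y in b))    # |B|
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

same = cosine_similarity([1.0, 2.0], [2.0, 4.0])         # same direction, different length
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 3.0])   # perpendicular
opposite = cosine_similarity([1.0, 2.0], [-1.0, -2.0])   # opposite direction
print(same, orthogonal, opposite)
```

Because the result is divided by both lengths, scaling a vector does not change its similarity to anything else; only direction matters.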

Embedding Modes

Hash-based: Fast, deterministic, approximate. Good for seeing how the math works.

GloVe 50d: Real pre-trained word vectors from Stanford. Best for word analogies. Requires a one-time setup (generate-glove-json.js).

Transformers.js: Real neural embeddings (all-MiniLM-L6-v2, 384d) running in your browser via WebAssembly. Best for sentence-level similarity.

RAG Pipeline

Index: Embed all your documents into vectors and store them in a vector database (e.g. pgvector, Qdrant, Pinecone).
Query: Embed the user’s question.
Retrieve: Find the k nearest vectors.
Generate: Feed retrieved docs + question to the LLM.
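The four steps above can be sketched end to end in a few lines. Everything here is a stand-in chosen for illustration: `embed` is a toy word-count function rather than a real embedding model, the in-memory list replaces a vector database, and the final step only assembles the prompt instead of calling an LLM.

```python
import math
import re

VOCAB = ["cat", "dog", "pet", "tax", "income", "food"]  # hypothetical toy vocabulary

def embed(text):
    """Toy stand-in for an embedding model: word counts over VOCAB."""
    words = re.findall(r"[a-z]+", text.lower())
    return [float(words.count(term)) for term in VOCAB]

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

DOCS = [
    "A cat is a popular pet.",
    "Dog food should match the dog breed.",
    "Income tax is due in spring.",
]

# 1. Index: embed every document once and keep (text, vector) pairs.
index = [(doc, embed(doc)) for doc in DOCS]

def retrieve(question, k):
    query_vec = embed(question)  # 2. Query: embed the question.
    # 3. Retrieve: rank documents by similarity and keep the top k.
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(question, k=1):
    # 4. Generate: this text is what would be sent to the LLM.
    context = "\n".join(retrieve(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"

prompt = build_prompt("When is income tax due?", k=1)
print(prompt)
```

Indexing happens once up front, while the query, retrieve, and generate steps run for every question, which is why the document vectors are stored rather than recomputed.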