Technology Radar

pgvector with Spring AI

vector-database, rag, spring-ai
Trial

pgvector extends PostgreSQL with vector similarity search — and Spring AI's PgVectorStore makes it the lowest-friction vector database for Java teams already running Postgres in production.

Why It's in Trial

If you already have Postgres in production, pgvector is the obvious first vector store to reach for. You don't need a new database, new operational knowledge, or a new billing account — you extend the database you already trust.

Spring AI's PgVectorStore implements the same VectorStore interface as the other backends (Redis, Chroma, Weaviate, Pinecone), so switching later typically means swapping the starter dependency and its configuration rather than rewriting application code.

Setup with Spring AI

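Dependencies (artifact ID as used in Spring AI milestone releases; 1.0 GA renames it to spring-ai-starter-vector-store-pgvector, so match it to your version):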

<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-pgvector-store-spring-boot-starter</artifactId>
</dependency>

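Database: Enable the extension. Spring AI can create the vector store table and index on startup; in recent releases this is opt-in via spring.ai.vectorstore.pgvector.initialize-schema=true, so check the default for your version. The Spring AI reference setup also enables the hstore and uuid-ossp extensions.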

CREATE EXTENSION IF NOT EXISTS vector;

Configuration:

spring.ai.vectorstore.pgvector.index-type=HNSW
spring.ai.vectorstore.pgvector.distance-type=COSINE_DISTANCE
spring.ai.vectorstore.pgvector.dimensions=1536

Usage — store and retrieve:

@Autowired VectorStore vectorStore;
// Used by the VectorStore to embed documents and queries; not called directly below
@Autowired EmbeddingModel embeddingModel;

// Ingest
vectorStore.add(List.of(
    new Document("Spring AI supports pgvector natively."),
    new Document("pgvector adds vector similarity search to PostgreSQL.")
));

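// Retrieve by semantic similarity.
// This fluent query(...)/withX(...) style is from Spring AI milestone releases;
// 1.0 GA builds the same request with SearchRequest.builder().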
List<Document> results = vectorStore.similaritySearch(
    SearchRequest.query("What databases does Spring AI support?")
                 .withTopK(5)
                 .withSimilarityThreshold(0.7)
);
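
To make topK and the similarity threshold concrete, here is a brute-force sketch in plain Java, with no Spring AI or Postgres involved. It assumes the threshold is applied to cosine similarity, that is, 1 minus the cosine distance configured above. The class and method names are hypothetical, and an HNSW index approximates this ranking rather than scanning every vector.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a linear scan standing in for an indexed similarity search.
class CosineTopKSketch {

    // Cosine similarity in [-1, 1]; cosine distance is 1 - similarity.
    // Returns NaN if either vector has zero length (not handled in this sketch).
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Indices of up to topK corpus vectors with similarity >= threshold, best match first.
    static List<Integer> search(float[] query, List<float[]> corpus, int topK, double threshold) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < corpus.size(); i++) {
            if (cosineSimilarity(query, corpus.get(i)) >= threshold) {
                hits.add(i);
            }
        }
        // Sort by descending similarity (negate the key for descending order).
        hits.sort(Comparator.comparingDouble((Integer i) -> -cosineSimilarity(query, corpus.get(i))));
        return new ArrayList<>(hits.subList(0, Math.min(topK, hits.size())));
    }
}
```

For the query [1, 0] against the vectors [1, 0], [0, 1] and [1, 1], the similarities are 1.0, 0.0 and about 0.707, so a threshold of 0.7 keeps the first and third vectors, in that order.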

Metadata filtering (Spring AI's portable filter expression):

SearchRequest.query("refund policy")
    .withFilterExpression("category == 'legal' && year >= 2024");
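
As a reading aid for the filter expression above, the following plain-Java predicate spells out the intended semantics over a document's metadata map. It is an illustration only: MetadataFilterSketch is a hypothetical name, and how Spring AI translates the expression for pgvector (including how missing or non-numeric keys are treated) is an assumption here, not a description of the library.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative only: a plain-Java reading of
//   category == 'legal' && year >= 2024
class MetadataFilterSketch {

    // True when category is exactly "legal" and year is a number >= 2024.
    // Assumption: entries with a missing or non-numeric year do not match.
    static boolean matches(Map<String, Object> metadata) {
        Object category = metadata.get("category");
        Object year = metadata.get("year");
        return "legal".equals(category)
            && year instanceof Number
            && ((Number) year).intValue() >= 2024;
    }

    // Keeps only the metadata maps that satisfy the filter, preserving order.
    static List<Map<String, Object>> filter(List<Map<String, Object>> all) {
        return all.stream()
            .filter(MetadataFilterSketch::matches)
            .collect(Collectors.toList());
    }
}
```

Both conditions must hold for the same document: a legal document from 2023 and a marketing document from 2024 are each excluded.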

When pgvector Is Not Enough

pgvector with HNSW indexing comfortably handles tens of millions of vectors. Beyond that, or when you need multi-tenancy, managed scaling, or filtering at billion-vector scale, consider a dedicated vector database:

Scale                               Option
< 100M vectors, existing Postgres   pgvector (start here)
100M–1B, need managed service       Pinecone, Weaviate
Self-hosted at scale                Qdrant
Existing Redis stack                Redis Vector Store

All of these are supported by Spring AI's VectorStore abstraction, so switching is mostly a dependency and configuration change.
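
A rough sizing sketch can help place a workload in the table above. It assumes pgvector's documented storage cost of 4 * dimensions + 8 bytes per vector and the 1536 dimensions configured earlier. Row overhead, metadata, and the HNSW index itself are deliberately left out, so real usage will be higher. VectorStorageEstimate is a hypothetical helper name.

```java
// Illustrative only: raw vector storage, excluding rows, metadata and indexes.
class VectorStorageEstimate {

    // Bytes needed for the vector values alone: count * (4 * dimensions + 8).
    static long rawVectorBytes(long vectorCount, int dimensions) {
        return vectorCount * (4L * dimensions + 8L);
    }

    public static void main(String[] args) {
        int dimensions = 1536; // matches the dimensions property shown earlier
        long[] counts = {10_000_000L, 100_000_000L, 1_000_000_000L};
        for (long count : counts) {
            double gigabytes = rawVectorBytes(count, dimensions) / 1e9; // decimal GB
            System.out.printf("%,d vectors x %d dims: about %.1f GB raw%n",
                count, dimensions, gigabytes);
        }
    }
}
```

At 1536 dimensions that is 6,152 bytes per vector, or about 61.5 GB of raw vector data for 10 million vectors and about 615 GB for 100 million, before any index is built.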

Testcontainers for Local Development

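@SpringBootTest
@Testcontainers
class RagServiceTest {
    @Container
    @ServiceConnection // Spring Boot 3.1+: points the datasource at this container
    static PostgreSQLContainer<?> postgres = new PostgreSQLContainer<>(
            // Declare the pgvector image as a drop-in for the official postgres image
            DockerImageName.parse("pgvector/pgvector:pg16").asCompatibleSubstituteFor("postgres"))
        .withInitScript("schema.sql"); // e.g. CREATE EXTENSION IF NOT EXISTS vector;

    // Without @ServiceConnection, register the JDBC URL via @DynamicPropertySource instead
}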

No external dependency for development or CI — the pgvector Docker image contains both Postgres and the extension.

Key Characteristics

Property            Value
Requires            PostgreSQL 14+ with vector extension
Spring AI support   PgVectorStore (auto-configured)
Index types         HNSW (recommended), IVFFlat
Local dev           pgvector/pgvector:pg16 Docker image