Technology Radar

Testcontainers + Ollama for AI Integration Tests

testing
Trial

Running a real local LLM in your integration tests, with Testcontainers starting an Ollama container, gives you realistic, repeatable AI tests without mocking, rate limits, or API costs.

Why It's in Trial

Testing AI code is an unsolved problem for most teams. The options are:

  1. Mock the LLM — fast, but mocks don't behave like real models; misses hallucination, formatting, and latency characteristics
  2. Call real APIs in tests — works, but incurs cost, hits rate limits, requires credentials, and is flaky in CI
  3. Testcontainers + Ollama — runs a real (small) model locally in a Docker container, deterministic enough for integration tests, no external dependencies

Option 3 is increasingly the right answer for testing the integration between your code and AI behaviour.

Setup

Dependency (Spring Boot test):

<dependency>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-testcontainers</artifactId>
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.testcontainers</groupId>
  <artifactId>ollama</artifactId>
  <scope>test</scope>
</dependency>
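For @ServiceConnection to produce Spring AI connection details for an Ollama container, you may also need Spring AI's Testcontainers support module. The artifact coordinate below is an assumption based on the Spring AI 1.x milestones; verify it against your Spring AI version:

```xml
<!-- Assumed coordinate; check your Spring AI version's documentation -->
<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-spring-boot-testcontainers</artifactId>
  <scope>test</scope>
</dependency>
```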

Test class:

@SpringBootTest
@Testcontainers
class CustomerSupportServiceIT {

    @Container
    @ServiceConnection
    static OllamaContainer ollama = new OllamaContainer("ollama/ollama:latest");

    @BeforeAll
    static void pullModel() throws IOException, InterruptedException {
        // Pull a small model once per class: fast to download, ~2 GB on disk
        ollama.execInContainer("ollama", "pull", "llama3.2:3b");
    }

    @Autowired CustomerSupportService service;

    @Test
    void shouldSummariseComplaint() {
        String summary = service.summarise(
            "My order #12345 arrived damaged and I want a refund."
        );
        assertThat(summary).containsIgnoringCase("damaged");
        assertThat(summary).containsIgnoringCase("refund");
    }
}
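Because a real model's wording varies from run to run, assertions should target required keywords rather than exact strings, as the containsIgnoringCase checks above do. A small hypothetical helper (plain Java, no framework) that such a test could use:

```java
import java.util.List;
import java.util.Locale;

public class LlmAssertions {

    // True when every required keyword appears in the text, case-insensitively.
    // Keeps assertions loose enough to survive run-to-run variation in LLM output.
    static boolean mentionsAll(String text, List<String> keywords) {
        String lower = text.toLowerCase(Locale.ROOT);
        return keywords.stream()
            .allMatch(k -> lower.contains(k.toLowerCase(Locale.ROOT)));
    }

    public static void main(String[] args) {
        String summary = "Customer reports order #12345 arrived damaged and requests a refund.";
        System.out.println(mentionsAll(summary, List.of("damaged", "refund"))); // true
    }
}
```

Checking for the presence of key facts, rather than exact phrasing, is what makes a small local model "deterministic enough" for integration tests.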

The @ServiceConnection annotation wires the Ollama container URL directly into Spring AI's configuration — no manual property setup.
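If @ServiceConnection support for Ollama isn't available in your Spring Boot / Spring AI combination, a fallback sketch is wiring the base URL by hand. The property key spring.ai.ollama.base-url and OllamaContainer's getEndpoint() accessor are assumptions to verify against your versions:

```java
// Fallback wiring without @ServiceConnection: point Spring AI at the container.
// Assumes the Spring AI property key spring.ai.ollama.base-url.
@DynamicPropertySource
static void ollamaProperties(DynamicPropertyRegistry registry) {
    registry.add("spring.ai.ollama.base-url", ollama::getEndpoint);
}
```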

Quarkus Dev Services

Quarkus LangChain4j handles this automatically — when running tests, Quarkus Dev Services spins up an Ollama container and configures the application without any test annotations required. It's zero configuration.
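With Quarkus, the only thing you typically specify is which model the dev service should use. The property name below is assumed from the quarkus-langchain4j-ollama extension; check your extension version's configuration reference:

```properties
# application.properties (assumed property name; verify for your version)
quarkus.langchain4j.ollama.chat-model.model-id=llama3.2:3b
```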

Choosing a Test Model

Model              Size     When to use
llama3.2:3b        ~2 GB    Fast CI tests, basic comprehension checks
llama3.1:8b        ~5 GB    Better quality, acceptable CI time
mistral:7b         ~4 GB    Good instruction following, small footprint
nomic-embed-text   ~274 MB  Embedding-only tests (RAG pipelines)

For CI, a 3B model is ready in ~10 seconds once the image and model are cached, comparable to a Postgres Testcontainer cold start.

Model Caching in CI

The first test run pulls the model (~2 GB), which is slow. Cache the pulled model in CI:

  • GitHub Actions: cache /root/.ollama (the container's model directory) or the container image layer
  • Most CI systems: enable container reuse (set testcontainers.reuse.enable=true in ~/.testcontainers.properties and call .withReuse(true) on the container) to keep it running between test runs
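A sketch of the GitHub Actions side, assuming your tests bind-mount the runner's ~/.ollama into the container (for example via withFileSystemBind("/home/runner/.ollama", "/root/.ollama")); the path, cache key, and step name are illustrative, not canonical:

```yaml
# Hypothetical workflow step: keep pulled Ollama models between CI runs
- name: Cache Ollama models
  uses: actions/cache@v4
  with:
    path: ~/.ollama
    key: ollama-${{ runner.os }}-llama3.2-3b
```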

Key Characteristics

Property                 Value
Requires                 Docker, Java 11+
Spring Boot integration  @ServiceConnection (Spring Boot 3.1+)
Quarkus integration      Automatic via Dev Services
CI caching               Cache Ollama model downloads