# Testcontainers + Ollama for AI Integration Tests
Running a real local LLM in your integration tests, via a Testcontainers-managed Ollama container, gives you deterministic AI tests without mocking, rate limits, or API costs.
## Why It's in Trial
Testing AI code is an unsolved problem for most teams. The options are:
- Mock the LLM — fast, but mocks don't behave like real models; misses hallucination, formatting, and latency characteristics
- Call real APIs in tests — works, but incurs cost, hits rate limits, is flaky in CI, and needs credentials in the pipeline
- Testcontainers + Ollama — runs a real (small) model locally in a Docker container, deterministic enough for integration tests, no external dependencies
Option 3 is increasingly the right answer for testing the integration between your code and AI behaviour.
## Setup
Dependencies (Spring Boot, test scope):

```xml
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-testcontainers</artifactId>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>ollama</artifactId>
    <scope>test</scope>
</dependency>
```
Test class:

```java
@SpringBootTest
@Testcontainers
class CustomerSupportServiceIT {

    @Container
    @ServiceConnection
    static OllamaContainer ollama = new OllamaContainer("ollama/ollama:latest");

    @BeforeAll
    static void pullModel() throws IOException, InterruptedException {
        // Small model: fast to pull, ~2 GB
        ollama.execInContainer("ollama", "pull", "llama3.2:3b");
    }

    @Autowired
    CustomerSupportService service;

    @Test
    void shouldSummariseComplaint() {
        String summary = service.summarise(
                "My order #12345 arrived damaged and I want a refund."
        );
        assertThat(summary).containsIgnoringCase("damaged");
        assertThat(summary).containsIgnoringCase("refund");
    }
}
```
The @ServiceConnection annotation wires the Ollama container's base URL into Spring AI's configuration automatically, so no manual property setup is needed. (This relies on Spring AI's own Testcontainers connection-details support being on the test classpath, alongside `spring-boot-testcontainers`.)
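If @ServiceConnection is not available to you (older Spring Boot, or no Spring AI connection-details support), the same wiring can be done by hand with Spring's standard @DynamicPropertySource. A minimal sketch, assuming the `ollama` container field from the test above and Spring AI's standard `spring.ai.ollama.base-url` property:

```java
// Manual alternative to @ServiceConnection: push the container's
// dynamically assigned endpoint into Spring AI's configuration.
// Sketch only — add inside the same test class as the container field.
@DynamicPropertySource
static void ollamaProperties(DynamicPropertyRegistry registry) {
    registry.add("spring.ai.ollama.base-url", ollama::getEndpoint);
}
```

The property must be registered via a supplier (`ollama::getEndpoint`) rather than a fixed string, because the container's mapped port is only known once it has started.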
## Quarkus Dev Services
Quarkus LangChain4j handles this automatically: when tests run, Quarkus Dev Services spins up an Ollama container and configures the application, with no test annotations required. It's zero configuration.
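The only thing you typically specify is which model Dev Services should serve. A sketch of `application.properties`, assuming the `quarkus-langchain4j-ollama` extension (check the property name against its current configuration reference):

```properties
# Model to pull and serve during tests/dev mode
# (property name assumed from the quarkus-langchain4j-ollama extension)
quarkus.langchain4j.ollama.chat-model.model-id=llama3.2:3b
```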
## Choosing a Test Model
| Model | Size | When to use |
|---|---|---|
| `llama3.2:3b` | ~2 GB | Fast CI tests, basic comprehension checks |
| `llama3.1:8b` | ~5 GB | Better quality, acceptable CI time |
| `mistral:7b` | ~4 GB | Good instruction following, small footprint |
| `nomic-embed-text` | ~274 MB | Embedding-only tests (RAG pipelines) |
For CI, a 3B model starts in roughly 10 seconds once the image and model weights are cached, comparable to a Postgres Testcontainer cold start.
## Model Caching in CI
The first test run pulls the model (~2 GB), which is slow. Cache the pulled model in CI:
- GitHub Actions: cache `/root/.ollama` (inside the container) or the container image layer
- Most CI systems: use `testcontainers.reuse.enable=true` to keep the container running between test runs
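Note that reuse is opt-in on both sides: the machine-level property (or the `TESTCONTAINERS_REUSE_ENABLE` environment variable) plus a `.withReuse(true)` call on the container declaration itself. A sketch of the property file:

```properties
# ~/.testcontainers.properties — per-machine opt-in
# (reuse is still flagged experimental by Testcontainers)
testcontainers.reuse.enable=true
```

The container declaration then becomes `new OllamaContainer("ollama/ollama:latest").withReuse(true)`.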
## Key Characteristics
| Property | Value |
|---|---|
| Requires | Docker; Java 17+ for the Spring Boot 3.1 setup shown |
| Spring Boot integration | @ServiceConnection (Spring Boot 3.1+) |
| Quarkus integration | Automatic via Dev Services |
| CI caching | Cache Ollama model downloads |