Technology Radar

Diffblue Testing Agent

testing
Trial

The Diffblue Testing Agent (GA March 24, 2026) autonomously generates regression test suites for Java codebases without developer intervention. In Diffblue's benchmarks across eight real Java projects, it delivered 81% average line coverage and 61% mutation coverage, compared with 32% line coverage achieved by a senior developer iterating for two hours with Claude Code.

Why It's in Trial

The roughly 2.5× line-coverage advantage over AI-assisted manual test writing (81% versus 32%) is a specific, checkable data point rather than a vague marketing claim, though no independent replication has been published yet. The agent handles the full pipeline autonomously: coverage analysis, build configuration, test plan creation, parallelized generation, compile-and-run verification, rollback of failing tests, and PR preparation. It integrates with GitHub Copilot and Claude Code, so it fits into existing enterprise workflows rather than replacing them.
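As a quick check on the headline ratio, the arithmetic can be reproduced from the percentages quoted in this entry. This is only a sketch: the dictionary and function names are illustrative, and the figures are the reported ones, not measurements made here.

```python
# Coverage percentages as reported in this entry (not measured here).
agent = {"line": 81, "mutation": 61}
manual_ai = {"line": 32, "mutation": 24}


def advantage(metric: str) -> float:
    """Ratio of agent coverage to AI-assisted manual coverage."""
    return agent[metric] / manual_ai[metric]


for metric in ("line", "mutation"):
    print(f"{metric}: {advantage(metric):.2f}x")
# line: 2.53x
# mutation: 2.54x
```

Both ratios come out at about 2.5, so the claimed advantage holds for mutation coverage as well as line coverage, at least on the reported numbers.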

At the same time, the agent only reached GA on March 24, 2026. It is available for enterprise evaluation only, and there is no broad community adoption data yet. Its Java and Python focus narrows its applicability, and pricing is enterprise-tier (not publicly listed). A few sessions of real-world use on medium-to-large Java codebases would confirm whether the benchmark numbers hold outside the benchmarked projects.
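One way to run that check on your own codebase is to read line coverage from a coverage report before and after a trial session. The sketch below assumes the project produces a JaCoCo XML report, which this entry does not state. The sample document is a simplified, hand-written stand-in for a real report, and `line_coverage` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written stand-in for a JaCoCo XML report (assumption:
# the project under evaluation uses JaCoCo; this entry does not say so).
SAMPLE_REPORT = """<report name="demo">
  <package name="com/example">
    <counter type="LINE" missed="5" covered="15"/>
  </package>
  <counter type="INSTRUCTION" missed="40" covered="160"/>
  <counter type="LINE" missed="19" covered="81"/>
</report>"""


def line_coverage(report_xml: str) -> float:
    """Return report-level line coverage as a fraction between 0 and 1."""
    root = ET.fromstring(report_xml)
    # Direct children of <report> hold the totals; nested counters belong
    # to individual packages, classes, and methods.
    for counter in root.findall("counter"):
        if counter.get("type") == "LINE":
            missed = int(counter.get("missed"))
            covered = int(counter.get("covered"))
            return covered / (missed + covered)
    raise ValueError("no report-level LINE counter found")


print(f"{line_coverage(SAMPLE_REPORT):.0%}")  # 81%
```

Comparing this number across a few representative modules, rather than a single project-wide total, gives a better sense of whether the benchmark results carry over.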

What the Agent Does

Rather than replacing your AI coding agent, Diffblue Testing Agent acts as an orchestration layer above it:

  1. Scans the codebase to identify classes and methods lacking test coverage
  2. Generates a coverage-aware test plan
  3. Delegates method- and class-level test creation to the underlying AI coding platform (Copilot, Claude Code)
  4. Verifies every generated test compiles and passes; rolls back failures automatically
  5. Produces a PR with all verified tests and a coverage report

This orchestration approach is the likely reason it outperforms a developer manually prompting an AI coding agent: the agent, not the developer, drives each generate-and-verify cycle, so the developer bottleneck disappears.
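The five steps above can be sketched as a small control loop. This is a minimal illustration of the orchestration pattern, not Diffblue's implementation. Every name here (`run_pipeline`, `Report`, the callbacks) is hypothetical, generation runs sequentially rather than in parallel, and the returned report stands in for the PR.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Report:
    """Stand-in for the PR contents: tests kept and tests rolled back."""
    kept: list[str] = field(default_factory=list)
    rolled_back: list[str] = field(default_factory=list)


def run_pipeline(
    coverage: dict[str, bool],
    generate_test: Callable[[str], str],
    compiles_and_passes: Callable[[str], bool],
) -> Report:
    # Step 1: identify methods that lack test coverage.
    uncovered = [method for method, covered in coverage.items() if not covered]
    # Step 2: build a test plan (here simply a deterministic ordering).
    plan = sorted(uncovered)
    report = Report()
    for method in plan:
        # Step 3: delegate test creation to the underlying coding agent.
        test = generate_test(method)
        # Step 4: keep only tests that compile and pass; roll back the rest.
        if compiles_and_passes(test):
            report.kept.append(test)
        else:
            report.rolled_back.append(test)
    # Step 5: the report is what would be attached to the PR.
    return report


# Toy usage with fake generator and verifier callbacks.
coverage = {"Order.total": True, "Order.discount": False, "Cart.add": False}
report = run_pipeline(
    coverage,
    generate_test=lambda m: f"test_{m.replace('.', '_')}",
    compiles_and_passes=lambda t: "discount" not in t,
)
print(report.kept)         # ['test_Cart_add']
print(report.rolled_back)  # ['test_Order_discount']
```

The verification step is the point of the design: a generated test enters the result only after it has been checked, so nothing depends on a developer reviewing each attempt.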

Caveats

  • Enterprise-only pricing; no community tier as of GA launch
  • Diffblue selected the benchmark projects; no independent replication has been published
  • Initial release covers Java and Python; Kotlin and other JVM languages are not yet supported
  • The 81% figure is line coverage. Mutation coverage (61%) is the more meaningful metric, and it still compares well with manual AI-assisted testing (24%)
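To see why mutation coverage is the stricter metric, consider a toy example. Both test suites below execute every line of `is_adult`, so both score 100% line coverage, but only one of them detects deliberately introduced bugs (mutants). The function, the hand-written mutants, and the suites are all hypothetical. Real mutation-testing tools generate mutants automatically, and this is not how the benchmark figures were produced.

```python
def is_adult(age: int) -> bool:
    return age >= 18


# Hand-written mutants: small deliberate bugs in the original logic.
MUTANTS = [
    lambda age: age > 18,   # boundary changed
    lambda age: True,       # condition replaced by a constant
    lambda age: age < 18,   # comparison inverted
]


def weak_suite(fn) -> bool:
    """Passes if fn behaves correctly for one typical input."""
    return fn(30) is True


def strong_suite(fn) -> bool:
    """Also checks the boundary and a negative case."""
    return fn(30) is True and fn(18) is True and fn(17) is False


def mutation_score(suite, mutants) -> float:
    """Fraction of mutants the suite kills, i.e. fails on."""
    killed = sum(1 for mutant in mutants if not suite(mutant))
    return killed / len(mutants)


print(f"weak:   {mutation_score(weak_suite, MUTANTS):.0%}")    # weak:   33%
print(f"strong: {mutation_score(strong_suite, MUTANTS):.0%}")  # strong: 100%
```

A suite can reach high line coverage while asserting very little. The mutation score shows whether the tests would actually catch a regression.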

Key Characteristics

| Property | Value |
| --- | --- |
| Language support | Java, Python |
| Deployment | Enterprise (cloud agent) |
| Integrations | GitHub Copilot, Claude Code |
| Founded | University of Oxford spinout |
| Website | diffblue.com |
| GA date | March 24, 2026 |
| Benchmark source | diffblue.com/benchmarks |

Further Reading