Full deep dive: Uber AI Developer Platform Architecture Breakdown
Uber has built one of the most comprehensive AI developer productivity suites in the industry — a constellation of specialized agents covering code generation, code review, migrations, and test generation. With 84% of its developers using agentic coding tools and 11% of PRs opened by agents, Uber demonstrates the power of domain-specific agents over general-purpose ones.
The Agent Suite
Unlike Stripe (one system, "Minions") or Spotify (one agent, "Honk"), Uber has built multiple purpose-built agents, each targeting a specific part of the development lifecycle.
Minion (Background Agent Platform)
Uber's internal background agent platform with full monorepo access. Engineers submit prompts via web, Slack, or CLI; the system generates code changes and opens PRs automatically.
- ~1,800 code changes weekly, growing from under 1% to 8% of all code changes
- Runs in isolated environments with monorepo access
- Routes tasks to specialized sub-agents based on task type
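The routing step can be pictured as a dispatch table from task type to sub-agent. This is a hypothetical sketch under assumed names — Uber has not published Minion's internals, so the `Task` shape, handlers, and task types here are all illustrative:

```python
# Illustrative sketch of routing a submitted prompt to a specialized
# sub-agent, as Minion is described to do. All names are assumptions.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Task:
    prompt: str
    task_type: str  # e.g. "bugfix", "test_gen" (hypothetical types)

def handle_bugfix(task: Task) -> str:
    return f"PR: fix for '{task.prompt}'"   # stand-in for a coding sub-agent

def handle_test_gen(task: Task) -> str:
    return f"PR: tests for '{task.prompt}'"  # stand-in for a test sub-agent

ROUTES: dict[str, Callable[[Task], str]] = {
    "bugfix": handle_bugfix,
    "test_gen": handle_test_gen,
}

def route(task: Task) -> str:
    # Dispatch to the sub-agent registered for this task type.
    handler = ROUTES.get(task.task_type)
    if handler is None:
        raise ValueError(f"no sub-agent for task type {task.task_type!r}")
    return handler(task)

print(route(Task("null deref in rider flow", "bugfix")))
```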
uReview (AI Code Review)
A multi-stage GenAI review system analyzing 90% of ~65,000 weekly diffs.
The architecture uses prompt chaining — breaking review into four sequential sub-tasks:
- Comment generation — Analyze the diff and produce review comments
- Filtering — Remove low-value or noisy comments
- Validation — Check comments for accuracy
- Deduplication — Eliminate redundant feedback
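The four stages above can be sketched as a chain where each stage consumes the previous stage's output. In uReview each stage would be its own LLM call; in this minimal sketch the comment generator is a stub and the later stages are plain functions, so every name and data shape is an illustrative assumption, not Uber's actual code:

```python
# Minimal sketch of a four-stage prompt chain:
# generation -> filtering -> validation -> deduplication.

def generate_comments(diff: str) -> list[dict]:
    # Stage 1 (stubbed): a real system would prompt a model with the diff.
    return [
        {"line": 3, "text": "possible nil dereference", "confidence": 0.9},
        {"line": 3, "text": "possible nil dereference", "confidence": 0.8},
        {"line": 99, "text": "magic number", "confidence": 0.7},
        {"line": 1, "text": "nit: spacing", "confidence": 0.2},
    ]

def filter_comments(comments: list[dict], threshold: float = 0.5) -> list[dict]:
    # Stage 2: drop low-value or noisy comments.
    return [c for c in comments if c["confidence"] >= threshold]

def validate_comments(comments: list[dict], diff: str) -> list[dict]:
    # Stage 3: keep only comments that point at lines present in the diff.
    n_lines = len(diff.splitlines())
    return [c for c in comments if 1 <= c["line"] <= n_lines]

def dedupe_comments(comments: list[dict]) -> list[dict]:
    # Stage 4: eliminate redundant feedback.
    seen, out = set(), []
    for c in comments:
        key = (c["line"], c["text"])
        if key not in seen:
            seen.add(key)
            out.append(c)
    return out

def review(diff: str) -> list[dict]:
    comments = generate_comments(diff)
    comments = filter_comments(comments)
    comments = validate_comments(comments, diff)
    return dedupe_comments(comments)

print(review("a\nb\nc\nd"))  # the duplicate, off-diff, and nit comments drop out
```

The appeal of chaining is that each stage has one narrow job, so its prompt (or rule set) stays small and each stage's output can be inspected independently.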
Results: 75% of comments marked useful by engineers, 65% addressed.
uReview uses a pluggable assistant framework where each assistant focuses on a specific issue class (security, performance, style, etc.), rather than trying to review everything in one pass.
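A pluggable framework like this might look as follows. This is a sketch under assumed names — the interface, the example assistants, and their toy heuristics are all hypothetical, standing in for what would be model-backed reviewers:

```python
# Hypothetical pluggable assistant framework: each assistant reviews one
# issue class, and the framework fans the diff out to all of them.
from abc import ABC, abstractmethod

class Assistant(ABC):
    issue_class: str

    @abstractmethod
    def review(self, diff: str) -> list[str]:
        ...

class SecurityAssistant(Assistant):
    issue_class = "security"
    def review(self, diff: str) -> list[str]:
        # Toy heuristic in place of a model call.
        return [f"[security] possible hardcoded credential on line {i}"
                for i, line in enumerate(diff.splitlines(), 1)
                if "password=" in line]

class StyleAssistant(Assistant):
    issue_class = "style"
    def review(self, diff: str) -> list[str]:
        return [f"[style] trailing whitespace on line {i}"
                for i, line in enumerate(diff.splitlines(), 1)
                if line != line.rstrip()]

def run_review(diff: str, assistants: list[Assistant]) -> list[str]:
    # Fan out to every registered assistant and collect their comments.
    comments: list[str] = []
    for assistant in assistants:
        comments.extend(assistant.review(diff))
    return comments

print(run_review("password=abc\nok ", [SecurityAssistant(), StyleAssistant()]))
```

Adding a new issue class then means registering one new assistant, without touching or re-tuning the others.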
Shepherd (Migration Agent)
Manages large-scale migrations end-to-end — dependency upgrades, API transitions, framework changes across the monorepo. This is the type of work that traditionally required dedicated migration teams working for months.
AutoCover (Test Generation)
An autonomous test generation agent that raised platform coverage by ~10%, equivalent to 21,000 developer hours saved. Generates thousands of tests monthly, targeting uncovered code paths.
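One way such an agent might pick targets is to rank functions by uncovered lines from a coverage report, then generate tests for the worst offenders first. The report format and names below are assumptions for illustration, not AutoCover's actual design:

```python
# Sketch: rank functions by uncovered-line count and pick the top targets
# for test generation. Coverage format and names are hypothetical.

def pick_targets(coverage: dict[str, tuple[int, int]], limit: int = 2) -> list[str]:
    """coverage maps function name -> (covered_lines, total_lines)."""
    ranked = sorted(coverage.items(),
                    key=lambda kv: kv[1][1] - kv[1][0],  # uncovered lines
                    reverse=True)
    # Skip fully covered functions; keep at most `limit` targets.
    return [name for name, (cov, tot) in ranked[:limit] if cov < tot]

report = {"parse_fare": (2, 40), "format_eta": (30, 32), "ping": (5, 5)}
print(pick_targets(report))  # → ['parse_fare', 'format_eta']
```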
Key Design Principles
Uber's engineering team has documented several principles from their agent deployments:
- Domain-specific agents beat general-purpose agents. Each agent is purpose-built for its task, with curated tools and prompts.
- Compose LLM agents with deterministic sub-agents. Not everything needs AI — mix AI reasoning with rule-based logic where possible.
- Bottom-up adoption works better than top-down mandates. "Quiet experimentation" by teams proved more effective than company-wide rollouts.
Cost Reality
Uber has been transparent about costs: AI infrastructure costs are up 6x since 2024. This is a useful counterpoint to the productivity gains — agent-generated code isn't free, and organizations need to budget accordingly.
Why It's in Assess
Uber's multi-agent approach is the most comprehensive public example of specialized AI agents across the full SDLC. The uReview prompt-chaining architecture and AutoCover's coverage impact are particularly well-documented and transferable. However, this suite is deeply integrated with Uber's monorepo and internal infrastructure, and the 6x cost increase is a real consideration. Assess the domain-specific agent pattern and the prompt-chaining approach for code review — these are the most immediately applicable ideas.
Key Characteristics
| Property | Value |
|---|---|
| Company | Uber |
| System | Minion, uReview, Shepherd, AutoCover |
| Architecture | Suite of domain-specific agents |
| Developer adoption | 84% using agentic coding tools |
| Agent PR share | 11% of all PRs |
| Code review coverage | 90% of 65K weekly diffs (uReview) |
| Test impact | +10% coverage, 21K dev hours saved (AutoCover) |
| Key innovation | Domain-specific agents + prompt-chaining for code review |
| Open source | No (internal systems) |
| Sources | Pragmatic Engineer, uReview - Uber Blog |