Devin

agent multi-agent

This item was not updated in the last three versions of the Radar. If it appeared in one of the more recent editions, there is a good chance it remains pertinent. If it dates back further, its relevance may have diminished and our current evaluation could differ. Regrettably, our capacity to consistently revisit items from past Radar editions is limited.

Mar 2026

Trial

Devin by Cognition Labs is an autonomous AI software engineer that can plan and execute multi-hour engineering tasks with minimal human input. It has matured significantly since its controversial 2024 debut and now warrants Trial.

Architecture Deep Dive → Devin Architecture Breakdown — cloud VM sandboxing, agent-native IDE design, Devin Search/Wiki knowledge layer, and how Cognition's acquisition of Windsurf changes the competitive picture.

Why It Moved from Assess to Trial

Devin generated enormous hype when announced in early 2024, followed by skepticism about its benchmark methodology. Since then:

Devin 2.0 price drop (April 2025): From $500/mo to $20/mo base + $2.25/ACU (Agent Compute Unit, ~15 min of active work). This removed the biggest barrier to trying it.
Enterprise adoption: Goldman Sachs began rolling out Devin alongside its roughly 12,000 developers (July 2025). Santander and Nubank are also named customers.
Acquired Windsurf (July 2025, reportedly ~$250M): Cognition now owns both an autonomous agent and an AI IDE, a strategically unique position.
$10.2B valuation (September 2025): Strong investor signal on the autonomous agent category.
Competition validates the category: Claude Code, GitHub Copilot Coding Agent, and OpenAI Codex all now offer similar capabilities, confirming that the pattern works.

Where Devin Excels

Long-running tasks requiring many sequential steps (browser + shell + code)
Well-defined, self-contained tasks with clear acceptance criteria
Tasks that need a sandboxed environment (Devin runs in a cloud workspace)

Limitations

Complex, ambiguous engineering tasks still require significant human guidance
Cost per task can be high for iterative work. At $2.25/ACU (~15 minutes), a task that takes 2 hours of agent compute uses about 8 ACUs and costs ~$18. If the task fails and needs rework, you pay again. For comparison, Claude Code on API billing typically costs $1–5 per complex task (token-based), and GitHub Copilot's agent mode is included in the $19/seat subscription. Devin's model makes sense for well-defined tasks with high success rates, but becomes expensive for exploratory or iterative work.
Less flexible than open tools (Claude Code, Cline) for teams that want control
No model choice. You use Cognition's models. If Claude or GPT outperforms on your workload, you can't switch — unlike Cursor (multi-model), OpenHands (any model), or GitHub Copilot (multi-model).
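
The cost arithmetic in the limitations above can be sketched in a few lines of Python. This is a rough estimate only: the function names are hypothetical, fractional ACUs are assumed to be billed pro rata, the $20/mo base fee is left out, and every retry is assumed to cost as much as the first attempt. None of these assumptions is confirmed by Cognition's billing documentation.

```python
ACU_PRICE_USD = 2.25   # per-ACU price quoted in this entry
ACU_MINUTES = 15       # approximate active agent work per ACU, per this entry


def devin_task_cost(agent_minutes: float, attempts: int = 1) -> float:
    """Estimated usage cost in USD, excluding the monthly base fee.

    Assumes pro-rata billing of partial ACUs and identical cost per attempt.
    """
    if agent_minutes < 0 or attempts < 1:
        raise ValueError("agent_minutes must be >= 0 and attempts >= 1")
    acus = agent_minutes / ACU_MINUTES
    return round(acus * ACU_PRICE_USD * attempts, 2)


def expected_task_cost(agent_minutes: float, success_rate: float) -> float:
    """Expected cost when a failed run is retried until one succeeds.

    Assumes independent attempts, so the expected number of runs
    is 1 / success_rate (mean of a geometric distribution).
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return round(devin_task_cost(agent_minutes) / success_rate, 2)


if __name__ == "__main__":
    print(devin_task_cost(120))              # the 2-hour example from the text
    print(devin_task_cost(120, attempts=2))  # same task after one failed run
    print(expected_task_cost(120, 0.5))      # task that succeeds half the time
```

Under these assumptions, the 2-hour task comes to $18 per attempt, and a task that succeeds only half the time averages $36, which is why the pricing favors well-defined work with a high success rate.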

Key Characteristics

Property	Value
Interface	Web (SaaS), Slack integration
Pricing	$20/mo base + $2.25/ACU (~15 min of work)
Provider	Cognition Labs
Website	cognition.ai
