Technology Radar
Trial

GPT-5.5, released April 23, 2026, is the first fully retrained OpenAI base model since GPT-4.5. It is a ground-up architecture rebuild with native unified multimodal support (text, image, audio, video), co-designed with NVIDIA's GB200/GB300 NVL72 rack-scale systems, and it posts stronger agentic benchmark results than its predecessor, GPT-5.4.

Why It's in Trial

GPT-5.5 enters the radar in Trial because most of its benchmark scores are self-reported by OpenAI and have not yet been independently reproduced. For SWE-bench Verified, three independent trackers (vals.ai, llm-stats.com, BenchLM) have converged on 82.6%, which is lower than Claude Opus 4.7's 87.6% on the same benchmark. SWE-bench Pro, at 58.6%, is confirmed via Scale AI's leaderboard context and is also lower than Claude Opus 4.7 (64.3%). Together these results suggest specialization: GPT-5.5 leads on agentic terminal tasks (Terminal-Bench 2.0: 82.7%), while Opus 4.7 leads on complex real-world coding. Move to Adopt once the SWE-bench Verified score is independently reproduced and LMSYS Arena rankings are confirmed.

Key upgrade signals from GPT-5.4:

  • Architecture rebuilt from scratch — not an incremental fine-tune; the first full pretraining overhaul since GPT-4.5
  • Unified multimodal pipeline — text, images, audio, and video processed end-to-end in a single model (GPT-5.4 had modality-specific sub-systems)
  • NVIDIA silicon co-design — optimised for GB200 and GB300 NVL72 at the hardware level; expected to benefit latency and cost at scale
  • Higher token efficiency — OpenAI reports fewer tokens needed for equivalent quality vs. GPT-5.4, partially offsetting the 2× price increase on the standard tier
  • Stronger agentic task completion — OSWorld-Verified (computer use) up to 78.7%; Tau2-bench Telecom at 98.0%
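The interaction between token efficiency and the 2× price increase can be checked with a little arithmetic. The sketch below is illustrative only: the GPT-5.4 prices are assumed to be exactly half the GPT-5.5 prices (derived from the "2×" statement, not from a published price list), the token counts are made up, and the saving is assumed to apply equally to input and output tokens.

```python
# Prices in USD per million tokens. GPT-5.5 figures come from this entry's
# Pricing section; GPT-5.4 figures are ASSUMED to be exactly half of them.
GPT_55_INPUT, GPT_55_OUTPUT = 5.00, 30.00
GPT_54_INPUT, GPT_54_OUTPUT = GPT_55_INPUT / 2, GPT_55_OUTPUT / 2


def job_cost(input_tokens: int, output_tokens: int,
             input_price: float, output_price: float) -> float:
    """Return the USD cost of one job at the given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cost_ratio(token_reduction: float,
               input_tokens: int = 200_000,
               output_tokens: int = 50_000) -> float:
    """Return GPT-5.5 cost divided by GPT-5.4 cost for the same job.

    token_reduction is the hypothetical fraction of tokens GPT-5.5 saves
    (0.30 means 30% fewer input and output tokens). The default token
    counts are arbitrary example values.
    """
    baseline = job_cost(input_tokens, output_tokens, GPT_54_INPUT, GPT_54_OUTPUT)
    kept = 1.0 - token_reduction
    newer = job_cost(round(input_tokens * kept), round(output_tokens * kept),
                     GPT_55_INPUT, GPT_55_OUTPUT)
    return newer / baseline


for reduction in (0.0, 0.3, 0.5):
    print(f"{reduction:.0%} fewer tokens -> {cost_ratio(reduction):.2f}x the GPT-5.4 cost")
```

Under these assumptions, GPT-5.5 only reaches cost parity when it needs half as many tokens; any smaller saving offsets the price increase partially, which matches the wording above.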

Performance (Self-Reported)

| Benchmark | Score | Notes |
| --- | --- | --- |
| Terminal-Bench 2.0 | 82.7% | Agentic terminal task completion |
| FrontierMath (Lvl 1–3) | 51.7% | Graduate-level mathematics |
| FrontierMath (Lvl 4) | 35.4% | Olympiad/research-level maths |
| GDPval | 84.9% | Complex knowledge work across 44 occupations |
| OSWorld-Verified | 78.7% | Real computer environment operation |
| Tau2-bench Telecom | 98.0% | Complex customer-service agent workflows |
| SWE-bench Verified | 82.6% | Secondary sources only; 3 independent trackers converge; primary not yet confirmed |
| SWE-bench Pro | 58.6% | Lower than Claude Opus 4.7 (64.3%) |

All scores are from OpenAI's April 23 announcement except SWE-bench Verified, which is reported by secondary sources, and SWE-bench Pro, which is sourced from Scale AI leaderboard context. For SWE-bench Verified, three independent trackers now converge on 82.6%: vals.ai, llm-stats.com, and BenchLM. tokenmix.ai's 88.7% figure remains an outlier, likely due to a different evaluation harness or an unverified self-report. Independent reproduction is still pending.
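As a rough illustration of why the tokenmix.ai figure is treated as an outlier, the sketch below takes the four SWE-bench Verified numbers quoted in this entry, uses their median as the consensus, and flags anything far from it. The 1-point threshold and the function name are assumptions made for the example, not a method described by any of the trackers.

```python
from statistics import median

# SWE-bench Verified scores as quoted in this entry, keyed by tracker.
reported = {
    "vals.ai": 82.6,
    "llm-stats.com": 82.6,
    "BenchLM": 82.6,
    "tokenmix.ai": 88.7,
}

# ASSUMPTION: a gap of more than 1 percentage point from the median counts
# as an outlier. The threshold is illustrative and not from the source.
OUTLIER_THRESHOLD = 1.0


def find_outliers(scores: dict[str, float], threshold: float) -> list[str]:
    """Return the trackers whose score differs from the median by more than threshold."""
    consensus = median(scores.values())
    return [name for name, score in scores.items()
            if abs(score - consensus) > threshold]


consensus = median(reported.values())
outliers = find_outliers(reported, OUTLIER_THRESHOLD)
print(f"consensus: {consensus}%, outliers: {outliers}")
```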

GPT-5.5 Instant (May 5, 2026)

OpenAI released GPT-5.5 Instant on May 5, 2026, replacing GPT-5.3 Instant as the default ChatGPT model for all users. It is available in the API as chat-latest. Key improvements over GPT-5.3 Instant:

  • 52.5% fewer hallucinated claims on high-stakes prompts (medicine, law, finance)
  • 30.2% fewer words and 29.2% fewer lines — more concise, practical tone with fewer unsolicited emoji
  • Enhanced personalization from past chats, files, and connected Gmail (rolling out to Plus/Pro first)

GPT-5.5 Instant is a separate, smaller model tuned for speed and conversation — distinct from the flagship GPT-5.5 covered in this entry. Paid users retain access to GPT-5.3 Instant for three months before it is retired.

Sources: OpenAI announcement, TechCrunch

Pricing

| Tier | Input | Cached Input | Output |
| --- | --- | --- | --- |
| Standard API | $5.00/M | $0.50/M | $30.00/M |

GPT-5.5 Pro pricing not yet confirmed in accessible sources.
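To make the Standard API prices concrete, here is a minimal cost estimator. It assumes that cached tokens are a subset of the input tokens and are billed at the cached rate while the remainder is billed at the full input rate; the function name and the example token counts are hypothetical.

```python
# USD per million tokens, taken from the Standard API pricing above.
INPUT_PRICE = 5.00
CACHED_INPUT_PRICE = 0.50
OUTPUT_PRICE = 30.00


def estimate_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request.

    ASSUMPTION: cached_tokens is the part of input_tokens billed at the
    cached rate; the rest of the input is billed at the full input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    uncached = input_tokens - cached_tokens
    total = (uncached * INPUT_PRICE
             + cached_tokens * CACHED_INPUT_PRICE
             + output_tokens * OUTPUT_PRICE)
    return total / 1_000_000


# Example: a 100K-token prompt with and without an 80K-token cached prefix.
print(f"no cache:   ${estimate_cost(100_000, 0, 5_000):.2f}")
print(f"with cache: ${estimate_cost(100_000, 80_000, 5_000):.2f}")
```

In this example the output tokens dominate once most of the prompt is cached, so prompt caching helps most on workloads with long, repeated context and short answers.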

Context Window

  • 1M tokens via API (same as GPT-5.4)
  • 400K tokens in Codex (reduced vs. GPT-5.4's full 1M in Codex)

Relationship to GPT-5.4

GPT-5.5 supersedes GPT-5.4 as OpenAI's flagship model. GPT-5.4 remains available and is the safer choice for workloads where:

  • Budget is constrained (GPT-5.4 is 2× cheaper at standard tier)
  • Independent benchmark validation is required before adoption
  • Codex integrations depend on full 1M-token context (GPT-5.4 has 1M in Codex; GPT-5.5 caps at 400K)
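The three criteria above can be written down as a simple decision rule. This is only a sketch of the guidance in this entry: the `Workload` fields and `choose_model` are hypothetical names, and the `gpt-5.4` identifier is assumed by analogy with the `gpt-5.5` model ID.

```python
from dataclasses import dataclass

# Codex context limits in tokens, as stated in this entry.
CODEX_CONTEXT_LIMIT = {"gpt-5.4": 1_000_000, "gpt-5.5": 400_000}


@dataclass
class Workload:
    """Hypothetical description of a workload being evaluated."""
    budget_constrained: bool
    needs_independent_validation: bool
    runs_in_codex: bool
    max_context_tokens: int


def choose_model(workload: Workload) -> str:
    """Return the model this entry's guidance points to for the workload."""
    # Cost and validation concerns both favour the older model.
    if workload.budget_constrained or workload.needs_independent_validation:
        return "gpt-5.4"
    # In Codex, GPT-5.5 caps at 400K tokens, so larger contexts need GPT-5.4.
    if (workload.runs_in_codex
            and workload.max_context_tokens > CODEX_CONTEXT_LIMIT["gpt-5.5"]):
        return "gpt-5.4"
    return "gpt-5.5"


print(choose_model(Workload(False, False, True, 750_000)))   # gpt-5.4
print(choose_model(Workload(False, False, False, 750_000)))  # gpt-5.5
```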

Key Characteristics

| Property | Value |
| --- | --- |
| License | Proprietary |
| Provider | OpenAI |
| Context window | 1,000,000 tokens (API) / 400,000 tokens (Codex) |
| Pricing | $5.00/M input, $30.00/M output |
| API model ID | gpt-5.5 |
| Release date | April 23, 2026 |
| Modalities | Text, image, audio, video |

GPT-5.5 Instant: Key Characteristics

OpenAI released GPT-5.5 Instant on May 5, 2026, as the new default model for ChatGPT, replacing GPT-5.3 Instant. It is an inference-optimised variant of GPT-5.5 targeting lower-latency chat use cases.

| Property | Value |
| --- | --- |
| ChatGPT default | Yes (replaces GPT-5.3 Instant from May 5) |
| API alias | chat-latest |
| Context window | 400K tokens |
| Pricing | $5.00/M input, $30.00/M output (batch/flex: $2.50/$15.00) |

Key quality improvements over GPT-5.3 Instant: 52.5% fewer hallucinated claims on high-stakes prompts (medicine, law, finance), and 37.3% fewer inaccurate claims on user-flagged challenging conversations. The chat-latest alias routes to whichever Instant model currently powers ChatGPT; the previous model remains addressable for paying API customers for 90 days after each swap.
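Because the alias can start pointing at a different model, teams that depend on the previous one may want to know when the 90-day window closes. The sketch below computes that date for the May 5, 2026 swap. It assumes the window is counted in calendar days starting from the swap date; the exact cutoff rule is not stated in the sources, and the function name is hypothetical.

```python
from datetime import date, timedelta

# Grace period during which the previous model stays addressable (per this entry).
GRACE_PERIOD = timedelta(days=90)


def previous_model_cutoff(swap_date: date) -> date:
    """Return the last date the previous model is assumed to be addressable.

    ASSUMPTION: the 90 days are calendar days counted from the swap date.
    """
    return swap_date + GRACE_PERIOD


cutoff = previous_model_cutoff(date(2026, 5, 5))
print(cutoff.isoformat())  # 2026-08-03
```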

Sources: OpenAI — GPT-5.5 Instant, TechCrunch

Further Reading