GPT-5.5, released April 23, 2026, is the first fully retrained OpenAI base model since GPT-4.5 — a ground-up architecture rebuild with native unified multimodal support (text, image, audio, video), co-designed with NVIDIA's GB200/GB300 NVL72 rack-scale systems, and stronger agentic benchmark performance than its predecessor.
Why It's in Trial
GPT-5.5 enters the radar in Trial because most benchmark scores are self-reported by OpenAI. The SWE-bench Verified figure comes only from secondary sources: three independent trackers (vals.ai, llm-stats.com, BenchLM) converge on 82.6%, which is below Claude Opus 4.7's 87.6% on the same benchmark. SWE-bench Pro at 58.6% is confirmed via Scale AI's leaderboard context and is also below Claude Opus 4.7 (64.3%). Together these results suggest specialization: GPT-5.5 leads on agentic terminal tasks (Terminal-Bench 2.0: 82.7%) while Opus 4.7 leads on complex real-world coding. Move to Adopt once SWE-bench Verified is independently reproduced and LMSYS Arena rankings are confirmed.
Key upgrade signals over GPT-5.4:
- Architecture rebuilt from scratch — not an incremental fine-tune; the first full pretraining overhaul since GPT-4.5
- Unified multimodal pipeline — text, images, audio, and video processed end-to-end in a single model (GPT-5.4 had modality-specific sub-systems)
- NVIDIA silicon co-design — optimised for GB200 and GB300 NVL72 at the hardware level; expected to benefit latency and cost at scale
- Higher token efficiency — OpenAI reports fewer tokens needed for equivalent quality vs. GPT-5.4, partially offsetting the 2× price increase on the standard tier
- Stronger agentic task completion — OSWorld-Verified (computer use) up to 78.7%; Tau2-bench Telecom at 98.0%
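To make the token-efficiency point concrete, here is a minimal sketch of the break-even arithmetic. It assumes the 2× multiplier applies uniformly to input and output tokens and that any token reduction is uniform as well. The 30% reduction in the example is a hypothetical figure, not one reported by OpenAI, and the function names are illustrative.

```python
def relative_cost(price_multiplier: float, token_ratio: float) -> float:
    """Spend on the new model relative to the old one (1.0 means equal spend).

    price_multiplier: new per-token price divided by the old per-token price.
    token_ratio: tokens used by the new model divided by tokens used by the old.
    """
    if price_multiplier <= 0 or token_ratio <= 0:
        raise ValueError("both arguments must be positive")
    return price_multiplier * token_ratio


def break_even_token_ratio(price_multiplier: float) -> float:
    """Token ratio at which the new model costs exactly as much as the old one."""
    if price_multiplier <= 0:
        raise ValueError("price_multiplier must be positive")
    return 1.0 / price_multiplier


# From the text: 2x price increase on the standard tier.
PRICE_MULTIPLIER = 2.0

# Hypothetical (assumed, not reported): GPT-5.5 needs 30% fewer tokens per task.
hypothetical_token_ratio = 0.70

print(break_even_token_ratio(PRICE_MULTIPLIER))
print(relative_cost(PRICE_MULTIPLIER, hypothetical_token_ratio))
```

Under these assumptions GPT-5.5 would need to use half the tokens or fewer to match GPT-5.4's spend. A smaller saving offsets only part of the price increase, which matches the "partially offsetting" wording above.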
Performance (Self-Reported)
| Benchmark | Score | Notes |
|---|---|---|
| Terminal-Bench 2.0 | 82.7% | Agentic terminal task completion |
| FrontierMath (Lvl 1–3) | 51.7% | Graduate-level mathematics |
| FrontierMath (Lvl 4) | 35.4% | Olympiad/research-level maths |
| GDPval | 84.9% | Complex knowledge work across 44 occupations |
| OSWorld-Verified | 78.7% | Real computer environment operation |
| Tau2-bench Telecom | 98.0% | Complex customer-service agent workflows |
| SWE-bench Verified | 82.6% | Secondary sources only; 3 independent trackers converge; primary not yet confirmed |
| SWE-bench Pro | 58.6% | Lower than Claude Opus 4.7 (64.3%) |
All scores from OpenAI's April 23 announcement except SWE-bench Verified, which is reported by secondary sources. Three independent trackers now converge on 82.6%: vals.ai, llm-stats.com, and BenchLM. tokenmix.ai's 88.7% figure remains an outlier — likely a different evaluation harness or unverified self-report. Independent reproduction still pending. SWE-bench Pro score sourced from Scale AI leaderboard context.
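The consensus-versus-outlier reasoning above can be sketched as a simple median check. The scores are the ones cited in this entry. The 1.0-point tolerance and the function name are assumptions chosen for illustration, not a method used by any of the trackers.

```python
from statistics import median

# SWE-bench Verified figures (percent) as cited in this entry.
reported = {
    "vals.ai": 82.6,
    "llm-stats.com": 82.6,
    "BenchLM": 82.6,
    "tokenmix.ai": 88.7,
}


def split_consensus(scores: dict[str, float], tolerance: float = 1.0):
    """Return (median score, agreeing sources, outlier sources).

    A source counts as agreeing when its score lies within `tolerance`
    percentage points of the median; the tolerance is an arbitrary choice.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    center = median(scores.values())
    agreeing = sorted(s for s, v in scores.items() if abs(v - center) <= tolerance)
    outliers = sorted(s for s, v in scores.items() if abs(v - center) > tolerance)
    return center, agreeing, outliers


center, agreeing, outliers = split_consensus(reported)
print(center, agreeing, outliers)
```

Agreement between trackers is not the same as independent reproduction: trackers may be relaying the same upstream number, so this check only flags disagreement.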
GPT-5.5 Instant (May 5, 2026)
OpenAI released GPT-5.5 Instant on May 5, 2026, replacing GPT-5.3 Instant as the default ChatGPT model for all users. It is available in the API as chat-latest. Key improvements over GPT-5.3 Instant:
- 52.5% fewer hallucinated claims on high-stakes prompts (medicine, law, finance)
- 30.2% fewer words and 29.2% fewer lines — more concise, practical tone with fewer unsolicited emoji
- Enhanced personalization from past chats, files, and connected Gmail (rolling out to Plus/Pro first)
GPT-5.5 Instant is a separate, smaller model tuned for speed and conversation — distinct from the flagship GPT-5.5 covered in this entry. Paid users retain access to GPT-5.3 Instant for three months before it is retired.
Sources: OpenAI announcement, TechCrunch
Pricing
| Tier | Input | Cached Input | Output |
|---|---|---|---|
| Standard API | $5.00/M | $0.50/M | $30.00/M |
GPT-5.5 Pro pricing not yet confirmed in accessible sources.
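As a rough illustration of how the standard-tier rates combine, the sketch below estimates the cost of a single request. It assumes cached tokens are a subset of the input tokens and are billed at the cached rate instead of the regular input rate. The `Rates` class and `estimate_cost` function are hypothetical helpers, not part of any OpenAI SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Prices in USD per one million tokens."""
    input: float
    cached_input: float
    output: float


# Standard API rates from the pricing table above.
GPT_5_5_STANDARD = Rates(input=5.00, cached_input=0.50, output=30.00)


def estimate_cost(rates: Rates, input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    cached_input_tokens is the portion of input_tokens served from cache
    (assumed to be billed at the cached rate rather than the input rate).
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached_input_tokens cannot exceed input_tokens")
    uncached = input_tokens - cached_input_tokens
    total = (uncached * rates.input
             + cached_input_tokens * rates.cached_input
             + output_tokens * rates.output)
    return total / 1_000_000


# 200K input tokens (150K of them cached) and 10K output tokens.
print(estimate_cost(GPT_5_5_STANDARD, 200_000, 10_000, cached_input_tokens=150_000))
```

With these rates output tokens dominate quickly: in the example, 10K output tokens cost more than the 50K uncached input tokens.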
Context Window
- 1M tokens via API (same as GPT-5.4)
- 400K tokens in Codex (reduced vs. GPT-5.4's full 1M in Codex)
Relationship to GPT-5.4
GPT-5.5 supersedes GPT-5.4 as OpenAI's flagship model. GPT-5.4 remains available and is the safer choice for workloads where:
- Budget is constrained (GPT-5.4 costs half as much at the standard tier)
- Independent benchmark validation is required before adoption
- Codex integrations depend on full 1M-token context (GPT-5.4 has 1M in Codex; GPT-5.5 caps at 400K)
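The three criteria above can be expressed as a small decision helper. This is a sketch under stated assumptions: `choose_model` and its parameters are hypothetical, and the `gpt-5.4` identifier is assumed by analogy with the `gpt-5.5` model ID, which is the only ID this entry confirms.

```python
# Codex context limits in tokens, taken from the Context Window section.
# The "gpt-5.4" key is an assumed identifier, used here only as a label.
CODEX_CONTEXT_LIMIT = {
    "gpt-5.5": 400_000,
    "gpt-5.4": 1_000_000,
}


def choose_model(budget_constrained: bool,
                 needs_independent_validation: bool,
                 codex_context_tokens: int = 0) -> str:
    """Pick a model label using the three criteria listed above.

    codex_context_tokens is the largest context a Codex integration needs;
    leave it at 0 when Codex is not involved.
    """
    if codex_context_tokens < 0:
        raise ValueError("codex_context_tokens must be non-negative")
    if codex_context_tokens > CODEX_CONTEXT_LIMIT["gpt-5.4"]:
        raise ValueError("context exceeds the Codex limit of both models")
    if budget_constrained or needs_independent_validation:
        return "gpt-5.4"
    if codex_context_tokens > CODEX_CONTEXT_LIMIT["gpt-5.5"]:
        return "gpt-5.4"
    return "gpt-5.5"


print(choose_model(False, False, codex_context_tokens=600_000))
print(choose_model(False, False, codex_context_tokens=100_000))
```

The helper treats each criterion as a hard rule. In practice a team might weigh budget against benchmark gains instead of short-circuiting on the first match.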
Key Characteristics
| Property | Value |
|---|---|
| License | Proprietary |
| Provider | OpenAI |
| Context window | 1,000,000 tokens (API) / 400,000 tokens (Codex) |
| Pricing | $5.00/M input, $0.50/M cached input, $30.00/M output |
| API model ID | gpt-5.5 |
| Release date | April 23, 2026 |
| Modalities | Text, image, audio, video |
GPT-5.5 Instant Key Characteristics
OpenAI released GPT-5.5 Instant on May 5, 2026, as the new default model for ChatGPT, replacing GPT-5.3 Instant. It is an inference-optimised variant of GPT-5.5 targeting lower-latency chat use cases.
| Property | Value |
|---|---|
| ChatGPT default | Yes (replaces GPT-5.3 Instant from May 5) |
| API alias | chat-latest |
| Context window | 400K tokens |
| Pricing | $5.00/M input, $30.00/M output (batch/flex: $2.50/M input, $15.00/M output) |
Key quality improvements over GPT-5.3 Instant: 52.5% fewer hallucinated claims on high-stakes prompts (medicine, law, finance), and 37.3% fewer inaccurate claims on user-flagged challenging conversations. The chat-latest alias routes to whichever Instant model currently powers ChatGPT; the previous model remains addressable for paying API customers for 90 days after each swap.
Sources: OpenAI — GPT-5.5 Instant, TechCrunch