Muse Spark, launched April 8, 2026 by Meta Superintelligence Labs (MSL), is Meta's first proprietary closed-weights frontier model: a ground-up rebuild (not a Llama iteration) that puts Meta 4th on the Artificial Analysis Intelligence Index v4.0 (score: 52, behind Gemini 3.1 Pro and GPT-5.4 at 57 and Claude Opus 4.6 at 53). It also reportedly posts the highest HealthBench Hard score among the models compared here (42.8% vs GPT-5.4's 40.1%), though those figures have not been confirmed from a primary source.
Why It's in Trial
Muse Spark marks Meta's entry into closed frontier models after a multi-year bet on open weights (Llama 1-4). Trial rather than Assess because:
- It is available today on meta.ai and in the Meta AI app, with rollout to Facebook, Instagram, WhatsApp, and Messenger over the coming weeks
- Meta Superintelligence Labs is a well-resourced org with direct access to Meta's data moat (social + multimodal)
- Independent benchmarks, not just vendor claims, place it firmly in the frontier tier
- Its reported HealthBench Hard performance, if confirmed, is a meaningful signal for enterprise and research use cases (medical, scientific reasoning)
Trial rather than Adopt because the model is very new (April 2026), has no track record in production agentic workflows, and the shift away from open weights is a notable strategic reversal that introduces long-term supply risk.
What's New About It
Muse Spark (internally codenamed "Avocado") is a ground-up rebuild:
- New architecture: Not derived from Llama 4; new model family, new infrastructure, new data pipelines
- Native multi-agent orchestration: Designed from the start to coordinate multiple agents reasoning in parallel and to synthesise their outputs into a single coherent response
- Multimodal-first: Built to integrate visual information across domains — STEM diagrams, entity recognition, spatial localization — as a first-class capability, not a retrofit
- Closed weights: First Meta frontier model not released as open weights. Meta says it "hopes to open-source future versions", a notable hedge from a company that built its AI reputation on open releases
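Meta has not published how Muse Spark's multi-agent orchestration works, so the following is only a generic illustration of the fan-out/fan-in pattern that the description above suggests. Every name here (`run_agent`, `synthesise`, `orchestrate`) is hypothetical, and the agents are stubs that return canned strings rather than calling any model.

```python
import asyncio


async def run_agent(name: str, question: str) -> str:
    """Hypothetical stand-in for one reasoning agent (no real model call)."""
    await asyncio.sleep(0)  # yield control so the agents interleave
    return f"{name}: draft answer to '{question}'"


def synthesise(drafts: list[str]) -> str:
    """Hypothetical synthesis step: join the drafts in a stable order."""
    return "\n".join(sorted(drafts))


async def orchestrate(question: str, agent_names: list[str]) -> str:
    # Fan out: start every agent concurrently and wait for all of them.
    drafts = await asyncio.gather(
        *(run_agent(name, question) for name in agent_names)
    )
    # Fan in: merge the individual drafts into a single response.
    return synthesise(list(drafts))


if __name__ == "__main__":
    print(asyncio.run(orchestrate("What is 2 + 2?", ["planner", "solver", "critic"])))
```

In a real system the synthesis step would itself likely be a model call that reconciles conflicting drafts; simple concatenation is used here only to keep the sketch self-contained.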
Benchmark Performance
| Benchmark | Muse Spark | GPT-5.4 | Gemini 3.1 Pro | Claude Opus 4.6 |
|---|---|---|---|---|
| Artificial Analysis Intelligence Index v4.0 | 52 | 57 | 57 | 53 |
| HealthBench Hard | 42.8% | 40.1% | 20.6% | — |
HealthBench Hard is a clinical and biomedical reasoning benchmark. Muse Spark's reported 42.8% is 2.7 percentage points above GPT-5.4 (40.1%) and more than double Gemini 3.1 Pro's 20.6%; no Claude Opus 4.6 score is listed. One possible explanation is that Meta's social-network-scale health data gives the model an edge, but this is speculation. Note: the Artificial Analysis article references HLE (Humanity's Last Exam, 39.9%) rather than HealthBench Hard; the HealthBench Hard figures above have not been confirmed from a primary source.
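To make the size of those gaps concrete, the short sketch below computes the point difference and ratio between Muse Spark and each model with a reported HealthBench Hard score. The numbers are copied from the table above and carry the same caveat (unconfirmed from a primary source); the `margins` helper is purely illustrative.

```python
# Scores copied from the table above; None marks a score that was not reported.
healthbench_hard = {
    "Muse Spark": 42.8,
    "GPT-5.4": 40.1,
    "Gemini 3.1 Pro": 20.6,
    "Claude Opus 4.6": None,
}


def margins(scores, subject):
    """Return {model: (point_gap, ratio)} for subject versus each reported model."""
    base = scores[subject]
    result = {}
    for model, score in scores.items():
        if model == subject or score is None:
            continue  # skip the subject itself and unreported scores
        result[model] = (round(base - score, 1), round(base / score, 2))
    return result


if __name__ == "__main__":
    for model, (gap, ratio) in margins(healthbench_hard, "Muse Spark").items():
        print(f"vs {model}: +{gap} points, {ratio}x")
```

On these figures the lead over GPT-5.4 is a few points, while the lead over Gemini 3.1 Pro is roughly a factor of two.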
No SWE-bench Verified score has been reported; the initial benchmark disclosure focused on the intelligence index and medical reasoning rather than coding.
The Open-Weights Reversal
Meta's Llama series (Llama 1 through Llama 4, the latest released in April 2025) established Meta as the default open-weights frontier lab. Muse Spark breaks from this entirely:
- No weights download
- No commercial open-weight license
- No API access outside the Meta AI app ecosystem (no direct API for developers at launch)
This matters for teams that built on Llama: future Meta frontier-tier models may not be open. Llama 4 remains the last confirmed open-weight Meta frontier release.
Access Today
Muse Spark is accessible through consumer surfaces, not developer APIs:
- meta.ai — web interface
- Meta AI app — mobile
- Facebook, Instagram, WhatsApp, Messenger — rolling out over weeks
No developer API has been announced. Teams that want Muse Spark in their applications must rely on Meta AI's consumer surfaces or wait for an API announcement.
Key Characteristics
| Property | Value |
|---|---|
| Provider | Meta (Meta Superintelligence Labs) |
| License | Proprietary |
| Pricing | Free through Meta AI surfaces; no API pricing announced |
| Context window | Not publicly disclosed |
| Parameters | Not publicly disclosed |
| Architecture | New (not Llama family; internal codename "Avocado") |
| Status | GA — Meta AI consumer surfaces; no developer API yet |
| Website | meta.ai |