Google Gemma is a family of open-weight models spanning 270M to 27B parameters. Variants are optimized for on-device/edge deployment (Gemma 3n with LiteRT), medical reasoning (MedGemma), function calling (FunctionGemma), and code generation (CodeGemma), all released under the Gemma License. Adoption signals are strong: Gemma-3-1B-IT alone has 2.2M downloads.
Why It's in Trial
Gemma earns Trial through breadth, specialization, and device-centric architecture:
- On-device focus: Gemma 3n (enhanced with LiteRT) targets sub-billion parameter deployment on mobile/IoT with 2M context window support
- Ecosystem breadth: Specialized variants for medical AI (MedGemma-27B for clinical reasoning), function calling (FunctionGemma), privacy (VaultGemma with differential privacy), and code
- Scale coverage: 270M to 27B parameters -- enables everything from edge devices to moderate-scale servers
- High adoption: Gemma-3-1B-IT has 2.2M downloads; Gemma-3-270M has 823K downloads with 998 likes
- Integration ready: 90+ community spaces on Hugging Face; TFLite, JAX, and PyTorch support
Positioned in Trial rather than Adopt for three reasons: Gemma is newer to the open-weight market than Llama; the specialized variants (MedGemma, VaultGemma) serve niche use cases; and the general-purpose base models lack the frontier coding-benchmark performance of GLM-5 or Llama 4.
Gemma Family Structure
| Variant | Parameters | Release | Use Case |
|---|---|---|---|
| Gemma 3 | 270M, 1B, 2B, 8B, 27B | Aug 2025 | General-purpose, base + instruct |
| Gemma 3n | — | 2025 | On-device with LiteRT, 2M context |
| MedGemma-27B | 27B | May 2025 | Medical/clinical reasoning (specialized) |
| CodeGemma | 7B | Mar 2024 | Code generation |
| FunctionGemma-270M | 270M | Oct 2025 | Function calling on-device |
| VaultGemma-1B | 1B | Sep 2025 | Privacy-preserving (differential privacy) |
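FunctionGemma's on-device function calling can be sketched as parsing a structured tool call and dispatching it to a local handler. The JSON `{"name": ..., "arguments": ...}` shape below is an assumption for illustration; FunctionGemma's actual output format is not specified in this note, and `set_thermostat` is a hypothetical tool.

```python
import json

# Hypothetical tool-call payload -- FunctionGemma's real output format is
# not documented here; a JSON {"name": ..., "arguments": ...} convention
# is assumed purely for illustration.
raw_output = '{"name": "set_thermostat", "arguments": {"room": "living_room", "celsius": 21}}'

def parse_tool_call(text: str) -> tuple[str, dict]:
    """Parse a JSON tool call into (function name, keyword arguments)."""
    call = json.loads(text)
    return call["name"], call["arguments"]

def set_thermostat(room: str, celsius: int) -> str:
    # Stand-in for a real device action on an edge deployment.
    return f"{room} set to {celsius}C"

# Dispatch table mapping tool names to local handlers.
TOOLS = {"set_thermostat": set_thermostat}

name, args = parse_tool_call(raw_output)
result = TOOLS[name](**args)
print(result)  # living_room set to 21C
```

The dispatch-table pattern keeps the model's structured output decoupled from the device code that executes it.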
On-Device Deployment
Gemma 3n targets the emerging "edge AI" market with LiteRT (TensorFlow Lite) optimization:
- Native TFLite export -- direct deployment to Android/iOS without conversion overhead
- 2M token context on smaller parameters (270M, 1B) -- unusually long for on-device models
- Sub-1B variants for IoT -- resource-constrained edge scenarios (thermostats, wearables, sensors)
- Quantization support -- further size reduction for storage-constrained environments
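The storage impact of quantization is easy to estimate from parameter count alone. The sketch below counts only model weights at common precisions; activations, KV cache, and runtime overhead are excluded, so real footprints will be larger.

```python
# Rough weight-storage estimates for Gemma parameter counts at common
# precisions. Only the weights are counted -- activations, KV cache, and
# runtime overhead are excluded.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_mb(params: int, precision: str) -> float:
    """Approximate size of the weights in megabytes at a given precision."""
    return params * BYTES_PER_PARAM[precision] / 1e6

for params, label in [(270_000_000, "Gemma-270M"), (1_000_000_000, "Gemma-1B")]:
    for prec in ("fp16", "int8", "int4"):
        print(f"{label} @ {prec}: ~{weight_mb(params, prec):,.0f} MB")
```

At int4, the 270M variant's weights fit in roughly 135 MB, which is what makes the storage-constrained IoT scenarios above plausible.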
Benchmark Context
Gemma models trail both frontier proprietary models and leading open-weight peers on coding benchmarks, including within the <10B parameter tier:
| Benchmark | Gemma-3-8B | Qwen 2.5-7B | Llama-3.1-8B |
|---|---|---|---|
| HumanEval | ~70% (est.) | ~85% | ~82% |
| LiveCodeBench | <20% (est.) | ~20% | ~15% |
| SWE-bench Verified | <10% (est.) | ~10% | <5% |
Note: Official benchmarks for Gemma 3 general-purpose models on coding tasks are limited; focus is on on-device efficiency and specialized domains.
Licensing & Commercial Use
- Gemma License — custom Google license; requires acceptance
- Exceptions: VaultGemma uses Apache 2.0 (differential privacy research)
- Fully self-hostable; weights on Hugging Face
- Commercial use explicitly permitted
Key Characteristics
| Property | Value |
|---|---|
| Parameter range | 270M to 27B |
| Latest generation | Gemma 3 (Aug 2025) |
| Primary focus | On-device, edge, specialized domains |
| License | Gemma License (custom) |
| Provider | Google |
| Weights | Hugging Face: google |
| Frameworks | TFLite (on-device), JAX, PyTorch |
When to Choose Gemma
- Mobile/IoT applications: Gemma 3n with LiteRT for sub-billion deployments
- Medical AI: MedGemma for clinical decision support or medical research
- Privacy-first: VaultGemma for differential privacy requirements
- Function calling: FunctionGemma-270M for structured output on edge
- Cost-sensitive inference: 270M/1B variants for extreme scale scenarios
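The decision list above can be sketched as a simple lookup. The mapping is illustrative only (not an official selection guide); the variant names come from the family table earlier in this note.

```python
# Illustrative requirement -> variant lookup mirroring the list above.
# A sketch, not an official Gemma selection guide.
VARIANT_FOR = {
    "mobile_iot": "Gemma 3n (LiteRT)",
    "medical": "MedGemma-27B",
    "privacy": "VaultGemma-1B",
    "function_calling": "FunctionGemma-270M",
    "cost_sensitive": "Gemma 3 270M/1B",
}

def pick_variant(requirement: str) -> str:
    """Return the suggested Gemma variant, or a general-purpose default."""
    return VARIANT_FOR.get(requirement, "Gemma 3 (general-purpose)")

print(pick_variant("privacy"))        # VaultGemma-1B
print(pick_variant("summarization"))  # Gemma 3 (general-purpose)
```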