Technology Radar
Trial

Google Gemma is a family of open-weight models spanning 270M to 27B parameters. Variants are optimized for on-device/edge deployment (Gemma 3n with LiteRT), scientific domains (MedGemma for medical reasoning), function calling, and code generation. The models are released under the Gemma License and show strong adoption signals (Gemma-3-1B-IT: 2.2M downloads).

Why It's in Trial

Gemma earns Trial through breadth, specialization, and device-centric architecture:

  • On-device focus: Gemma 3n (enhanced with LiteRT) targets sub-billion-parameter deployment on mobile/IoT with 2M-token context support
  • Ecosystem breadth: specialized variants for medical AI (MedGemma-27B for clinical reasoning), function calling (FunctionGemma), privacy (VaultGemma with differential privacy), and code
  • Scale coverage: 270M to 27B parameters covers everything from edge devices to moderate-scale servers
  • High adoption: Gemma-3-1B-IT has 2.2M downloads; Gemma-3-270M has 823K downloads and 998 likes
  • Integration ready: 90+ community spaces on Hugging Face; TFLite, JAX, and PyTorch support

Gemma sits in Trial rather than Adopt for two reasons: it is newer to the open-weight market than Llama, and its specialized variants (MedGemma, VaultGemma) are niche, while the general-purpose base models lack the frontier performance of GLM-5 or Llama 4 on coding benchmarks.
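The Hugging Face integration mentioned above rests on a standard chat-turn convention. The sketch below builds a Gemma-style prompt string by hand, using the published `<start_of_turn>`/`<end_of_turn>` markers; in practice `tokenizer.apply_chat_template` in Transformers does this for you, so treat this as illustration and verify the exact template against the model card.

```python
# Minimal sketch of Gemma's chat turn format (the markers shown are the
# published Gemma convention; normally produced by apply_chat_template
# in Hugging Face Transformers rather than by hand).

def gemma_prompt(messages):
    """Render a list of {role, content} dicts into a Gemma prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<start_of_turn>{m['role']}\n{m['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")  # cue the model to reply
    return "".join(parts)

prompt = gemma_prompt([{"role": "user", "content": "Summarize LiteRT in one line."}])
```

Passing a string formatted this way to a Gemma checkpoint keeps generations aligned with how the instruct variants were trained.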

Gemma Family Structure

Variant             Parameters             Release   Use Case
Gemma 3             270M, 1B, 2B, 8B, 27B  Aug 2025  General-purpose, base + instruct
Gemma 3n                                   2025      On-device with LiteRT, 2M context
MedGemma-27B        27B                    May 2025  Medical/clinical reasoning (specialized)
CodeGemma           7B                     Mar 2024  Code generation
FunctionGemma-270M  270M                   Oct 2025  Function calling on-device
VaultGemma-1B       1B                     Sep 2025  Privacy-preserving (differential privacy)

On-Device Deployment

Gemma 3n targets the emerging "edge AI" market with LiteRT (TensorFlow Lite) optimization:

  • Native TFLite export — direct deployment to Android/iOS without conversion overhead
  • 2M-token context on smaller models (270M, 1B) — unusually long for on-device models
  • Sub-1B variants for IoT — resource-constrained edge scenarios (thermostats, wearables, sensors)
  • Quantization support — further size reduction for storage-constrained environments
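To make the quantization point concrete, here is a back-of-envelope sketch of weight-file footprints for the sub-1B variants. Parameter counts come from the table above; the bytes-per-weight figures are the usual fp16/int8/int4 conventions, not measured file sizes, so real exports will differ somewhat.

```python
# Rough weight-storage estimate at common quantization levels.
# Ignores KV cache, activations, and runtime overhead.

BYTES_PER_WEIGHT = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_footprint_mb(n_params: int, quant: str) -> float:
    """Approximate on-disk weight size in MB for a given quantization."""
    return n_params * BYTES_PER_WEIGHT[quant] / 1e6

for n_params, name in [(270_000_000, "Gemma-3-270M"), (1_000_000_000, "Gemma-3-1B")]:
    for quant in ("fp16", "int8", "int4"):
        print(f"{name} @ {quant}: ~{weight_footprint_mb(n_params, quant):.0f} MB")
```

At int4, the 270M variant lands around 135 MB of weights, which is what makes wearable- and sensor-class deployment plausible.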

Benchmark Context

Gemma models trail frontier proprietary and leading open-weight models on coding benchmarks, but remain competitive within the <10B parameter tier:

Benchmark           Gemma-3-8B   Qwen 2.5-7B  Llama-3.1-8B
HumanEval           ~70% (est.)  ~85%         ~82%
LiveCodeBench       <20% (est.)  ~20%         ~15%
SWE-bench Verified  <10% (est.)  ~10%         <5%

Note: Official coding benchmarks for the general-purpose Gemma 3 models are limited; the family's emphasis is on-device efficiency and specialized domains.

Licensing & Commercial Use

  • Gemma License — custom Google license; requires acceptance
  • Exceptions: VaultGemma uses Apache 2.0 (differential privacy research)
  • Fully self-hostable; weights on Hugging Face
  • Commercial use explicitly permitted

Key Characteristics

Property           Value
Parameter range    270M to 27B
Latest generation  Gemma 3 (Aug 2025)
Primary focus      On-device, edge, specialized domains
License            Gemma License (custom)
Provider           Google
Weights            Hugging Face: google
Runtime support    TFLite, JAX, PyTorch

When to Choose Gemma

  • Mobile/IoT applications: Gemma 3n with LiteRT for sub-billion deployments
  • Medical AI: MedGemma for clinical decision support or medical research
  • Privacy-first: VaultGemma for differential privacy requirements
  • Function calling: FunctionGemma-270M for structured output on edge
  • Cost-sensitive inference: 270M/1B variants for extreme scale scenarios
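The decision points above can be encoded as a small lookup, useful in tooling that routes workloads to a model tier. The requirement keys are hypothetical labels invented for this sketch; the variant names come from the family table earlier in this entry.

```python
# Hypothetical variant selector: the "When to Choose Gemma" bullets as data.
# Requirement keys are illustrative labels, not an official taxonomy.

VARIANT_FOR = {
    "mobile_iot": "Gemma 3n (LiteRT)",
    "medical": "MedGemma-27B",
    "privacy": "VaultGemma-1B",
    "function_calling": "FunctionGemma-270M",
    "cost_sensitive": "Gemma-3-270M / Gemma-3-1B",
}

def choose_gemma(requirement: str) -> str:
    """Return the variant suggested for a requirement, else the general-purpose default."""
    return VARIANT_FOR.get(requirement, "Gemma 3 (general-purpose)")
```

Anything outside the specialized cases falls through to the general-purpose Gemma 3 line, mirroring how the list above is organized.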

Further Reading