Google Gemma is a family of open-weight models spanning 270M to 27B parameters. Variants are optimized for on-device/edge deployment (Gemma 3n with LiteRT), medical reasoning (MedGemma), function calling (FunctionGemma), and code generation (CodeGemma), all released under the Gemma License. Adoption signals are strong: Gemma-3-1B-IT alone has 2.2M downloads.
Why It's in Trial
Gemma earns Trial through breadth, specialization, and device-centric architecture:
- On-device focus: Gemma 3n (enhanced with LiteRT) targets sub-billion parameter deployment on mobile/IoT with 2M context window support
- Ecosystem breadth: Specialized variants for medical AI (MedGemma-27B for clinical reasoning), function calling (FunctionGemma), privacy (VaultGemma with differential privacy), and code
- Scale coverage: 270M to 27B parameters -- enables everything from edge devices to moderate-scale servers
- High adoption: Gemma-3-1B-IT has 2.2M downloads; Gemma-3-270M has 823K downloads with 998 likes
- Integration ready: 90+ community spaces on Hugging Face; TFLite, JAX, and PyTorch support
Positioned in Trial rather than Adopt for three reasons: Gemma is newer to the open-weight market than Llama; the specialized variants (MedGemma, VaultGemma) serve niche use cases; and the general-purpose base models lack the frontier coding-benchmark performance of GLM-5 or Llama 4.
Gemma Family Structure
| Variant | Parameters | Release | Use Case |
|---|---|---|---|
| Gemma 3 | 270M, 1B, 2B, 8B, 27B | Aug 2025 | General-purpose, base + instruct |
| Gemma 3n | — | 2025 | On-device with LiteRT, 2M context |
| MedGemma-27B | 27B | May 2025 | Medical/clinical reasoning (specialized) |
| CodeGemma | 7B | Mar 2024 | Code generation |
| FunctionGemma-270M | 270M | Oct 2025 | Function calling on-device |
| VaultGemma-1B | 1B | Sep 2025 | Privacy-preserving (differential privacy) |
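FunctionGemma's on-device function calling can be sketched as parsing a structured tool call and dispatching it to a local handler. The JSON `{"name": ..., "arguments": ...}` shape below is an assumption for illustration; FunctionGemma's actual output format is not specified in this note, and `set_thermostat` is a hypothetical tool.

```python
import json

# Hypothetical tool-call payload -- FunctionGemma's real output format is
# not documented here; a JSON {"name": ..., "arguments": ...} convention
# is assumed purely for illustration.
raw_output = '{"name": "set_thermostat", "arguments": {"room": "living_room", "celsius": 21}}'

def parse_tool_call(text: str) -> tuple[str, dict]:
    """Parse a JSON tool call into (function name, keyword arguments)."""
    call = json.loads(text)
    return call["name"], call["arguments"]

def set_thermostat(room: str, celsius: int) -> str:
    # Stand-in for a real device action on an edge deployment.
    return f"{room} set to {celsius}C"

# Dispatch table mapping tool names to local handlers.
TOOLS = {"set_thermostat": set_thermostat}

name, args = parse_tool_call(raw_output)
result = TOOLS[name](**args)
print(result)  # living_room set to 21C
```

The dispatch-table pattern keeps the model's structured output decoupled from the device code that executes it.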
On-Device Deployment
Gemma 3n targets the emerging "edge AI" market with LiteRT (TensorFlow Lite) optimization:
- Native TFLite export -- direct deployment to Android/iOS without conversion overhead
- 2M token context on smaller parameters (270M, 1B) -- unusually long for on-device models
- Sub-1B variants for IoT -- resource-constrained edge scenarios (thermostats, wearables, sensors)
- Quantization support -- further size reduction for storage-constrained environments
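The storage impact of quantization is easy to estimate from parameter count alone. The sketch below counts only model weights at common precisions; activations, KV cache, and runtime overhead are excluded, so real footprints will be larger.

```python
# Rough weight-storage estimates for Gemma parameter counts at common
# precisions. Only the weights are counted -- activations, KV cache, and
# runtime overhead are excluded.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_mb(params: int, precision: str) -> float:
    """Approximate size of the weights in megabytes at a given precision."""
    return params * BYTES_PER_PARAM[precision] / 1e6

for params, label in [(270_000_000, "Gemma-270M"), (1_000_000_000, "Gemma-1B")]:
    for prec in ("fp16", "int8", "int4"):
        print(f"{label} @ {prec}: ~{weight_mb(params, prec):,.0f} MB")
```

At int4, the 270M variant's weights fit in roughly 135 MB, which is what makes the storage-constrained IoT scenarios above plausible.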
Benchmark Context
Gemma models trail both frontier proprietary models and leading open-weight peers on coding benchmarks, including within the <10B parameter tier:
| Benchmark | Gemma-3-8B | Qwen 2.5-7B | Llama-3.1-8B |
|---|---|---|---|
| HumanEval | ~70% (est.) | ~85% | ~82% |
| LiveCodeBench | <20% (est.) | ~20% | ~15% |
| SWE-bench Verified | <10% (est.) | ~10% | <5% |
Note: Official benchmarks for Gemma 3 general-purpose models on coding tasks are limited; focus is on on-device efficiency and specialized domains.
Licensing & Commercial Use
- Gemma License — custom Google license; requires acceptance
- Exceptions: VaultGemma uses Apache 2.0 (differential privacy research)
- Fully self-hostable; weights on Hugging Face
- Commercial use explicitly permitted
Key Characteristics
| Property | Value |
|---|---|
| Parameter range | 270M to 27B |
| Latest generation | Gemma 3 (Aug 2025) |
| Primary focus | On-device, edge, specialized domains |
| License | Gemma License (custom) |
| Provider | Google |
| Weights | Hugging Face: google |
| Frameworks | TFLite (on-device), JAX, PyTorch |
When to Choose Gemma
- Mobile/IoT applications: Gemma 3n with LiteRT for sub-billion deployments
- Medical AI: MedGemma for clinical decision support or medical research
- Privacy-first: VaultGemma for differential privacy requirements
- Function calling: FunctionGemma-270M for structured output on edge
- Cost-sensitive inference: 270M/1B variants for extreme scale scenarios
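The decision list above can be sketched as a simple lookup. The mapping is illustrative only (not an official selection guide); the variant names come from the family table earlier in this note.

```python
# Illustrative requirement -> variant lookup mirroring the list above.
# A sketch, not an official Gemma selection guide.
VARIANT_FOR = {
    "mobile_iot": "Gemma 3n (LiteRT)",
    "medical": "MedGemma-27B",
    "privacy": "VaultGemma-1B",
    "function_calling": "FunctionGemma-270M",
    "cost_sensitive": "Gemma 3 270M/1B",
}

def pick_variant(requirement: str) -> str:
    """Return the suggested Gemma variant, or a general-purpose default."""
    return VARIANT_FOR.get(requirement, "Gemma 3 (general-purpose)")

print(pick_variant("privacy"))        # VaultGemma-1B
print(pick_variant("summarization"))  # Gemma 3 (general-purpose)
```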