The real reason Google gave away Gemma 4
Originally from youtube.com
Summary
Google released Gemma 4 on April 2nd under Apache 2.0: same research lineage as Gemini 3, full commercial license, no usage fees. The move is a developer-acquisition play: get builders prototyping locally on Gemma, then convert them to Google Cloud / Vertex AI when they go to production. Meta runs the same loop with Llama; the difference is that Google monetises directly via cloud rather than ads.
Key Insights
- Four sizes shipped: E2B and E4B (small, built around a layer-per-token signal trick), 26B (mixture of experts), and 31B (dense). E2B runs in under 1.5 GB of RAM, smaller than many phone apps, while handling text, images, and audio in 140 languages, fully offline (a minimal local-run sketch follows this list).
- MoE math worth memorising: the 26B model has 128 experts, and only 8 fire per token, so ~3.8B parameters are active out of the 26B held in memory (26 ÷ 3.8 ≈ 7, which is where the compute figure comes from). It scores 1441 on Arena AI vs 1452 for the 31B dense: an 11-point gap at roughly 1/7 the compute cost. This is the production case for MoE made concrete; the worked arithmetic appears after this list.
- Apache 2.0 is the actual unlock, not benchmarks. Previous Gemma releases carried a custom Google license that enterprise legal teams kept flagging. Apache 2.0 = no revenue caps, no user thresholds, no reporting back; fine-tune freely and ship commercial products, just keep the license text. This opens healthcare/fintech/government use cases where data cannot leave the building.
- Funnel logic: open source = top of funnel, cloud revenue = conversion at the bottom. The developer who builds on your model today is the customer paying your cloud bill tomorrow. Losing the developer ecosystem is how Google becomes irrelevant to the next generation; Gemma is defensive AND offensive.
- The local-AI quality gap has closed. Llama and Mistral have run locally for 2+ years; what changed is that local quality now approaches cloud quality for many tasks. The cost of building serious AI products dropped sharply: validate locally on Gemma and only pay for cloud when revenue justifies it (a backend-switch sketch follows the list).
- Enterprise/regulated industries are the silent winners. A clean OSS license + on-prem hardware + multimodal + 140 languages reframes the compliance conversation entirely.
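
A minimal sketch of the fully-offline story from the first bullet, assuming a quantised Gemma build in GGUF format and the llama-cpp-python bindings. The filename is hypothetical; point it at whichever local build you actually have:

```python
# Minimal offline-inference sketch using the llama-cpp-python bindings
# (pip install llama-cpp-python). The GGUF filename is hypothetical:
# substitute whatever quantised Gemma build you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="gemma-4-e2b.Q4_K_M.gguf",  # hypothetical local file, ~1.5 GB quantised
    n_ctx=2048,                            # modest context window to keep RAM low
)

out = llm("Explain the open-weights-to-cloud funnel in one sentence.", max_tokens=128)
print(out["choices"][0]["text"])  # runs entirely on-device: no API key, no usage fees
```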
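
The MoE arithmetic from the second bullet, worked through. Only the 26B / 3.8B / 128 / 8 figures come from the talk; the shared-vs-expert split below is my assumption, modelling per-token compute as always-on shared weights plus the 8-of-128 slice of expert weights:

```python
# Back-of-envelope MoE arithmetic using the figures quoted above.
total  = 26.0e9   # parameters resident in memory
active = 3.8e9    # parameters actually used per token (quoted figure)
num_experts, top_k = 128, 8

frac = top_k / num_experts              # 8/128 = 1/16 of experts fire per token
expert = (total - active) / (1 - frac)  # solve: active = shared + frac * expert
shared = total - expert                 #        shared + expert = total
print(f"implied expert weights ≈ {expert/1e9:.1f}B, shared ≈ {shared/1e9:.1f}B")

print(f"active/total ≈ 1/{total/active:.1f}")  # ≈ 1/6.8, i.e. the '~1/7 compute' claim
```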
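
And a sketch of the validate-locally, graduate-to-cloud pattern from the local-AI bullet. Everything named here is illustrative: the LLM_BACKEND env var, the GGUF file, and the stubbed cloud branch are mine, not a Google API.

```python
# Local-first backend switch: the app calls generate() either way, so flipping
# LLM_BACKEND from "local" to "cloud" requires no application changes.
import os

def generate(prompt: str) -> str:
    if os.environ.get("LLM_BACKEND", "local") == "local":
        from llama_cpp import Llama  # offline path: zero marginal cost per call
        llm = Llama(model_path="gemma-4-e2b.Q4_K_M.gguf")  # hypothetical file
        return llm(prompt, max_tokens=256)["choices"][0]["text"]
    # Cloud path: wire in a managed endpoint (e.g. via the Vertex AI SDK) only
    # once production traffic justifies the bill; the call site stays the same.
    raise NotImplementedError("cloud backend not configured yet")

print(generate("Draft a compliance summary for on-prem deployment."))
```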