Google Unveils Gemma 4 Models, Enhancing Reasoning on Low-Power Devices
Google has released Gemma 4, a new family of open-weights AI models aimed at local and edge deployments, with an emphasis on improved reasoning on low-power devices.
Model family and sizes
Gemma 4 is offered in four variants: Effective 2B, Effective 4B, a 26B Mixture-of-Experts model, and a 31B Dense model.
- Effective 2B (E2B): Built for lightweight hardware and edge use.
- Effective 4B (E4B): Targets phones and small single-board computers.
- 26B Mixture of Experts (MoE): Activates about 3.8 billion parameters during inference.
- 31B Dense: A larger dense model that ranks third on the Arena AI Text leaderboard.
Edge-first capabilities
The Gemma 4 family supports complex reasoning while remaining small enough for single-GPU use. That makes them suitable for smartphones, workstations, and Raspberry Pi devices.
All models handle image and video inputs. The E2B and E4B models also support native audio for on-device speech understanding.
Agent and developer features
Gemma 4 includes native function calling and structured JSON outputs. Developers can build autonomous agents that interact with tools and execute multi-step plans.
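To illustrate how structured JSON outputs feed an agent loop, here is a minimal sketch in the common function-calling style. The tool name, arguments, and wire format below are hypothetical, the article does not specify Gemma 4's exact schema.

```python
import json

# Hypothetical tool registry; a real agent would map names to real APIs.
TOOLS = {
    "get_weather": lambda city: f"Sunny in {city}",
}

def dispatch(model_output: str) -> str:
    """Parse a structured JSON tool call emitted by the model and run it."""
    call = json.loads(model_output)
    fn = TOOLS[call["name"]]
    return fn(**call["arguments"])

# A model with native function calling would emit something like:
raw = '{"name": "get_weather", "arguments": {"city": "Berlin"}}'
print(dispatch(raw))  # -> Sunny in Berlin
```

Because the output is machine-parseable JSON rather than free text, the agent can chain such calls into multi-step plans without brittle string matching.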
The 26B MoE design keeps latency low by activating a fraction of its parameters at runtime. That enables faster inference without sacrificing larger-model capabilities.
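The idea behind that sparse activation can be sketched with a toy top-k router: the layer holds many experts, but each token is processed by only the few that score highest. The expert count and k value here are illustrative, not Gemma 4's actual configuration.

```python
def top_k_experts(scores, k=2):
    """Return indices of the k experts with the highest router scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

router_scores = [0.1, 2.3, -0.5, 1.7, 0.0, 0.9, -1.2, 0.4]  # 8 experts
active = top_k_experts(router_scores, k=2)
print(active)  # -> [1, 3]: two experts handle this token; the other six stay idle
```

Only the weights of the selected experts participate in the forward pass, which is why a 26B-parameter model can run with roughly 3.8 billion parameters active per token.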
Context length and data handling
Google extended context windows across the lineup. Smaller models support up to 128K tokens, while the larger two reach 256K tokens.
These longer windows let developers upload full codebases or extensive document sets in a single prompt.
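A quick pre-flight check helps decide whether a document set fits a given window. The 4-characters-per-token ratio below is a rough rule of thumb, not Gemma 4's actual tokenizer, and the limits simply mirror the figures above.

```python
# Context limits per variant, as reported for the Gemma 4 lineup.
CONTEXT_LIMITS = {"E2B": 128_000, "E4B": 128_000, "26B-MoE": 256_000, "31B": 256_000}

def fits_in_context(text: str, model: str, chars_per_token: int = 4) -> bool:
    """Estimate token count from character length and compare to the window."""
    estimated_tokens = len(text) // chars_per_token
    return estimated_tokens <= CONTEXT_LIMITS[model]

doc = "x" * 600_000  # ~150K estimated tokens
print(fits_in_context(doc, "E2B"))  # -> False: exceeds the 128K window
print(fits_in_context(doc, "31B"))  # -> True: within the 256K window
```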
Licensing and availability
All Gemma 4 models are released under the Apache 2.0 license, which imposes fewer commercial restrictions than the licenses attached to many competing models.
The models are accessible through Google Cloud. Open weights are also available on Hugging Face, Kaggle, and Ollama.
Industry response
Google DeepMind researchers Clement Farabet and Olivier Lacombe said the family delivers more intelligence per parameter. They highlighted better performance for agentic and edge scenarios.
Analyst Holger Mueller of Constellation Research said the models strengthen Google’s position in local AI. He noted the importance of low latency and digital sovereignty for enterprise use cases.
Gemma 4 aims to broaden developer options for on-device AI. Its mix of permissive licensing, small-footprint variants, and agent-ready features may accelerate edge adoption. Filmogaz.com will monitor further evaluations and real-world deployments.