Gemma 3n

Run powerful multimodal AI right on your phone

2025-06-27

Gemma 3n is Google's new open model, optimized for on-device multimodal AI. Its MatFormer architecture enables efficient models (the effective 2B and 4B variants, E2B and E4B) that can run locally on phones and laptops. It supports image, audio, and video input.
Gemma 3n is Google's latest open AI model, optimized for on-device multimodal tasks on phones and laptops. Built on the MatFormer architecture, it processes image, audio, and video input locally and comes in two sizes with effective 2B and 4B parameter footprints. Per-Layer Embeddings (PLE) reduce the memory the model needs on the accelerator, and KV Cache Sharing speeds up processing of long inputs such as streams. Its audio encoder supports multilingual speech tasks, and its MobileNet-V5 vision encoder is designed for efficient performance on edge devices. Gemma 3n works with popular tools such as Hugging Face and Ollama. Google also launched the Gemma 3n Impact Challenge, inviting creators to build impactful applications, with $150,000 in prizes. The release makes capable on-device AI more accessible.
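To illustrate the nesting idea behind MatFormer, here is a minimal sketch in plain Python. It assumes the general Matryoshka-style design, in which a smaller sub-model reuses a prefix of the larger model's feed-forward hidden units. The class `NestedFFN`, its `width` parameter, and the toy sizes are hypothetical names chosen for illustration; the weights are random, and this is not Gemma 3n's actual implementation.

```python
import random


def relu(value):
    """Rectified linear activation for a single float."""
    return value if value > 0.0 else 0.0


class NestedFFN:
    """Toy feed-forward block whose smaller variants reuse a prefix of the hidden units."""

    def __init__(self, d_model, d_hidden, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.d_hidden = d_hidden
        # w_in[j] projects the input onto hidden unit j.
        self.w_in = [[rng.uniform(-1, 1) for _ in range(d_model)]
                     for _ in range(d_hidden)]
        # w_out[j] projects hidden unit j back to the model dimension.
        self.w_out = [[rng.uniform(-1, 1) for _ in range(d_model)]
                      for _ in range(d_hidden)]

    def forward(self, x, width=None):
        """Run the block using only the first `width` hidden units."""
        width = self.d_hidden if width is None else width
        if not 0 < width <= self.d_hidden:
            raise ValueError("width must be in 1..d_hidden")
        out = [0.0] * self.d_model
        for j in range(width):
            hidden = relu(sum(w * xi for w, xi in zip(self.w_in[j], x)))
            for i in range(self.d_model):
                out[i] += hidden * self.w_out[j][i]
        return out

    def param_count(self, width=None):
        """Number of weights touched when running at the given width."""
        width = self.d_hidden if width is None else width
        return 2 * width * self.d_model


ffn = NestedFFN(d_model=4, d_hidden=8)
x = [1.0, -0.5, 0.25, 2.0]
full_out = ffn.forward(x)            # uses all 8 hidden units
small_out = ffn.forward(x, width=4)  # uses only the first 4 hidden units
print(ffn.param_count(), ffn.param_count(width=4))
```

The smaller variant needs no separate set of weights: it reads a slice of the larger block's parameters, so one stored model can serve more than one size.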
Open Source, Artificial Intelligence, Development