Gemini 2.5 Flash-Lite
Google's fastest, most cost-efficient model
2025-06-18

Gemini 2.5 Flash-Lite is the newest, fastest, and most cost-efficient model in Google's 2.5 family. It offers higher quality and lower latency than previous Lite versions while still supporting a 1M-token context window and tool use. Now in preview.
Gemini 2.5 Flash-Lite is Google's latest addition to its hybrid reasoning model family, and its fastest and most cost-efficient member. Designed for high-volume, latency-sensitive tasks such as translation and classification, it delivers lower latency than previous Lite versions along with improved quality on coding, math, and reasoning benchmarks. The model supports a 1-million-token context window, multimodal input, and tool integration, including Google Search and code execution. It is now available in preview via Google AI Studio and Vertex AI, joining the stable releases of Gemini 2.5 Flash and Pro, which already power applications for companies such as Snap and SmartBear. By balancing performance and affordability, Flash-Lite is a compelling choice for developers building scalable AI solutions.
API
Artificial Intelligence
Development