Google Gemini Plans and Pricing
The Gemini API is offered in three service tiers: Free (limited models, no charge), Paid (higher rate limits, context caching, Batch API at 50% off), and Enterprise (custom security, provisioned throughput, volume discounts). Per-model pricing is quoted per million tokens, with separate input and output rates and banded pricing for very long prompts. Audio input is priced higher than text, image, or video input.
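As a rough illustration of per-million-token pricing with separate input and output rates, the sketch below computes the cost of a single request. The function name `request_cost` and the rates used in the example are placeholders chosen for the arithmetic, not actual list prices.

```python
from decimal import Decimal

TOKENS_PER_MILLION = Decimal(1_000_000)


def request_cost(input_tokens: int, output_tokens: int,
                 input_rate: Decimal, output_rate: Decimal) -> Decimal:
    """Return the cost in USD of one request.

    input_rate and output_rate are USD per million tokens.
    """
    total = Decimal(input_tokens) * input_rate + Decimal(output_tokens) * output_rate
    return total / TOKENS_PER_MILLION


# Placeholder rates (assumed for illustration only, not real prices).
example = request_cost(12_000, 800, Decimal("0.50"), Decimal("2.00"))
print(example)
```

Decimal arithmetic is used here so that small per-request amounts do not pick up floating-point rounding noise when summed over many requests.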
Plans
Free access to a subset of Gemini models (e.g., Gemini 3.1 Flash-Lite Preview, 2.5 Flash), with input and output tokens at no charge. Subject to lower per-minute rate limits than the Paid tier.
- Gemini 3.1 Flash-Lite Preview (free)
- Gemini 2.5 Flash (limited)
- Lower rate limits
- Best-effort availability
Per-token pricing for Gemini 3.1 Pro Preview with banded pricing for prompts above 200K tokens.
- Gemini 3.1 Pro (Preview)
- Long-context banded pricing
- Context caching
- Batch API (50% off)
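The long-context banding and the Batch API discount can be sketched as follows. Only the 200K-token threshold and the 50% discount come from the text above. The rates are hypothetical, and the sketch assumes the whole request is billed at the band selected by prompt size rather than only the tokens above the threshold.

```python
from decimal import Decimal

LONG_PROMPT_THRESHOLD = 200_000   # from the text: banding applies above 200K prompt tokens
BATCH_DISCOUNT = Decimal("0.5")   # from the text: Batch API at 50% off

# Hypothetical USD rates per million tokens (not real list prices).
STANDARD_BAND = {"input": Decimal("1.00"), "output": Decimal("5.00")}
LONG_BAND = {"input": Decimal("2.00"), "output": Decimal("8.00")}


def banded_cost(prompt_tokens: int, output_tokens: int, batch: bool = False) -> Decimal:
    """Cost in USD, assuming the band is chosen by prompt size for the whole request."""
    band = LONG_BAND if prompt_tokens > LONG_PROMPT_THRESHOLD else STANDARD_BAND
    cost = (prompt_tokens * band["input"] + output_tokens * band["output"]) / 1_000_000
    return cost * BATCH_DISCOUNT if batch else cost


print(banded_cost(100_000, 1_000))              # standard band
print(banded_cost(300_000, 1_000))              # long-context band
print(banded_cost(300_000, 1_000, batch=True))  # long-context band with batch discount
```

Under these assumptions, a prompt of exactly 200,000 tokens still falls in the standard band, because the higher rates apply only above the threshold.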
Per-token pricing for Gemini 3.1 Flash-Lite Preview. Audio input is priced higher than text, image, or video input.
- Gemini 3.1 Flash-Lite (Preview)
- Multimodal input (text/image/video/audio)
- Batch API (50% off)
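To show how a higher audio rate affects a mixed-modality request, here is a minimal sketch that sums input cost per modality. The rate table `INPUT_RATES` is hypothetical. The only property taken from the text is that audio costs more per token than text, image, or video.

```python
from decimal import Decimal

# Hypothetical input rates in USD per million tokens (not real list prices).
# Audio is set higher than the other modalities, as the text describes.
INPUT_RATES = {
    "text": Decimal("0.10"),
    "image": Decimal("0.10"),
    "video": Decimal("0.10"),
    "audio": Decimal("0.30"),
}


def input_cost(tokens_by_modality: dict[str, int]) -> Decimal:
    """Sum the input cost in USD across modalities for one request."""
    total = Decimal(0)
    for modality, tokens in tokens_by_modality.items():
        total += tokens * INPUT_RATES[modality]
    return total / 1_000_000


# 10K text tokens plus 10K audio tokens.
print(input_cost({"text": 10_000, "audio": 10_000}))
```

With these placeholder rates, the audio portion accounts for three quarters of the input cost even though it is only half of the tokens.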
Per-token pricing for Gemini 2.5 Flash. Audio input is priced higher than text, image, or video input.
- Gemini 2.5 Flash
- Multimodal input
- Batch API (50% off)
Per-token pricing for Gemini 2.5 Flash-Lite. Lowest list price for cost-sensitive workloads.
- Gemini 2.5 Flash-Lite
- Lowest cost tier
- Batch API (50% off)
Per-search pricing for Google Search grounding requests. The first 5,000 prompts per month are free, with the allowance shared across Gemini 3 models.
- Google Search grounding for Gemini 3
- Free 5K prompts/month shared across Gemini 3
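The shared free allowance can be sketched as a monthly calculation over all Gemini 3 models combined. The 5,000-prompt allowance comes from the text. The per-1,000 rate, the model keys, and the simplification of one billable unit per grounded prompt beyond the allowance are assumptions made for illustration.

```python
from decimal import Decimal

FREE_GROUNDED_PROMPTS = 5_000  # from the text: free prompts per month, shared across Gemini 3 models


def grounding_cost(prompts_by_model: dict[str, int], rate_per_1000: Decimal) -> Decimal:
    """Monthly grounding cost in USD.

    Usage is pooled across models before the free allowance is subtracted.
    Assumes one billable unit per grounded prompt beyond the allowance.
    """
    total_prompts = sum(prompts_by_model.values())
    billable = max(0, total_prompts - FREE_GROUNDED_PROMPTS)
    return billable * rate_per_1000 / 1000


# Hypothetical usage and rate: 6,500 grounded prompts in total, 1,500 of them billable.
usage = {"model-a": 4_000, "model-b": 2_500}
print(grounding_cost(usage, Decimal("10")))
```

Because the allowance is shared, neither model in this example exceeds 5,000 prompts on its own, yet the combined usage is still billable.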
Enterprise SKU, sold through Google Cloud / Vertex AI, with provisioned throughput, custom security, VPC-SC, customer-managed encryption keys (CMEK), and volume discount commitments. Pricing is custom.
- Provisioned throughput
- VPC-SC + CMEK + data residency controls
- Volume / commit discounts
- Vertex AI integration