Ollama · Pricing Plans


Ollama is free and open source for local inference (no account, no charge, no rate limit on http://localhost:11434). Ollama Cloud is a hosted-inference add-on with three tiers (Free, Pro, Max), metered by GPU utilization rather than tokens. Cloud usage limits reset on two cycles: a 5-hour session window and a 7-day weekly window. All tiers permit unlimited use of open models on the user's own hardware; tier differences apply only to ollama.com cloud-model concurrency and weekly cloud usage.


Plans

Local freemium

Run Ollama on your own hardware. Open source. No account or API key required for the local server at http://localhost:11434.

Local inference (request · usage) 0.00 USD
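To make "no account or API key required" concrete, here is a minimal sketch of a request against the local server's /api/generate endpoint, using only the Python standard library. The model name llama3.2 is just an example; any locally pulled model works.

```python
import json
import urllib.request

# Default local endpoint; no API key, account, or auth header needed.
OLLAMA_LOCAL = "http://localhost:11434"

def generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for the local /api/generate endpoint."""
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for one JSON object instead of a token stream
    }).encode()
    return urllib.request.Request(
        f"{OLLAMA_LOCAL}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Requires a running local Ollama server and a pulled model.
    req = generate_request("llama3.2", "Why is the sky blue?")
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["response"])
```

Because local inference is unmetered, nothing in this request identifies a user or a billing account.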
Cloud — Free freemium

Free hosted inference at ollama.com. Requires an ollama.com account and API key. Cloud usage is metered by GPU time, not tokens.

Cloud monthly fee (month · month) 0.00 USD
Cloud GPU usage (basic limits, 5-hour session / 7-day weekly cycle) (gpu_time · week) 0.00 USD
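By contrast, a hosted-inference request must carry the API key from your ollama.com account. The sketch below mirrors the local API's request shape; the exact hosted endpoint path and the Bearer auth scheme are assumptions here, so check your account's API documentation before relying on them.

```python
import json
import urllib.request

# Hosted endpoint; path and Bearer scheme mirror the local API's
# conventions and are assumptions, not confirmed by this listing.
OLLAMA_CLOUD = "https://ollama.com"

def cloud_chat_request(model: str, content: str, api_key: str) -> urllib.request.Request:
    """Build an authenticated POST request for a hosted chat endpoint."""
    payload = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "stream": False,
    }).encode()
    return urllib.request.Request(
        f"{OLLAMA_CLOUD}/api/chat",
        data=payload,
        headers={
            "Content-Type": "application/json",
            # Key generated in your ollama.com account settings.
            "Authorization": f"Bearer {api_key}",
        },
    )
```

Note that cloud billing is by GPU time, not tokens, so two requests with identical token counts can consume different amounts of your 5-hour and weekly allowances.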
Cloud — Pro subscription

$20/month or $200/year. Run 3 cloud models at a time; 50x more cloud usage than Free.

Monthly subscription (month · month) 20.00 USD
Annual subscription (year · year) 200.00 USD
Concurrent cloud models (concurrent_models · usage) included (up to 3)
Cloud — Max subscription

$100/month. Run 10 cloud models at a time; 5x more cloud usage than Pro.

Monthly subscription (month · month) 100.00 USD
Concurrent cloud models (concurrent_models · usage) included (up to 10)
