Ollama Plans & Pricing
Ollama is free and open-source for local inference (no auth, no charge, no rate limit on http://localhost:11434). Ollama Cloud is a hosted-inference add-on with three tiers (Free, Pro, Max), metered by GPU utilization rather than tokens. Cloud usage resets on two cycles: a 5-hour session window and a 7-day weekly window. All tiers permit unlimited use of open models on the user's own hardware; tier differences apply only to ollama.com cloud-model concurrency and weekly cloud usage.
Plans
Local
Run Ollama on your own hardware. Open source. No account or API key required for the local server at http://localhost:11434.
- Unlimited local inference
- Open-source models
- No authentication required for localhost
- Ollama API
- Ollama OpenAI Compatibility API
- Ollama Anthropic Compatibility API
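The local APIs above are plain HTTP on port 11434 and need no API key. A minimal sketch of building requests for the native generate endpoint and the OpenAI-compatible chat endpoint (the model name `llama3.2` is an assumption — substitute any model you have pulled):

```python
import json
import urllib.request

OLLAMA_LOCAL = "http://localhost:11434"  # local server; no authentication required


def native_generate_request(model: str, prompt: str):
    """Request spec for Ollama's native /api/generate endpoint."""
    url = f"{OLLAMA_LOCAL}/api/generate"
    payload = {"model": model, "prompt": prompt, "stream": False}
    return url, payload


def openai_chat_request(model: str, content: str):
    """Request spec for the OpenAI-compatible /v1/chat/completions endpoint."""
    url = f"{OLLAMA_LOCAL}/v1/chat/completions"
    payload = {"model": model, "messages": [{"role": "user", "content": content}]}
    return url, payload


def post_json(url: str, payload: dict) -> dict:
    """POST a JSON payload and decode the JSON reply (needs `ollama serve` running)."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


# Example, with the server running and the model pulled:
# url, payload = native_generate_request("llama3.2", "Why is the sky blue?")
# print(post_json(url, payload)["response"])
```

The same payload shapes work from any HTTP client; pointing an existing OpenAI SDK at `http://localhost:11434/v1` is the usual way to use the compatibility API.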
Free
Free hosted inference at ollama.com. Requires an ollama.com account and API key. Cloud usage is metered by GPU time, not tokens.
- Run cloud models from CLI/API
- Unlimited public models
- Run models on your own hardware
- Ollama Cloud API
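Calling a cloud model looks like calling the local API, except the request goes to ollama.com and carries the account's API key. A sketch under stated assumptions — the `https://ollama.com/api/chat` endpoint path, the `Authorization: Bearer` header scheme, the `OLLAMA_API_KEY` environment variable, and the model name are all illustrative; check your account's documentation for the exact values:

```python
import json
import os
import urllib.request

# Hosted endpoint; path and auth scheme are assumptions, not confirmed by this page.
OLLAMA_CLOUD = "https://ollama.com"


def cloud_chat_request(model: str, content: str, api_key: str):
    """Request spec for hosted chat: same JSON body shape as the local API,
    plus a bearer token (local calls on :11434 need no key)."""
    url = f"{OLLAMA_CLOUD}/api/chat"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": content}],
        "stream": False,
    }
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # key from your ollama.com account
    }
    return url, payload, headers


# Usage (hypothetical model name; key read from the environment):
# url, payload, headers = cloud_chat_request(
#     "gpt-oss:120b", "Hello", os.environ["OLLAMA_API_KEY"])
# req = urllib.request.Request(url, json.dumps(payload).encode(), headers)
# print(json.loads(urllib.request.urlopen(req).read()))
```

Because the body shape matches the local API, switching a script between local and cloud inference is mostly a matter of changing the base URL and adding the key.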
Pro
$20/month or $200/year. Run up to 3 cloud models at a time, with 50x the weekly cloud usage of the Free tier.
- 3 concurrent cloud models
- 50x cloud usage of Free
- Ollama Cloud API
Max
$100/month. Run up to 10 cloud models at a time, with 5x the weekly cloud usage of Pro.
- 10 concurrent cloud models
- 5x cloud usage of Pro
- Ollama Cloud API