Prime Intellect Plans Pricing
Reconciled plans and pricing for Prime Intellect — GPU compute marketplace (on-demand hourly), hosted RL training (per-million-token input/output/training), inference (per-million-token input/output), sandboxes, and persistent disks. Source: primeintellect.ai and docs.primeintellect.ai.
Prime Intellect Plans Pricing is the machine-readable pricing-plan profile for Prime Intellect on the APIs.io network, conforming to the API Commons Plans specification.
It defines 6 plans, covering usage-based and quote-based tiers, with named plans including Compute Marketplace — On-Demand GPUs, Compute Marketplace — Liquid Reserved Clusters, Hosted Training — Qwen3.5 0.8B, Hosted Training — Qwen3.5 397B-A17B, Hosted Training — Additional Models, and 1 more.
Tagged areas include AI, GPU Compute, Reinforcement Learning, Inference, and Sandboxes.
Plans
Hourly on-demand GPU instances across 50+ providers. Single-node and multi-node (1-256 GPUs) with Infiniband, SLURM, and Kubernetes orchestration.
Custom-quoted reserved GPU clusters built by gathering competitive quotes from 50+ providers. Pricing is per-engagement; talk to sales.
Hosted RL post-training for the smallest Qwen3.5 model. Billed per million tokens across input, output, and training.
Hosted RL post-training for the largest available Qwen3.5 MoE. Billed per million tokens across input, output, and training.
Other supported base models include Qwen3.5 sizes 2B, 4B, 9B, 35B-A3B, 122B-A10B; Llama 1B/3B Instruct; NVIDIA Nemotron 30B/120B; and OpenAI gpt-oss 20B/120B. All billed per million tokens (input, output, training).
OpenAI-compatible inference at api.pinference.ai. Per-million-token pricing with usage and cost returned in every response.