Parasail Plans Pricing

Parasail offers four commercial surfaces (Serverless, Dedicated Serverless, Dedicated, and Batch), billed per token or per GPU-hour with no long-term contracts. Tiers (Free, User, Dedicated Serverless, Dedicated Serverless Pro, Enterprise) gate requests-per-minute capacity. New accounts receive free credits.

Parasail Plans Pricing is the machine-readable pricing-plan profile for Parasail on the APIs.io network, conforming to the API Commons Plans specification.

It defines 7 plans: Free, User, Dedicated Serverless, Dedicated Serverless Pro, Dedicated, Batch, and Enterprise.
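As a rough illustration of what a machine-readable plan profile can look like, the sketch below models the seven plan entries listed under Plans as plain Python records. The `Plan` class, its field names, and the `plans_by_unit` helper are hypothetical and chosen for this example; they are not the API Commons Plans schema. Price labels are kept verbatim as strings because most of them are not numeric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """One plan entry. Field names are illustrative, not the Plans spec schema."""
    name: str      # plan name as listed in the profile
    metric: str    # what is metered
    unit: str      # unit of the metric
    interval: str  # billing interval
    price: str     # price label, kept verbatim (most are not numeric)
    currency: str = "USD"


# Values copied from the plan entries in this profile.
PLANS = [
    Plan("Free", "User", "user", "month", "0"),
    Plan("User", "User", "user", "month", "PayPerToken"),
    Plan("Dedicated Serverless", "Deployment", "deployment", "month", "Reserved"),
    Plan("Dedicated Serverless Pro", "Deployment", "deployment", "month", "Reserved"),
    Plan("Dedicated", "GPU", "gpu-hour", "hour", "GPUHour"),
    Plan("Batch", "Tokens (Input + Output)", "token", "usage", "50PctOffServerless"),
    Plan("Enterprise", "Contract", "contract", "year", "Call"),
]


def plans_by_unit(unit: str) -> list[str]:
    """Return the names of all plans metered in the given unit."""
    return [p.name for p in PLANS if p.unit == unit]
```

For example, `plans_by_unit("deployment")` selects the two plans that are metered per deployment.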

Tagged areas include AI, Artificial Intelligence, GPU, Inference, and Large Language Models.

Tags: AI, Artificial Intelligence, GPU, Inference, Large Language Models, Open Source Models, Hugging Face, Batch, Embeddings, Tokenmaxxing, Supercloud

Plans

Free

Free tier with starter credits for evaluating Parasail's serverless inference.

User (user · month) 0 USD

User

Standard pay-per-token serverless tier for individual developers and small teams.

User (user · month) PayPerToken USD

Dedicated Serverless

Guaranteed throughput against a chosen model on isolated capacity, still billed per token but with reserved GPUs behind the endpoint.

Deployment (deployment · month) Reserved USD

Dedicated Serverless Pro

Higher-throughput dedicated serverless tier for production workloads.

Deployment (deployment · month) Reserved USD

Dedicated

Fully reserved GPU deployments billed on GPU-hours. Bring any Hugging Face or custom model and choose the device SKU and replica count.

GPU (gpu-hour · hour) GPUHour USD
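To make GPU-hour billing concrete, here is a minimal estimate, assuming that the total GPU-hours equal GPUs per replica times replica count times hours of uptime. The function name, its parameters, and the rate used in the usage note are hypothetical; the profile does not publish a per-SKU rate or state how replicas are counted toward the bill.

```python
def dedicated_cost_usd(
    gpu_hour_rate: float,
    gpus_per_replica: int,
    replicas: int,
    hours: float,
) -> float:
    """Estimate the cost of a dedicated deployment billed per GPU-hour.

    Assumes billed GPU-hours = gpus_per_replica * replicas * hours.
    gpu_hour_rate is a placeholder; the actual rate depends on the device SKU.
    """
    if min(gpu_hour_rate, gpus_per_replica, replicas, hours) < 0:
        raise ValueError("all inputs must be non-negative")
    gpu_hours = gpus_per_replica * replicas * hours
    return gpu_hour_rate * gpu_hours
```

With a made-up rate of 2.00 USD per GPU-hour, two single-GPU replicas running for 24 hours come to `dedicated_cost_usd(2.00, 1, 2, 24)`, which is 96.0 USD.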

Batch

Asynchronous batch inference for offline workloads at 50% off serverless rates.

Tokens (Input + Output) (token · usage) 50PctOffServerless USD
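The 50% batch discount can be worked through as follows. This is a sketch that assumes serverless pricing is quoted per million tokens with separate input and output rates; the function name, parameter names, and the rates in the usage note are hypothetical, and only the 50% figure comes from the plan description.

```python
BATCH_DISCOUNT = 0.50  # "50% off serverless rates", per the Batch plan description


def batch_cost_usd(
    input_tokens: int,
    output_tokens: int,
    serverless_input_rate_per_million: float,
    serverless_output_rate_per_million: float,
) -> float:
    """Estimate batch cost by discounting the equivalent serverless cost.

    The per-million-token rates are placeholders supplied by the caller.
    """
    serverless_cost = (
        (input_tokens / 1_000_000) * serverless_input_rate_per_million
        + (output_tokens / 1_000_000) * serverless_output_rate_per_million
    )
    return serverless_cost * (1 - BATCH_DISCOUNT)
```

With made-up serverless rates of 1.00 USD per million input tokens and 2.00 USD per million output tokens, a job with 2,000,000 input tokens and 1,000,000 output tokens would cost 4.00 USD on serverless and 2.00 USD as a batch job.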

Enterprise

Custom contracts for large-scale tokenmaxxing customers.

Contract (contract · year) Call USD