BentoML · Pricing Plans

BentoML Plans Pricing

This profile describes the BentoCloud pricing tiers for BentoML's managed AI inference platform. The open-source BentoML framework is free under the Apache 2.0 license. BentoCloud bills for the compute actually consumed, metered per second, and does not charge for deployments that are scaled to zero.
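As a rough illustration of per-second metering with scale-to-zero, the sketch below totals a bill from a list of usage intervals. The record shape (`UsageInterval`), the function name `compute_cost`, and the hourly rate of 3.60 USD are all hypothetical, chosen only to make the arithmetic easy to follow; they are not BentoCloud's actual rates or billing API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageInterval:
    """One stretch of a deployment's lifetime (hypothetical record shape)."""
    seconds: int   # duration of the interval, in whole seconds
    replicas: int  # replicas running during the interval; 0 means scaled to zero


def compute_cost(intervals, hourly_rate_usd):
    """Sum per-second charges; intervals with zero replicas add nothing."""
    per_second_rate = hourly_rate_usd / 3600
    billable_seconds = sum(i.seconds * i.replicas for i in intervals)
    return billable_seconds * per_second_rate


# Assumed usage: 30 minutes on one replica, 2 hours scaled to zero,
# then 15 minutes on two replicas.
usage = [
    UsageInterval(seconds=1800, replicas=1),
    UsageInterval(seconds=7200, replicas=0),
    UsageInterval(seconds=900, replicas=2),
]

# The rate is an assumption for illustration, not a published price.
cost = compute_cost(usage, hourly_rate_usd=3.60)
print(f"Estimated charge: ${cost:.2f}")
```

In this example only 3,600 replica-seconds are billable, even though the deployment existed for nearly three hours, because the two idle hours at zero replicas are not charged.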

BentoML Plans Pricing is the machine-readable pricing-plan profile for BentoML on the APIs.io network, conforming to the API Commons Plans specification.

It defines three plans: Starter, Scale, and Enterprise.

Tagged areas include machine learning, model serving, inference, AI, and REST API.

Tags: machine learning, model serving, inference, AI, REST API, MLOps, deployment, GPU, LLM, BentoCloud

Plans

Starter

For small teams of developers who want to focus on building AI applications without managing infrastructure. Billed monthly by credit card, based on total usage.

Scale

For teams requiring formal SLAs, cold-start guarantees, and uptime targets. Includes all Starter plan features with enhanced support commitments.

Enterprise

For teams that want to run BentoCloud in their own cloud or on-premises environment (Bring Your Own Cloud, or BYOC). Tailored for organizations that require data privacy, compliance, and full infrastructure control.