Koyeb Serverless GPUs: Slashing Prices on A100, H100, and L40S by Up to 24%

At Koyeb, we provide high-performance serverless infrastructure to run and scale dedicated GPU containers in production with zero configuration required. No ops, no servers, no infrastructure management. We automatically scale GPUs up when needed and down to zero when idle.

GPUs are expensive. Everyone knows it. We believe high-performance AI infrastructure should be accessible to everyone: from teams running hundreds of GPUs, who need a cost-effective way to scale on different criteria while keeping costs under control, to AI builders who want occasional capacity to experiment, deploy, and fine-tune models.

Today, we’re excited to share that we’re dropping our prices by up to 24% on NVIDIA’s L40S, A100, and H100 GPUs, making high-performance AI workloads even more accessible on Koyeb.

Making GPU Power Accessible to Every Builder

GPUs are one of the biggest infrastructure costs in AI, often out of reach for many teams. At Koyeb, we’re changing that by making high-performance compute accessible to every builder and organization.

Our serverless GPU platform scales from zero to hundreds of replicas automatically, so you only pay for what you use. No idle costs, no infrastructure management.

Now, we’re going further. With reduced prices across our most popular NVIDIA GPUs, it’s easier than ever to experiment, fine-tune, and deploy AI workloads at scale, faster, more efficiently, and without fear of the bill.

Up to 32% More Compute on L40S, A100, and H100 GPUs

TL;DR: Lower prices. More compute. Same serverless experience.

Serverless GPU Instance	Previous Price	New Price	Price Reduction	VRAM	vCPU	RAM
L40S	$1.55/hour	$1.20/hour	23%	48 GB	15	64 GB
A100	$2.00/hour	$1.60/hour	20%	80 GB	15	180 GB
H100	$3.30/hour	$2.50/hour	24%	80 GB	15	180 GB

All GPU instances are billed by the second.
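To see how the headline numbers relate, here is a minimal Python sketch that derives the price reduction and the extra GPU time per dollar from the prices in the table above, and estimates the cost of a short run under per-second billing. The function names are illustrative and not part of any Koyeb API, and the cost estimate assumes the hourly rate is simply prorated per second.

```python
# Old and new hourly prices in USD, taken from the table above.
PRICES = {
    "L40S": (1.55, 1.20),
    "A100": (2.00, 1.60),
    "H100": (3.30, 2.50),
}


def price_reduction_pct(old: float, new: float) -> float:
    """Percentage by which the hourly price dropped."""
    return (old - new) / old * 100


def extra_compute_pct(old: float, new: float) -> float:
    """Percentage of additional GPU time the same budget buys at the new price."""
    return (old / new - 1) * 100


def billed_cost(price_per_hour: float, seconds: int) -> float:
    """Cost of a run billed by the second (assumes the hourly rate is prorated)."""
    return price_per_hour / 3600 * seconds


for gpu, (old, new) in PRICES.items():
    print(
        f"{gpu}: {price_reduction_pct(old, new):.0f}% cheaper, "
        f"{extra_compute_pct(old, new):.0f}% more GPU time per dollar"
    )

# A hypothetical 90-second inference burst on one H100 at the new price.
print(f"90 s on H100: ${billed_cost(2.50, 90):.4f}")
```

For the H100, a 24% lower price works out to roughly 32% more GPU time for the same spend, which is where the "up to 32% more compute" figure comes from.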

Deploy AI workloads on Koyeb

Run your AI workloads on high-performance serverless GPUs. Enjoy native autoscaling and Scale-to-Zero.

More Than Just a Price Drop: An Efficient and Scalable Serverless Platform

With this change, you can stretch your computing budget while keeping all the benefits of serverless:

Deploy containers from any registry using the Koyeb API, CLI, or control panel.
Scale-to-Zero with reactive autoscaling based on requests per second, concurrent connections, or P95 response time.
Pay only for what you use, per second.
Dedicated GPU performance without managing underlying infrastructure.
Built-in observability, metrics, and more.
One unified platform for all your AI workloads to deploy and scale across GPUs, CPUs, and accelerators.

Build fast. Experiment more. Scale on-demand.
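To illustrate why Scale-to-Zero and per-second billing matter for the bill, the sketch below compares an always-on GPU with one that only runs while there is traffic. The workload (an H100 endpoint busy 15 minutes per hour during an 8-hour day) is a made-up example, the function names are hypothetical, and the model ignores any startup time a replica needs after scaling from zero.

```python
H100_PRICE_PER_HOUR = 2.50  # USD, new price from this post


def always_on_cost(price_per_hour: float, hours: float) -> float:
    """Cost of keeping one GPU instance running for the whole period."""
    return price_per_hour * hours


def scale_to_zero_cost(price_per_hour: float, busy_seconds: list[int]) -> float:
    """Cost when only the seconds with an active replica are billed.

    Simplifying assumption: startup time after scaling from zero is ignored.
    """
    return sum(busy_seconds) * price_per_hour / 3600


# Hypothetical day: 8 working hours with 900 busy seconds each, idle otherwise.
busy = [900] * 8

print(f"Always on (24 h):  ${always_on_cost(H100_PRICE_PER_HOUR, 24):.2f}")
print(f"Scale-to-Zero:     ${scale_to_zero_cost(H100_PRICE_PER_HOUR, busy):.2f}")
```

Under these assumptions the endpoint is billed for 2 GPU-hours instead of 24; real savings depend on your traffic pattern and how quickly replicas start.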

Multi-GPU A100 and H100 Configurations

The price drop applies to multi-GPU configurations, including 2×, 4×, and 8× A100 and H100 setups. Whether you’re running inference or fine-tuning ML models, you can now leverage multi-GPU configurations at 20% (A100) to 24% (H100) below the previous price.

Combined with Scale-to-Zero and autoscaling, these lower prices on H100 and A100 serverless GPUs mean you get huge gains in efficiency and more compute for less.

Serverless GPU Instance	Price Per Hour	VRAM	vCPU	RAM
A100	$1.60/hour	80 GB	15	180 GB
2× A100	$3.20/hour	160 GB	30	360 GB
4× A100	$6.40/hour	320 GB	60	720 GB
8× A100	$12.80/hour	640 GB	120	1.44 TB
H100	$2.50/hour	80 GB	15	180 GB
2× H100	$5.00/hour	160 GB	30	360 GB
4× H100	$10.00/hour	320 GB	60	720 GB
8× H100	$20.00/hour	640 GB	120	1.44 TB
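The table above scales linearly with the number of GPUs: price, VRAM, vCPU, and RAM are each the single-GPU value multiplied by the GPU count. A short sketch of that relationship, useful for budgeting a run, is shown below. The `instance_spec` and `job_cost` helpers are hypothetical names for illustration, and the 3-hour fine-tuning job is an invented example.

```python
# Per-GPU figures from the table above.
PRICE_PER_GPU_HOUR = {"A100": 1.60, "H100": 2.50}  # USD
VRAM_GB_PER_GPU = 80
VCPU_PER_GPU = 15
RAM_GB_PER_GPU = 180


def instance_spec(gpu: str, count: int) -> dict:
    """Resources and hourly price of a multi-GPU instance, scaled linearly."""
    return {
        "price_per_hour": PRICE_PER_GPU_HOUR[gpu] * count,
        "vram_gb": VRAM_GB_PER_GPU * count,
        "vcpu": VCPU_PER_GPU * count,
        "ram_gb": RAM_GB_PER_GPU * count,
    }


def job_cost(gpu: str, count: int, hours: float) -> float:
    """Estimated cost of running a job for the given number of hours."""
    return instance_spec(gpu, count)["price_per_hour"] * hours


for count in (1, 2, 4, 8):
    spec = instance_spec("H100", count)
    print(f"{count}x H100: ${spec['price_per_hour']:.2f}/hour, {spec['vram_gb']} GB VRAM")

# Hypothetical 3-hour fine-tuning run on 8x H100.
print(f"3 h on 8x H100: ${job_cost('H100', 8, 3):.2f}")
```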

Get Started Today

Getting started with GPU-powered AI on Koyeb is effortless. With 1-click deployments for popular models and frameworks, you can launch and scale GPU workloads in minutes.

Koyeb 1-Click Deployments

As of today, Koyeb Serverless GPUs are available at up to 24% lower prices, so you can build, experiment, and scale to zero without fear of the bill.

👉 Try Koyeb Serverless GPUs today