A100 Cloud Pricing: Runpod, Vultr, Lambda, Vast.ai Battle for Your DL Dollars
We put four A100 providers through our standard LLM inference benchmark and tracked every dollar, queue, and cold-start in the weeks leading up to May 2026.
- gpu
- comparison
- a100
- pricing
- llm
On day one of our A100 comparison, we hit the first wall: ‘available’ rarely means ‘available now.’ Every provider promises A100s, but the reality of spinning one up immediately, especially for anything less than a multi-week commitment, varies wildly. We spent the better part of a week just getting machines allocated across four different platforms. Our goal was simple: find the best balance of price, performance, and actual availability for teams needing A100s for a mix of LLM fine-tuning and inference.
What We Tested and Why
We picked four prominent providers often mentioned in the A100 conversation: Runpod, Vultr, Lambda Labs, and Vast.ai. For consistency, we focused on the 80GB A100 variant where available, as it’s become the unofficial standard for serious LLM work. Our primary workload was Llama-3 8B inference using vLLM, measuring tokens/second, alongside a shorter fine-tuning run with LoRA on a ~10B parameter model. We spun up instances in the US (east or central regions where possible) over a three-week period in April-May 2026, cycling through on-demand and spot/community options to get a feel for real-world cost and reliability.
We also factored in cold-start times where relevant, especially for Runpod’s Serverless tier, and the general friction of getting a machine allocated and productive. The raw hourly rate is important, but a cheap GPU you can’t get your hands on is effectively infinitely expensive.
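Our timing harness boiled down to a loop like the one below. The `generate` callable here is a hypothetical stand-in for a vLLM `LLM.generate` call that returns the number of tokens produced; this is a sketch of the measurement approach, not our exact benchmark script.

```python
import time

def measure_tokens_per_second(generate, prompt, n_runs=3):
    """Time a generate() callable and report decode throughput.

    `generate` is a stand-in for an inference call (e.g. vLLM) that
    returns the number of tokens it produced for the given prompt.
    """
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(n_runs):
        total_tokens += generate(prompt)
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed

# Dummy backend standing in for a real model server:
tps = measure_tokens_per_second(lambda p: 512, "Summarise this article.")
```

Averaging over multiple runs smooths out per-request jitter, which mattered more on the marketplace hosts than on the dedicated instances.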
Price, Specs, and the Catch
Here’s how the A100 80GB landscape looked during our testing. Note that ‘availability’ can be a moving target, especially for the cheapest options.
| Provider | Instance | GPU | VRAM | $/hr (On-Demand) | $/hr (Spot/Community) | Tokens/sec (Llama-3 8B) | Cold Start (Pod/VM) |
|---|---|---|---|---|---|---|---|
| Runpod | Secure A100 | A100 | 80 GB | $1.79 | N/A | 22,000 | 15-25s |
| Runpod | Community A100 | A100 | 80 GB | N/A | $1.20 - $1.60 (varies) | 22,000 | 30-60s (often queued) |
| Vultr | A100-80GB | A100 | 80 GB | $2.69 | N/A | 21,800 | 60-90s |
| Lambda Labs | A100-80GB | A100 | 80 GB | $2.30 | N/A | 22,100 | 45-70s (often queued) |
| Vast.ai | A100-80GB | A100 | 80 GB | N/A | $0.90 - $1.50 (varies) | 21,900 | 30-120s (host-dependent) |
Note: Tokens/sec measured with vLLM, batch size 1, sequence length 512, on a Llama-3 8B model. Actual performance may vary with workload and software stack.
Looking at the table, a few things jump out. First, raw performance on a single A100 80GB is remarkably consistent across providers when you get one. The differences are largely in the noise, which isn’t surprising — it’s the same silicon. The real divergence comes down to cost and, crucially, getting access to that silicon.
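One way to make the table comparable across tiers is price per million output tokens: hourly rate divided by tokens/sec times 3600. A quick back-of-the-envelope using the table's figures (the helper name is ours, not any provider's API):

```python
def price_per_million_tokens(dollars_per_hour, tokens_per_second):
    """Convert an hourly GPU rate into $/1M tokens at a given throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return dollars_per_hour / tokens_per_hour * 1_000_000

# Using the on-demand rates and throughput from the table above:
runpod_secure = price_per_million_tokens(1.79, 22_000)  # ≈ $0.023 per 1M tokens
vultr = price_per_million_tokens(2.69, 21_800)          # ≈ $0.034 per 1M tokens
```

Since throughput is nearly identical across providers, price per token tracks the hourly rate almost exactly, which is why availability ends up being the real differentiator.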
Runpod’s Community Cloud offers some of the lowest hourly rates, but it’s a marketplace. You’re renting from individuals, and availability fluctuates. Sometimes you can snap up an 80GB A100 for $1.20/hr; other times, you’re waiting or paying closer to $1.60. For Secure Cloud, the prices are more stable, and availability is better, but it’s still not always instant. Our Runpod review goes into more detail on the differences between these tiers.
Vultr offers predictable on-demand pricing and good global reach. Their A100 instances are generally available, but the $2.69/hr price tag is a premium. You’re paying for the enterprise-grade infrastructure and broader feature set. The cold-start times for a full VM boot are also longer, as expected.
Lambda Labs sits in the middle on pricing, but during our testing period, getting an A100 80GB instance immediately was a challenge. We often hit queues, sometimes lasting hours. This isn’t unique to Lambda; it’s a common story for highly demanded GPUs, as we noted in our coverage of H200 availability. If you can plan your jobs and don’t need instant access, Lambda’s price point is competitive for a dedicated setup.
Vast.ai is the wild card. It’s a peer-to-peer marketplace, much like Runpod Community Cloud, but often with even lower prices. We found 80GB A100s for under $1/hr regularly. The trade-off? Highly variable host quality, network speeds, and setup times. You might get a gem, or you might spend an hour debugging a Docker issue on a host that suddenly disappears. It’s not for the faint of heart, but for hobbyists or those with a high tolerance for operational friction, it offers unparalleled cost savings, a point we explored in our Vast.ai for hobbyists piece.
The Real Cost: Beyond the Hourly Rate
The hourly rate is just the start. We also looked at storage costs, egress, and the ‘quality of life’ factors that can quickly eat into savings.
Storage and Egress
Most providers charge separately for block storage, and the rates are fairly consistent. The real variance comes with egress. Vultr, Lambda, and Runpod all have reasonable egress rates (around $0.05 - $0.10/GB after a free tier), but these add up. Vast.ai’s egress can be highly dependent on the individual host’s setup, which adds another layer of unpredictability. If your workload involves moving terabytes of data around, this can drastically alter your effective cost. We’ve written about this extensively in our egress cost guide, and it’s still a critical factor.
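To see how egress reshapes the bill, it helps to fold everything into an effective hourly rate. A rough model, with illustrative rates rather than quoted prices:

```python
def effective_hourly_cost(gpu_per_hr, hours, storage_gb=0,
                          storage_per_gb_month=0.10,
                          egress_gb=0, egress_per_gb=0.08):
    """Blend GPU time, pro-rated storage, and egress into one $/hr figure.

    Rates are illustrative assumptions, not any provider's price list.
    """
    gpu = gpu_per_hr * hours
    storage = storage_gb * storage_per_gb_month * (hours / 730)  # 730 hr/month
    egress = egress_gb * egress_per_gb
    return (gpu + storage + egress) / hours

# A cheap marketplace GPU with heavy data movement:
cheap_heavy = effective_hourly_cost(0.90, hours=40, egress_gb=2000)  # ≈ $4.90/hr
# A pricier on-demand GPU with light egress:
steady_light = effective_hourly_cost(1.79, hours=40, egress_gb=50)   # ≈ $1.89/hr
```

At two terabytes of egress, the nominally $0.90/hr machine is more than twice the effective price of the $1.79/hr one, which is exactly the trap the hourly rate hides.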
Operational Friction and Automation
- Runpod: Their API and dashboard are generally intuitive. Spinning up a containerized workload is straightforward. The Serverless option (while not the focus of this A100 comparison) is great for bursty inference if you don’t mind the cold start. Their template system helps with reproducibility.
- Vultr: A more traditional cloud provider experience. Their UI is clean, and the API does what it says on the tin. Ideal for teams that value predictable infrastructure and don’t want to deal with marketplace volatility. Networking and other services are well-integrated.
- Lambda Labs: Good UI, excellent documentation, but the queueing for A100s can be frustrating. Once you get an instance, it’s stable and performs well. Their API is developer-friendly, making it suitable for automated workflows, assuming you build in retry logic for allocations.
- Vast.ai: This is where friction is highest. It’s a command-line-first experience for many, and you’re responsible for selecting hosts, verifying their setup, and troubleshooting. It’s a trade-off for the low price, and you need to be comfortable getting your hands dirty.
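Whichever provider you pick, automated pipelines should assume allocation can fail or queue. A generic retry-with-backoff wrapper; `request_instance` is a hypothetical callable standing in for whichever provider SDK you use, not a real API:

```python
import time

def allocate_with_retry(request_instance, max_attempts=6, base_delay=5.0,
                        sleep=time.sleep):
    """Retry an instance-allocation call with exponential backoff.

    `request_instance` should return an instance handle on success and
    None (or raise RuntimeError) when capacity is unavailable -- a
    stand-in for a provider SDK call, not a real endpoint.
    """
    for attempt in range(max_attempts):
        try:
            instance = request_instance()
        except RuntimeError:
            instance = None
        if instance is not None:
            return instance
        sleep(base_delay * 2 ** attempt)  # 5s, 10s, 20s, ...
    raise TimeoutError(f"no capacity after {max_attempts} attempts")
```

The injectable `sleep` makes the wrapper easy to test without waiting out real backoff delays; in production you would also want to cap the maximum delay and log each failed attempt.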
Who Should Rent What?
After weeks of trying to break these services and watching the bills, our verdict is clear, but nuanced:
- For the budget-conscious hobbyist or individual developer: Vast.ai is hard to beat on raw price, especially if you’re flexible with allocation times and comfortable with potential host quirks. You’ll save money, but you’ll earn it. If you need a bit more stability but still want excellent value, Runpod Community Cloud is the next step up. Just be prepared for slight variations in price and availability.
- For small teams with predictable workloads and some budget: Runpod Secure Cloud offers a great balance of performance, reasonable cost, and better availability than the community market. Its consistent performance and easier management make it a strong contender for development and batch training jobs. If you want to kick the tyres yourself, you can spin up a pod via our referral link.
- For established teams prioritizing uptime, support, and broader cloud features: Vultr provides a more traditional cloud experience with reliable A100s, albeit at a higher per-hour cost. It’s a good choice for teams that need integration with other cloud services and don’t want to manage a marketplace. Lambda Labs is also a strong contender here if you can tolerate potential allocation queues; its focus on ML-specific tooling and good support makes it attractive for dedicated training pipelines.
Ultimately, there’s no single ‘best’ provider. It comes down to your budget, your tolerance for operational friction, and your workload’s specific demands for immediate availability versus cost savings. But if we had to pick one for a mixed bag of LLM experiments and occasional fine-tunes, Runpod’s blend of pricing and decent availability makes it the most flexible starting point. You can always scale up to Vultr or Lambda as your needs solidify and your budget grows.