
How to actually benchmark a server: our standard playbook

The exact suite we run on every provider review. Steal it, run it on your own boxes, send us yours.

Tobias 12 min read

Every review on this site uses the same benchmark suite. We get asked what’s in it often enough that it’s worth publishing the whole playbook.

CPU

We run sysbench cpu --threads=N --time=120 run five times, where N is the box’s vCPU count. We report events per second and the standard deviation across the five runs.
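If you want to script this step, here is a minimal sketch in Python (3.9+), not our actual harness. It builds the command above, pulls the "events per second" figure out of sysbench’s text report, and summarizes five samples. The function names are made up for illustration, and the use of the sample standard deviation (rather than the population one) is an assumption.

```python
import re
import statistics
import subprocess

RUNS = 5  # five runs, as described above


def sysbench_cpu_argv(vcpus: int, seconds: int = 120) -> list[str]:
    """Build the sysbench command line for a box with `vcpus` vCPUs."""
    return ["sysbench", "cpu", f"--threads={vcpus}", f"--time={seconds}", "run"]


# sysbench prints a line such as "    events per second:  1234.56".
_EVENTS_PER_SECOND = re.compile(r"events per second:\s*([0-9.]+)")


def parse_events_per_second(report: str) -> float:
    """Extract the events-per-second figure from one sysbench report."""
    match = _EVENTS_PER_SECOND.search(report)
    if match is None:
        raise ValueError("no 'events per second' line found in report")
    return float(match.group(1))


def summarize(samples: list[float]) -> tuple[float, float]:
    """Return (mean, sample standard deviation) of the per-run figures."""
    return statistics.mean(samples), statistics.stdev(samples)


def run_cpu_benchmark(vcpus: int) -> tuple[float, float]:
    """Run sysbench RUNS times and summarize. Requires sysbench on PATH."""
    samples = []
    for _ in range(RUNS):
        result = subprocess.run(
            sysbench_cpu_argv(vcpus), capture_output=True, text=True, check=True
        )
        samples.append(parse_events_per_second(result.stdout))
    return summarize(samples)
```

Calling run_cpu_benchmark(os.cpu_count()) on the box itself is one way to pick N, assuming the OS reports the same vCPU count the provider advertises.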

Disk

fio with two profiles: 4k random read at queue depth 32 (QD32), and 1M sequential write. Both run against the boot volume, mounted with the provider’s default options. We also record the filesystem’s mount options, because nobody else seems to.
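As a rough sketch, the two profiles and the mount-option note could be expressed like this in Python. Only the access pattern, block size, and QD32 come from the text above. The I/O engine, direct I/O, test file size, runtime, the queue depth for the sequential write, and the target path are all assumptions you should adjust, and the function names are hypothetical.

```python
# Profile parameters. The sequential-write iodepth of 1 is an assumption;
# the playbook text only fixes QD32 for the random-read profile.
PROFILES = {
    "randread-4k-qd32": {"rw": "randread", "bs": "4k", "iodepth": 32},
    "seqwrite-1m": {"rw": "write", "bs": "1M", "iodepth": 1},
}


def fio_argv(profile: str, target: str, size: str = "4G", seconds: int = 60) -> list[str]:
    """Build a fio command line for one profile against a file on the boot volume."""
    p = PROFILES[profile]
    return [
        "fio",
        f"--name={profile}",
        f"--filename={target}",   # hypothetical test file path
        f"--rw={p['rw']}",
        f"--bs={p['bs']}",
        f"--iodepth={p['iodepth']}",
        "--ioengine=libaio",      # assumption: Linux async I/O engine
        "--direct=1",             # assumption: bypass the page cache
        f"--size={size}",         # assumption
        f"--runtime={seconds}",   # assumption
        "--time_based",
        "--output-format=json",
    ]


def mount_options(mount_point: str, mounts_text: str) -> tuple[str, list[str]]:
    """Return (filesystem type, options) for a mount point.

    `mounts_text` is the content of /proc/mounts, whose lines look like:
    "/dev/sda1 / ext4 rw,relatime 0 0".
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            return fields[2], fields[3].split(",")
    raise LookupError(f"{mount_point} not found in mounts table")
```

On Linux, mount_options("/", open("/proc/mounts").read()) gives you the filesystem and options to record next to the fio numbers.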

Network

iperf3 to a fixed Hetzner endpoint in FSN1, three samples in each direction. We use the same endpoint for every provider so the numbers are comparable.
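A minimal sketch of the sampling plan in Python, assuming a TCP test with iperf3’s JSON output (-J) and reverse mode (-R) for the download direction. The hostname below is a placeholder, not our endpoint, and the 10-second test length is an assumption.

```python
import json

ENDPOINT = "iperf.example.net"  # placeholder; substitute your own fixed endpoint
SAMPLES_PER_DIRECTION = 3       # three samples in each direction


def iperf3_argv(host: str, reverse: bool, seconds: int = 10) -> list[str]:
    """Build one iperf3 client invocation. reverse=True measures server-to-client."""
    argv = ["iperf3", "-c", host, "-t", str(seconds), "-J"]
    if reverse:
        argv.append("-R")
    return argv


def sampling_plan(host: str) -> list[list[str]]:
    """Return every command to run: N uploads followed by N downloads."""
    plan = []
    for reverse in (False, True):
        for _ in range(SAMPLES_PER_DIRECTION):
            plan.append(iperf3_argv(host, reverse))
    return plan


def received_mbps(report_json: str) -> float:
    """Read receiver-side throughput in Mbit/s from an iperf3 TCP JSON report."""
    report = json.loads(report_json)
    return report["end"]["sum_received"]["bits_per_second"] / 1e6
```

Reading the receiver-side figure rather than the sender-side one is a design choice here: it reflects what actually arrived.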

Uptime

A 7-day external watch with 1-minute granularity. We report any minute the box failed to respond — even if it was the network’s fault.
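To make the arithmetic concrete: seven days at one probe per minute is 7 × 24 × 60 = 10,080 samples. The sketch below, with hypothetical function names, lists the failed minutes from a sequence of probe results. The availability percentage is an extra convenience we added for illustration, not something the playbook above says we report.

```python
WATCH_MINUTES = 7 * 24 * 60  # 10,080 one-minute probes in a 7-day watch


def failed_minutes(results: list[bool]) -> list[int]:
    """Return the index (minutes since start) of every probe that got no response."""
    return [minute for minute, ok in enumerate(results) if not ok]


def availability_percent(results: list[bool]) -> float:
    """Share of probes that succeeded, as a percentage."""
    if not results:
        raise ValueError("no probe results")
    return 100.0 * sum(results) / len(results)
```

For example, three missed probes over the full watch leave 10,077 of 10,080 successful, which rounds to 99.97%.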

GPU (when applicable)

Llama-3 70B 4-bit inference at batch size 1, SDXL 1024² with default settings, and a fixed fine-tune of a 7B model. Same prompt, same seed, same image set every time.

That’s it. No magic. Reproducibility beats sophistication.