Hardware Reality Check

Choose your path — every wrong choice here costs you hours.

This site exists because the official docs didn't warn us about the hardware reality.

Decision Tree

Question 1: Do you have a GPU with 16GB+ VRAM?

→ Yes: Run locally with Ollama (see Local Hardware below)

→ No: Go to Question 2

Question 2: Can you afford hourly billing for a cloud GPU?

→ Yes: Rent a GPU (see VPS section below)

→ No: Use API services (cheaper upfront)

Question 3: Do you need 24/7 operation?

→ Yes: VPS or dedicated hardware

→ No: Local machine or on-demand cloud
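If you'd rather let a script be honest with you, here's the same tree as code. A minimal sketch: the VRAM check assumes an NVIDIA card with nvidia-smi on PATH (it returns 0 otherwise), and the thresholds are the ones from the questions above.

```python
import shutil
import subprocess

def detect_vram_gb() -> float:
    """Return total VRAM in GB via nvidia-smi, or 0 if no NVIDIA GPU is visible."""
    if shutil.which("nvidia-smi") is None:
        return 0.0
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    # nvidia-smi reports MiB, one line per card; take the largest.
    return max(float(line) for line in out.splitlines()) / 1024

def recommend(vram_gb: float, cloud_budget: bool, always_on: bool) -> str:
    if vram_gb >= 16:
        return "Run locally with Ollama"
    if cloud_budget:
        return "VPS or dedicated hardware" if always_on else "Rent a GPU on demand"
    return "Use API services"

if __name__ == "__main__":
    print(recommend(detect_vram_gb(), cloud_budget=False, always_on=False))
```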

ā˜ļø Cloud GPU (VPS)

The only way to sleep at night if you don't have 24GB+ VRAM.

Vultr High Frequency GPU

āœ“ The only way to sleep at night:

  • No local hardware drama
  • Turn it off when you're done
  • No worrying about electricity bills

āœ— Works, but you will suffer:

  • Long-term 24/7 operation (cost adds up: ~$360/mo; see the math below)
  • Need to transfer data to/from cloud
Billed hourly (A100/A6000); rates vary by GPU and region
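The break-even math is worth running before you fall in love with "turn it off when you're done." The ~$0.50/hr rate below is a placeholder, not a quote; plug in whatever your provider actually charges. At that rate, a ~720-hour month of 24/7 uptime lands at the ~$360/mo above, and a ~$700 used 3090 (see Local Hardware) pays for itself after ~1,400 rented hours.

```python
HOURLY_RATE = 0.50      # placeholder $/hr; use your provider's actual quote
HOURS_PER_MONTH = 720   # 24 h x 30 days

monthly_24_7 = HOURLY_RATE * HOURS_PER_MONTH    # ~$360/mo, matching the figure above
breakeven_hours = 700 / HOURLY_RATE             # vs. a ~$700 used 3090

print(f"24/7 cloud: ${monthly_24_7:.0f}/mo")
print(f"Cloud hours before a used 3090 would've been cheaper: {breakeven_hours:.0f}")
```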

Other Options

RunPod, Lambda Labs, and Vast.ai offer similar GPU rental services. Pricing varies by availability and region.

Always check: actual GPU model, VRAM, and per-hour cost before committing.
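First thing to run after the instance boots: confirm you actually got the card you're paying for. A minimal sketch assuming an NVIDIA instance with nvidia-smi installed (standard on GPU images).

```python
import subprocess

# Ask the driver for the card's name and memory; don't trust the listing page.
out = subprocess.run(
    ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
).stdout

for line in out.strip().splitlines():
    name, mem = (part.strip() for part in line.split(",", 1))
    print(f"GPU: {name} | VRAM: {mem}")
```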

šŸ’» Local Hardware

Buy once, cry once. Or buy cheap, cry every day.

Mac Mini (M4/M4 Pro, 16GB+)

āœ“ The only way to sleep at night:

  • 24/7 operation (low power, silent)
  • Running quantized models (7B-14B)
  • Fully offline, no API costs

āœ— Works, but you will suffer:

  • Running full 32B+ models (not enough VRAM)
  • 3.2 tokens/sec on 8B models (painfully slow; measure yours with the sketch below)
From ~$449 (16GB RAM minimum, 24GB recommended for serious use)
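Don't take our tokens/sec number on faith; measure your own. A minimal benchmark against Ollama's local REST API (default port 11434): the response's eval_count and eval_duration fields give generated tokens and nanoseconds spent. The model tag "llama3.1:8b" is just an example; substitute whatever you've pulled.

```python
import json
import urllib.request

# POST a single non-streaming generation request to the local Ollama server.
req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps({
        "model": "llama3.1:8b",   # example tag; use any model you've pulled
        "prompt": "Explain quantization in one paragraph.",
        "stream": False,
    }).encode(),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)

# eval_count = tokens generated; eval_duration is in nanoseconds.
tok_per_sec = body["eval_count"] / (body["eval_duration"] / 1e9)
print(f"{tok_per_sec:.1f} tokens/sec")
```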

NVIDIA GPU (4060 Ti 16GB+ or used 3090 24GB)

āœ“ The only way to sleep at night:

  • Windows/Linux users
  • Running larger models (up to 32B with 24GB VRAM; see the VRAM math below)
  • CUDA acceleration (fastest option)

āœ— Works, but you will suffer:

  • Users with under 16GB VRAM (see crash logs)
  • Mac users (no CUDA support)
  • Used 3090s are often mined-out or have fan issues
From ~$320 new (16GB) | ~$700 used (3090 24GB - buyer beware)
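Back-of-envelope math on why 24GB handles 32B models and 16GB doesn't: at 4-bit quantization, weights cost roughly half a byte per parameter, plus real-world margin for KV cache and runtime overhead. This is a rough heuristic, not a guarantee; actual usage depends on context length and quantization scheme.

```python
def fits_in_vram(params_b: float, vram_gb: float, bits: int = 4, overhead: float = 1.2) -> bool:
    """Rough heuristic: weights at `bits` per parameter, plus ~20% for KV cache/overhead."""
    weights_gb = params_b * bits / 8   # e.g. 32B params at 4-bit -> ~16 GB of weights
    return weights_gb * overhead <= vram_gb

for params, vram in [(8, 16), (14, 16), (32, 16), (32, 24)]:
    print(f"{params}B on {vram}GB: {'fits' if fits_in_vram(params, vram) else 'nope'}")
```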

šŸ”‘ API Services

The only way to sleep at night if you want zero hardware drama.

DeepSeek API

āœ“ The only way to sleep at night:

  • R1 reasoning without hardware drama
  • Development and testing
  • Casual use (~$1-5/mo)

āœ— Works, but you will suffer:

  • Data goes to their servers (privacy tradeoff)
  • Rate limits during peak hours (9-11AM Beijing)
  • High-volume production (API costs add up fast; see the cost sketch below)
Pay-per-use (~$0.14/M input tokens, ~$0.28/M output tokens)
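To sanity-check the ~$1-5/mo casual figure against your own usage, multiply token volume by the per-million rates above. The volumes below are made-up examples; the rates are the ones quoted above (verify DeepSeek's current pricing before budgeting).

```python
INPUT_PER_M = 0.14    # $/1M input tokens, per the rates above
OUTPUT_PER_M = 0.28   # $/1M output tokens

def monthly_cost(input_tokens_m: float, output_tokens_m: float) -> float:
    """Monthly bill given token volumes in millions."""
    return input_tokens_m * INPUT_PER_M + output_tokens_m * OUTPUT_PER_M

# Hypothetical volumes: casual tinkering vs. a high-volume production service.
print(f"Casual (5M in / 5M out):   ${monthly_cost(5, 5):.2f}/mo")      # ~$2/mo
print(f"Production (500M / 500M):  ${monthly_cost(500, 500):.2f}/mo")  # adds up fast
```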

Not sure if your hardware can handle it?

Read the crash logs before you buy anything. These are real failures from real hardware.

Read Crash Logs →