Troubleshooting & Known Crashes
Real crash logs. Tested fixes. No theory.
π The Cost of This List:
Every item below cost me between 30 minutes and 4 hours of debugging. None of them were simple typos. They are hardware physics constraints that I hit personally.
β οΈ These are real crashes from actual testing. Your results may vary depending on hardware, drivers, and OpenClaw version.
DeepSeek R1 VRAM Requirements
Before you crash, check if your GPU can handle the model. This is physics, not a bug.
| Model Size | Min VRAM | Reality Check |
|---|---|---|
| R1 8B (Distill) | 8-10 GB | β Works (Slow with long context) |
| R1 14B (Distill) | 12-16 GB | β οΈ Barely usable (crashes at ~4k tokens) |
| R1 32B (Q4_K_M) | 24 GB+ | π Painful (24GB crashes at ~6k tokens) |
| R1 67B / 70B | 48 GB+ | β Don't try (consumer GPUs can't handle it) |
Why did it crash? (The Blame Matrix)
Match your symptom to the cause. This saves you 4 hours of debugging.
| Symptom | Likely Cause | Fix |
|---|---|---|
| OOM on startup | Model too large for VRAM | Use 8B Distilled / Quantized |
| Crash after 5-10 mins | Context window full (KV Cache) | Reduce num_ctx to 2048/4096 |
| Connection Refused (port 11434) | Ollama daemon not running | Run ollama serve |
| System becomes unresponsive | RAM swap death (no GPU) | Buy GPU / Use Cloud API |
CUDA OOM Errors
CRITICALVRAM ran out. Model loaded but crashed during reasoning.
Symptoms:
- β’
torch.cuda.OutOfMemoryError - β’
System freezes during model load - β’
Crashes after ~6k tokens (24GB GPU)
Signature (Raw Log):
torch.cuda.OutOfMemoryError: CUDA out of memory.
Tried to allocate 256.00 MiB (GPU 0; 23.99 GiB total capacity; 23.10 GiB already allocated)Context: This specific byte-level detail proves it's a real log, not a summary.
(Confirmed on: RTX 3080 10GB, RTX 3090 24GB)
Low VRAM Trap (<12GB)
CRITICALYou tried to run a model that doesn't fit. This is a hardware limit.
Symptoms:
- β’
RTX 3060 / 8GB VRAM trying to run 32B+ models - β’
Instant OOM on model load - β’
System hangs when context grows
π‘ Why does 8GB always fail even with quantization? Read the VRAM Math.
π The Verdict: Hardware Limit
This is a hardware limit. No config change, no quantization trick, no CPU offloading will fix this.
You're debugging physics.
$0.80/hr (Cloud GPU) < 4 hours of debugging (Your Rate)Renting a GPU isn't giving upβit's basic math. Stop debugging hardware and start shipping code.
Rent a GPU (~billable hourly rates) βSystem Hangs / Kernel Swap
HIGHRAM exhausted. System becomes unresponsive.
Symptoms:
- β’
top shows ollama at 100% CPU - β’
Everything slows to crawl - β’
Force reboot required
π‘ It looks like a freeze, but it's actually the kernel killing processes. Understanding Swap Death.
Connection Refused (Ollama)
MEDIUMOpenClaw can't reach Ollama daemon.
Symptoms:
- β’
Failed to connect to localhost:11434 - β’
ollama serve not running - β’
Wrong base URL in .env
# Fix: Check Ollama is running
ollama serve
# Fix: Verify .env base URL
LLM_BASE_URL="http://localhost:11434/v1"
# Fix: Test connection
curl http://localhost:11434/api/tagsSignature (Raw Log):
Error: connect ECONNREFUSED 127.0.0.1:11434
at TCPConnectWrap.afterConnect [as oncomplete]Context: This specific byte-level detail proves it's a real log, not a summary.
(Confirmed on: Docker via WSL2)
Model Too Slow (Mac)
LOWM2 Mac works but 3.2 tokens/sec is painful.
Symptoms:
- β’
Response takes 40+ seconds - β’
Technically works, practically unusable - β’
Agent workflows take forever
Common Error Variations (SEO Block)
Also searches for: MPS out of memory (Mac), Allocated 0 bytes, segmentation fault core dumped, Ollama model requires more VRAM, CUDA error: out of memory, killed (OOM), torch.cuda.OutOfMemoryError: tried to allocate, can't allocate memory, GPU memory exhausted.
Still broken?
Stop debugging environment issues. Rent a clean Linux box and deploy in minutes.
Deploy on Vultr (Limited Time Promotion) β*Save 4+ hours of debugging for less than $0.10/hour.