Scale-to-Zero Services
Idle services stop and cost nothing, then restart automatically on the first request.

    service running → 5min idle → stop → memory freed → $0
    new request → socket missing → spawn → route

1. Single-tenant app (no hibernation logic)
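
    from flask import Flask
    import os

    app = Flask(__name__)

    # Health endpoint, matched by the `health` setting in the tenement config.
    @app.route("/health")
    def health():
        return {"status": "ok"}

    # expensive_computation() stands in for the app's real workload (defined elsewhere).
    @app.route("/work", methods=["POST"])
    def work():
        return {"result": expensive_computation()}

    if __name__ == "__main__":
        # tenement sets PORT; fall back to 8000 for local runs.
        port = int(os.getenv("PORT", "8000"))
        app.run(host="127.0.0.1", port=port)

2. Configure tenement with idle timeout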

    [service.worker]
    command = "python app.py"
    health = "/health"
    idle_timeout = 300  # Stop after 5 minutes idle
    # Note: PORT env var is automatically set by tenement

When idle_timeout expires:
- Instance stops
- Socket is removed
- Memory is freed
- Cost: $0
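
The idle check itself can be pictured as a small timer that is reset on every request. The sketch below is illustrative only and is not tenement's implementation: `IdleTracker`, `touch`, and `should_stop` are hypothetical names, and the injectable `clock` exists only so the behaviour can be exercised without waiting five minutes.

```python
import time
from typing import Callable


class IdleTracker:
    """Tracks the last request time and reports when the idle timeout has passed.

    Hypothetical helper for illustration; not part of tenement.
    """

    def __init__(self, idle_timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.idle_timeout = idle_timeout  # seconds, e.g. 300 as in the config above
        self._clock = clock
        self._last_activity = clock()

    def touch(self) -> None:
        """Record activity; a supervisor would call this on every routed request."""
        self._last_activity = self._clock()

    def should_stop(self) -> bool:
        """Return True once no request has arrived for idle_timeout seconds."""
        return self._clock() - self._last_activity >= self.idle_timeout
```

A supervisor loop would call `touch()` whenever it routes a request and periodically check `should_stop()`; when it returns True, the instance is stopped and its socket removed.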
3. Spawn per job/request

    ten spawn worker --id job123

4. Wake on request
tenement’s routing detects the missing socket and spawns the instance automatically:

    job123.api.example.com → socket missing → spawn worker:job123 → route

Why This Works
- Zero cost when idle - Stopped services use no memory or CPU
- Instant wake - First request typically spawns in 65-220ms (imperceptible)
- No app changes - Service code has zero hibernation logic
- Simple scaling - Just:

    ten spawn worker --id $job_id
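
The "socket missing → spawn → route" decision can be sketched in a few lines. This is a minimal illustration under stated assumptions, not tenement's actual router: the socket path layout (`<socket_dir>/<service>-<id>.sock`) and the helper names `spawn_command`, `run_spawn`, and `ensure_instance` are hypothetical. The only thing taken from this page is the `ten spawn <service> --id <id>` command.

```python
import subprocess
from pathlib import Path
from typing import Callable, List


def spawn_command(service: str, instance_id: str) -> List[str]:
    """Build the argv for `ten spawn <service> --id <instance_id>`."""
    return ["ten", "spawn", service, "--id", instance_id]


def run_spawn(service: str, instance_id: str) -> None:
    """Default spawner: shell out to the ten CLI (assumed to be on PATH)."""
    subprocess.run(spawn_command(service, instance_id), check=True)


def ensure_instance(
    service: str,
    instance_id: str,
    socket_dir: Path,
    spawn: Callable[[str, str], None] = run_spawn,
) -> Path:
    """Return the instance's socket path, spawning the instance if the socket is missing."""
    # Hypothetical layout: <socket_dir>/<service>-<instance_id>.sock
    socket_path = socket_dir / f"{service}-{instance_id}.sock"
    if not socket_path.exists():
        # Cold start: the socket is missing, so spawn before routing.
        spawn(service, instance_id)
    # The caller then forwards the request to this socket.
    return socket_path
```

Passing `spawn` as a parameter keeps the routing decision separate from how an instance is started, so the same check works whether the spawn goes through the CLI or some other mechanism.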
Economics
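The figures in this section follow from simple arithmetic. The sketch below reproduces them from the same inputs (1000 services, 20MB each, ~2% active). It assumes the $500/month figure is the total for the ten machines, which is the reading that yields the 100x ratio. The function name and its parameters are illustrative.

```python
def footprint(services: int, mb_per_service: int, active_percent: int) -> dict:
    """Compare always-on memory with scale-to-zero memory for the same fleet."""
    running = services * active_percent // 100  # instances awake at any moment
    return {
        "always_on_mb": services * mb_per_service,
        "running": running,
        "scale_to_zero_mb": running * mb_per_service,
    }


result = footprint(services=1000, mb_per_service=20, active_percent=2)

# Assumption: $500/month is the total for 10 machines, $5/month for 1 machine.
traditional_cost = 500
scale_to_zero_cost = 5
savings_factor = traditional_cost // scale_to_zero_cost

print(result["always_on_mb"] / 1000, "GB always-on")      # 20.0 GB always-on
print(result["scale_to_zero_mb"], "MB with scale-to-zero")  # 400 MB with scale-to-zero
print(f"{savings_factor}x cheaper")                         # 100x cheaper
```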

    Traditional: 1000 services always-on
    ├── 20MB per service = 20GB RAM
    └── Cost: 10 machines @ $500/month total

    Scale-to-zero: 1000 services, ~2% active
    ├── 20 running × 20MB = 400MB RAM
    └── Cost: 1 machine @ $5/month

    Savings: 100x cheaper

Cold Start Reality
Typical wake time: 65-220ms
- Process spawn: 5-10ms
- App startup: 50-200ms
- Network round-trip: 5ms
Humans perceive delays of up to ~250ms as instant, so a cold start in this range is imperceptible to users.