Scale-to-Zero Services

Idle services stop and cost nothing, then restart automatically on the next request.

service running → 5min idle → stop → memory freed → $0
new request → socket missing → spawn → route

1. Single-tenant app (no hibernation logic)

import os

from flask import Flask

app = Flask(__name__)

def expensive_computation():
    # Placeholder: stands in for the app's real workload.
    return "done"

@app.route("/health")
def health():
    return {"status": "ok"}

@app.route("/work", methods=["POST"])
def work():
    return {"result": expensive_computation()}

if __name__ == "__main__":
    # tenement sets PORT; fall back to 8000 for local runs.
    port = int(os.getenv("PORT", "8000"))
    app.run(host="127.0.0.1", port=port)

2. Configure tenement with idle timeout

[service.worker]
command = "python app.py"
health = "/health"
idle_timeout = 300 # Stop after 5 minutes idle
# Note: PORT env var is automatically set by tenement

When idle_timeout expires:

  • Instance stops
  • Socket is removed
  • Memory is freed
  • Cost: $0
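
The expiry steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not tenement's implementation: the `Instance` record, the function names, and the use of a plain file as a stand-in for the socket are all assumptions made for the example.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical record of one running instance (not tenement's real type)."""
    socket_path: str
    last_request_at: float  # seconds on some monotonic clock
    running: bool = True


def is_idle(inst: Instance, now: float, idle_timeout: float) -> bool:
    """True once no request has arrived for at least idle_timeout seconds."""
    return inst.running and (now - inst.last_request_at) >= idle_timeout


def stop_if_idle(inst: Instance, now: float, idle_timeout: float = 300) -> bool:
    """Stop the instance and remove its socket if the timeout has expired."""
    if not is_idle(inst, now, idle_timeout):
        return False
    inst.running = False  # a real supervisor would terminate the process here
    if os.path.exists(inst.socket_path):
        os.remove(inst.socket_path)  # the missing socket later signals "needs spawn"
    return True


if __name__ == "__main__":
    # Demo with a temp file standing in for the instance's socket.
    path = os.path.join(tempfile.mkdtemp(), "worker-job123.sock")
    open(path, "w").close()
    inst = Instance(socket_path=path, last_request_at=0.0)
    print(stop_if_idle(inst, now=120.0))  # still within the 5-minute window
    print(stop_if_idle(inst, now=300.0))  # timeout reached, socket removed
```

Passing `now` in explicitly keeps the check easy to test; a real supervisor would presumably read a monotonic clock on a timer.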

3. Spawn per job/request

ten spawn worker --id job123
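
If jobs are created from application code, the same command can be issued from Python. The sketch below assumes only the `ten spawn <service> --id <id>` form shown above and that the `ten` CLI is on PATH; the helper names `spawn_command` and `spawn` are hypothetical.

```python
import subprocess


def spawn_command(service: str, job_id: str) -> list[str]:
    """Build the argv for `ten spawn <service> --id <job_id>`."""
    if not service or not job_id:
        raise ValueError("service and job_id must be non-empty")
    # An argv list (no shell) avoids quoting problems with unusual ids.
    return ["ten", "spawn", service, "--id", job_id]


def spawn(service: str, job_id: str) -> None:
    """Run the spawn command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(spawn_command(service, job_id), check=True)


if __name__ == "__main__":
    # Only prints the command; call spawn("worker", "job123") to execute it.
    print(" ".join(spawn_command("worker", "job123")))
```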

4. Wake on request

tenement’s routing detects the missing socket and spawns the instance automatically:

job123.api.example.com → socket missing → spawn worker:job123 → route
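
The flow above can be sketched as a small routing function. This is a rough model under stated assumptions, not tenement's router: the socket naming scheme (`<service>-<id>.sock`), the `socket_dir` layout, the `base_domain` parameter, and the injected `spawn` callback are all hypothetical.

```python
import os
import tempfile
from typing import Callable


def parse_host(host: str, base_domain: str) -> str:
    """Extract the instance id from a host such as job123.api.example.com."""
    suffix = "." + base_domain
    if not host.endswith(suffix) or host == suffix:
        raise ValueError(f"{host!r} is not a subdomain of {base_domain!r}")
    return host[: -len(suffix)]


def route(
    host: str,
    base_domain: str,
    service: str,
    socket_dir: str,
    spawn: Callable[[str, str, str], None],
) -> str:
    """Return the socket to forward to, spawning the instance if it is missing."""
    instance_id = parse_host(host, base_domain)
    socket_path = os.path.join(socket_dir, f"{service}-{instance_id}.sock")
    if not os.path.exists(socket_path):
        # Socket missing: the instance is stopped, so wake it first.
        spawn(service, instance_id, socket_path)
    return socket_path


if __name__ == "__main__":
    def demo_spawn(service: str, instance_id: str, socket_path: str) -> None:
        print(f"spawn {service}:{instance_id}")
        open(socket_path, "w").close()  # stand-in for the new process's socket

    sockets = tempfile.mkdtemp()
    for _ in range(2):  # only the first request triggers a spawn
        route("job123.api.example.com", "api.example.com", "worker", sockets, demo_spawn)
```

Only the first request pays the spawn cost; later requests find the socket and go straight through.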

Benefits

  • Zero cost when idle - Stopped services use no memory or CPU
  • Fast wake - First request is typically served within roughly 200ms
  • No app changes - Service code has zero hibernation logic
  • Simple scaling - Just: ten spawn worker --id $job_id

Cost comparison

Traditional: 1000 services always-on
├── 1000 × 20MB = 20GB RAM
└── Cost: 10 machines, $500/month total

Scale-to-zero: 1000 services, ~2% active
├── 20 running × 20MB = 400MB RAM
└── Cost: 1 machine, $5/month

Savings: 50x less RAM, 100x cheaper
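
The arithmetic behind this comparison is easy to check. The sketch below treats $500 and $5 as total monthly bills, which is an assumption chosen because it matches the stated 100x figure; the function and variable names are illustrative only.

```python
def ram_mb(services: int, mb_per_service: int, active_percent: int = 100) -> int:
    """RAM needed for the services that are actually running."""
    running = services * active_percent // 100
    return running * mb_per_service


always_on_mb = ram_mb(1000, 20)          # every service resident
scale_to_zero_mb = ram_mb(1000, 20, 2)   # ~2% of services active

# Assumed monthly totals (not per-machine prices).
always_on_cost = 500
scale_to_zero_cost = 5

ram_ratio = always_on_mb // scale_to_zero_mb
cost_ratio = always_on_cost // scale_to_zero_cost

if __name__ == "__main__":
    print(f"always-on: {always_on_mb} MB, scale-to-zero: {scale_to_zero_mb} MB")
    print(f"RAM ratio: {ram_ratio}x, cost ratio: {cost_ratio}x")
```

The cost ratio is larger than the RAM ratio because the smaller footprint fits on a much cheaper class of machine.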

Typical wake time: 60-215ms

  • Process spawn: 5-10ms
  • App startup: 50-200ms
  • Network round-trip: 5ms

Delays below roughly 250ms generally register as immediate, so most users will not notice the wake-up delay.
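
As a quick check on the breakdown, summing the three listed components gives a range of 60 to 215ms. The sketch below does that arithmetic; the dictionary keys are just labels for the items in the list, and the 250ms threshold is the figure quoted above.

```python
# (low, high) estimate in milliseconds for each step of a cold wake.
components_ms = {
    "process_spawn": (5, 10),
    "app_startup": (50, 200),
    "network_round_trip": (5, 5),
}

low_ms = sum(low for low, _ in components_ms.values())
high_ms = sum(high for _, high in components_ms.values())

PERCEPTION_THRESHOLD_MS = 250  # threshold quoted in the text

if __name__ == "__main__":
    print(f"wake time: {low_ms}-{high_ms} ms")
    print("under threshold:", high_ms < PERCEPTION_THRESHOLD_MS)
```

App startup dominates the budget, so trimming imports and deferring initialization in the service is the most direct way to shorten the wake.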