Why Caddy + user-scope systemd?
One-line answer: Caddy handles TLS, HTTP/3, reverse-proxy, and reload semantics with a 20-line config. systemd --user lets every 1bit systems process restart without root. Snapper catches any mistake. No Docker, no k8s, no YAML jungle.
Why Caddy, not nginx
- ACME + HTTP/3 + TLS 1.3 out of the box. Nginx needs certbot + openresty + a reload cron. Caddy has it built in.
- Config surface we actually touch is 3 directives: `reverse_proxy`, `tls`, `handle_path`. That's the whole Caddyfile.
- Graceful reload on `caddy reload`: no dropped connections, no pid juggling.
- Structured JSON logs to the systemd journal. `journalctl -u caddy -o json` is grep-able.
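To make "grep-able" concrete, here is a minimal sketch of pulling fields out of one journald JSON record. The record below is a hand-written stand-in for `journalctl -u caddy -o json` output; journald wraps each entry in its own JSON object, and Caddy's structured access log rides inside the `MESSAGE` field as a JSON string (the exact inner fields depend on Caddy's log config):

```python
import json

# One record as `journalctl -u caddy -o json` would emit it (shape assumed).
record = '{"_SYSTEMD_UNIT": "caddy.service", "MESSAGE": "{\\"level\\": \\"info\\", \\"status\\": 200, \\"uri\\": \\"/healthz\\"}"}'

entry = json.loads(record)                 # journald envelope
access = json.loads(entry["MESSAGE"])      # Caddy's own structured log line
print(entry["_SYSTEMD_UNIT"], access["status"], access["uri"])
```

Piping the real thing through `jq` works the same way; the point is that every field is addressable, not a regex target.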
We run one Caddyfile at `/etc/caddy/Caddyfile` (root-owned because Caddy binds :443). A placeholder copy lives at `strixhalo/caddy/Caddyfile` with `sk-halo-REPLACE_ME` tokens — never replaced in git.
Why tls internal on LAN
All 1bit systems traffic rides Headscale (100.64.0.0/10). No public DNS, no public cert to manage. Caddy's `tls internal` spins up a tiny CA and mints certs for strixhalo.local, landing.strixhalo.local, etc. Clients trust the Headscale-pinned CA. Zero Let's Encrypt rate-limit risk.
```
landing.strixhalo.local {
    tls internal
    reverse_proxy 127.0.0.1:8190
}
```
If we ever expose a public endpoint, Caddy flips from `tls internal` to ACME automatically. One-line change.
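For concreteness, a public variant of the same site block might look like this (the domain is hypothetical; given a resolvable public name and no `tls internal` directive, Caddy provisions an ACME certificate on its own):

```
landing.example.com {
    reverse_proxy 127.0.0.1:8190
}
```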
Why systemd --user, not system units
- No root for restart. `systemctl --user restart strix-server` is a regular-user op. Scripts in `~/.local/bin` don't need `sudo`.
- All state under `$HOME`. Snapper's rootfs snapshot catches every service change. Rollback to snapshot `#6` undoes a bad unit install without touching `/etc`.
- Isolation. 1bit-server runs as `bcloud`, not `root`. A serve bug can't touch `/etc`.
- Per-user tuning. `~/.config/systemd/user/*.service` is editable without a config manager.
Units live at `strixhalo/systemd/` in git. `install.sh` copies them to `~/.config/systemd/user/` and runs `systemctl --user daemon-reload`.
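As a sketch of what one of those units looks like (the unit name, description, and ExecStart path below are illustrative, not the tracked files):

```ini
# ~/.config/systemd/user/strix-landing.service (illustrative sketch)
[Unit]
Description=Strix landing page
After=network-online.target

[Service]
# %h expands to the user's home directory in systemd units
ExecStart=%h/.local/bin/strix-landing
Restart=on-failure

[Install]
# default.target is the user manager's boot target, so the service
# comes up with the (lingering) user session
WantedBy=default.target
```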
loginctl enable-linger
Without it, user services stop on logout. One command, done forever:
```
loginctl enable-linger bcloud
```
Now 1bit-server + strix-landing + 1bit-mcp survive SSH disconnect, reboot, TTY switch. The box is headless in a closet — lingering is non-negotiable.
Why we do NOT use Docker
Rule A (bare-metal-first, see feedback_bare_metal_first.md) forbids containers in the runtime path. Reasons, in order:
- Cold start. Container boot adds 300-800 ms on top of process start. For a service called by voice pipelines that target <2 s end-to-end, that's a third of the budget.
- Storage driver churn. Overlay2 + Btrfs = unpredictable layer timing under snapper.
- Attack surface. Every daemon (`dockerd`, `containerd`) is another root-owned service to patch. On a home box, we'd rather patch zero.
- GPU passthrough. HIP in a container needs `--device=/dev/kfd` + `--device=/dev/dri` + group mapping. Works, but it's a chore. Bare-metal `1bit-server` sees the GPU directly.
- Debugging. `strace` on bare metal is grep-able; in a container you're layering namespaces.
Caller-side is different — if you want to run 1bit-cli inside Docker on your laptop, fine. The service side stays bare.
Pointers
- Units: `strixhalo/systemd/`
- Caddyfile (tracked): `strixhalo/caddy/Caddyfile`
- Bootstrap: `install.sh`