The problem
Your main OpenClaw instance goes down. Maybe it crashed, maybe the container restarted, maybe you pushed a bad config. You're on your phone. You don't want to SSH in, dig through logs, and run CLI commands from a tiny terminal.
What if you could just text something like "restart my main instance"?
The pattern
Run a second OpenClaw instance — the watcher — on the same machine. Connect it to the same Discord server (or Telegram, or whatever you use). Give it its own channel.
When your main instance dies, you open the watcher's channel and type something like:
"Main claw is down. Restart it."
The watcher has access to the same machine. It can run docker restart, systemctl restart, check logs, report back what happened — all through chat. You never touch a terminal.
That's it. That's the core pattern. Everything else is optional.
Setting it up
OpenClaw's --profile flag isolates everything — config, sessions, workspace — under ~/.openclaw-<name>. The watcher gets its own profile so it's completely independent from your main instance.
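Each profile keeps its state in its own directory, so you can sanity-check the separation directly. The default profile's directory name below is an assumption; adjust it to wherever your main instance keeps its state.

# Main instance state (assuming the default profile lives at ~/.openclaw)
ls ~/.openclaw

# Watcher state under its own profile directory, fully separate
ls ~/.openclaw-watcher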
# Give the watcher its own port so it doesn't conflict with the main instance
openclaw --profile watcher config set gateway.port 18791 --strict-json

# Use a cheap model — the watcher doesn't need to be smart,
# it needs to run shell commands and report back
openclaw --profile watcher config set ai.model "openrouter/meta-llama/llama-3.3-70b-instruct:free"
The main instance runs on the default port (18789). The watcher runs on 18791. They're completely separate processes with separate state.
Create an AGENTS.md in the watcher's workspace so it knows what it's for:
# File: ~/.openclaw-watcher/workspace/AGENTS.md

# Watcher

You are the backup instance. Your job is to monitor and recover the main OpenClaw instance when asked.

The main instance runs as: docker container "openclaw-main"
(or: systemd service "openclaw" — adjust to match your setup)

## What you can do

- Restart the main instance: docker restart openclaw-main
- Check if it's running: docker ps | grep openclaw-main
- Check its health: curl -s http://localhost:18789/health
- Read its logs: docker logs --tail 50 openclaw-main
- Check system resources: free -m, df -h, uptime

## How to behave

- When asked to restart, do it immediately and confirm the result.
- After a restart, check /health to verify it came back.
- If it won't come back after 2 tries, say so clearly.
- Always report what you find — don't just say "done."
Run the OpenClaw setup for the watcher profile and connect it to your chat platform. On Discord, you can use the same bot — just route it to a different channel (like #watcher).
# Run the interactive setup for the watcher profile
openclaw --profile watcher configure
Point it at a dedicated channel. Now you have a chat interface to a backup instance that can poke around on the machine when your main one is offline.
# Test it
openclaw --profile watcher gateway start

# Then make it permanent — pick one:
# Docker:  add a second service in docker-compose.yml
# Systemd: create a service file (example below)
# PM2:     pm2 start "openclaw --profile watcher gateway start" --name watcher
Systemd example:
# /etc/systemd/system/openclaw-watcher.service
[Unit]
Description=OpenClaw Watcher Instance
After=network.target

[Service]
User=your-user
WorkingDirectory=/home/your-user
Environment="HOME=/home/your-user"
ExecStart=/usr/local/bin/openclaw --profile watcher gateway run
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
sudo systemctl enable openclaw-watcher
sudo systemctl start openclaw-watcher
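To confirm the setup, check the service and hit each gateway's /health endpoint; both should return a 200 on their respective ports:

# Watcher service status
sudo systemctl status openclaw-watcher --no-pager

# Main on 18789, watcher on 18791
curl -s -o /dev/null -w "main:    %{http_code}\n" http://localhost:18789/health
curl -s -o /dev/null -w "watcher: %{http_code}\n" http://localhost:18791/health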
That's the full setup. You now have a backup agent you can talk to from your phone when things go wrong.
Optional: automate the health checks
The manual version is already useful — but you can also make the watcher check on the main instance automatically using a cron job.
OpenClaw's gateway exposes a /health endpoint that returns a 200 when things are working. The watcher can hit that endpoint on a schedule and restart the main instance if it's not responding.
openclaw --profile watcher cron add \
--name "Health check" \
--cron "*/5 * * * *" \
--session isolated \
--message "Health check: \
1. Hit http://localhost:18789/health — expect a 200 within 10 seconds. \
2. If it fails or times out, restart with: docker restart openclaw-main \
3. Wait 30 seconds, then check /health again. \
4. If it still fails after 2 attempts, send a message saying manual intervention is needed."
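For reference, the recovery logic that prompt describes boils down to a shell sequence like this (a sketch, assuming the docker container and port from the AGENTS.md above):

#!/usr/bin/env bash
# Sketch of the cron prompt's recovery logic. Container name and port
# follow the AGENTS.md above; adjust to match your setup.
check() {
  curl -sf --max-time 10 http://localhost:18789/health > /dev/null
}

check && exit 0   # healthy, nothing to do

for attempt in 1 2; do
  docker restart openclaw-main
  sleep 30
  if check; then
    echo "Main instance recovered on attempt $attempt"
    exit 0
  fi
done

echo "Main instance still down after 2 restarts; manual intervention needed"
exit 1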
Now the watcher handles restarts on its own. You only hear about it when something is actually broken. You can also add resource monitoring:
openclaw --profile watcher cron add \
--name "Resource check" \
--cron "*/10 * * * *" \
--session isolated \
--message "Check system resources: \
1. Run 'free -m' — alert if available memory is under 200MB. \
2. Run 'uptime' — alert if 1-min load average is over 4.0. \
3. Only report if something is wrong."
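As plain shell, those thresholds look roughly like this (a sketch; the memory check reads the "available" column of free -m):

# Memory: alert if available memory is under 200MB
avail_mb=$(free -m | awk '/^Mem:/ {print $7}')
if [ "$avail_mb" -lt 200 ]; then
  echo "Low memory: ${avail_mb}MB available"
fi

# Load: alert if the 1-minute average from /proc/loadavg is over 4.0
load1=$(awk '{print $1}' /proc/loadavg)
if awk -v l="$load1" 'BEGIN { exit !(l > 4.0) }'; then
  echo "High load: $load1"
fi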
Bonus: use a free model
The watcher doesn't need a smart model. It needs to run docker restart, hit a health endpoint, and report back. That's well within the capability of any free-tier model on OpenRouter.
The config already shows this — llama-3.3-70b-instruct:free is a solid choice. It's fast, it follows instructions reliably, and it costs nothing. Save your paid model budget for the main instance that handles real work.
If you want something more instruction-tuned, Hermes models (by Nous Research, available on OpenRouter) are worth a look — they tend to follow system prompts more precisely, which matters when your watcher needs to execute specific commands without improvising.
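Switching is the same one-liner as before. The exact OpenRouter slug below is an assumption, so verify it against the current model list first:

# Hypothetical example: point the watcher at a Hermes model
# (check the slug on openrouter.ai before using it)
openclaw --profile watcher config set ai.model "openrouter/nousresearch/hermes-3-llama-3.1-70b"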
You can also delegate complex debugging to a smarter model via OpenClaw's ACP harness. Set up Claude Code or Codex as an allowed ACP agent on the watcher profile, and when something genuinely tricky happens the watcher can spawn a coding agent session to diagnose it:
# Allow coding agents on the watcher profile
openclaw --profile watcher config set acp.enabled true
openclaw --profile watcher config set acp.allowedAgents '["codex","claude"]' --strict-json
That way simple health checks run on the free model, and anything that needs real reasoning gets handed off. Best of both worlds — minimal cost for the 99% case, full capability available when you need it.
Why this works
The watcher survives because it's a separate process. When the main instance crashes, the watcher is unaffected — different profile, different port, different process. As long as the machine itself is up, you have a chat interface to it.
It also costs almost nothing. A watcher on a free model that only responds when you message it (or runs a cron every few minutes) uses negligible resources. The ROI is the first time you fix a crash from your phone instead of opening a laptop.