Background agents and Ralph loops: when they pay off, when they burn tokens
Background agents and Ralph-style loops are powerful, but easy to turn into a token furnace. Here are my three real cases and the rule I use to decide whether to turn them on at all.

Background agents (running in parallel with my session) and Ralph loops (autonomous loops that pace themselves from a prompt) are two tools that can deliver 10x productivity. Or 10x cost. The line between the two is thin; here's where I see it.
Two autonomy modes
Background agent: you spawn an Agent with run_in_background: true. It returns asynchronously while you do something else in the main session. Classic use cases: research, long builds, batch generation.
Ralph loop: the agent decides when to come back, using ScheduleWakeup or /loop. It runs in a loop with a stated goal until that goal is met or you say "stop".
The first is "fire and forget", the second is "fire and live with it".
Background agent: three real cases
1. Parallel research
I have 3 questions about the codebase. I spawn 3 explore agents in parallel, each looking for something different. Meanwhile I write the plan for the main task.
// Pseudocode, runs in the main session
spawn Agent({ description: "Find auth middleware", subagent_type: "Explore" })
spawn Agent({ description: "Map RBAC roles", subagent_type: "Explore" })
spawn Agent({ description: "Audit token storage", subagent_type: "Explore" })
// All 3 return in 3 minutes; I wait 3 instead of 9
Win: time parallelized. Cost: token cost × 3, because each agent loads its own context.
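The trade-off can be written down as arithmetic: parallel wall time is the slowest agent, while cost is the sum of all of them. A minimal sketch, assuming each explore takes about 3 minutes and costs about $0.30 (the figures used in this post); the `AgentRun` shape and the function names are invented for illustration and are not part of any agent API.

```typescript
// Hypothetical record: one entry per background explore agent.
interface AgentRun {
  name: string;
  minutes: number; // how long the agent takes on its own
  costUsd: number; // what that agent's tokens cost
}

// Parallel runs finish when the slowest agent finishes; sequential runs add up.
function wallClockMinutes(runs: AgentRun[], parallel: boolean): number {
  const durations = runs.map((r) => r.minutes);
  return parallel
    ? Math.max(0, ...durations)
    : durations.reduce((sum, m) => sum + m, 0);
}

// Cost is the same either way: every agent loads its own context.
function totalCostUsd(runs: AgentRun[]): number {
  return runs.reduce((sum, r) => sum + r.costUsd, 0);
}

const explores: AgentRun[] = [
  { name: "Find auth middleware", minutes: 3, costUsd: 0.3 },
  { name: "Map RBAC roles", minutes: 3, costUsd: 0.3 },
  { name: "Audit token storage", minutes: 3, costUsd: 0.3 },
];

console.log(wallClockMinutes(explores, true));  // 3
console.log(wallClockMinutes(explores, false)); // 9
console.log(totalCostUsd(explores).toFixed(2)); // 0.90
```

Parallelism buys back wall-clock time only; the bill is identical to running the three explores one after another.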
2. Waiting on something slow
A build takes 8 minutes. I spawn an agent that "watches the build and returns a verdict". Meanwhile I write more changes in the main session. When the build ends, I get a notification with the result.
Key: the agent doesn't sleep. It uses the Monitor tool, which streams events, so while it sits idle no tokens are burned.
3. Batch generation
I have 50 product descriptions for a shop. A background agent generates all 50 in a loop. I build the shop UI in the main session. When done, I merge.
Trap: quality degrades on long loops. After ~30 items, repetitions creep in. Rule: cap batches at 30 items and split anything larger into separate sessions.
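Splitting the work is mechanical. A small sketch of the batching step, assuming the 30-item cap from the rule above; `splitIntoBatches` and the placeholder product IDs are hypothetical, and how each batch gets handed to its own session is left out.

```typescript
// Split a list into consecutive batches of at most `maxSize` items.
function splitIntoBatches<T>(items: T[], maxSize: number): T[][] {
  if (maxSize < 1 || Math.floor(maxSize) !== maxSize) {
    throw new Error("maxSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += maxSize) {
    batches.push(items.slice(start, start + maxSize));
  }
  return batches;
}

// 50 placeholder product IDs standing in for the real catalogue.
const productIds: string[] = [];
for (let i = 1; i <= 50; i++) {
  productIds.push("product-" + i);
}

const batches = splitIntoBatches(productIds, 30);
console.log(batches.map((b) => b.length)); // batch sizes: 30 and 20
```

Each batch would then go to a separate background agent or session, so no single run crosses the point where repetition starts.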
Ralph loop: two cases that work
1. Status polling
"Every 5 minutes check deploy status, ping me when done." A loop with /loop 5m. Stops itself when deploy = success.
Works because: the goal is binary, easy to verify, each iteration is cheap.
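The shape of such a loop is roughly this. A minimal sketch, not how /loop is implemented: `checkStatus` is a hypothetical stand-in for whatever reads the deploy state, the poll cap is an added safety assumption, and the 5-minute wait between polls is omitted so the example stays synchronous.

```typescript
type DeployStatus = "pending" | "success" | "failed";

interface PollResult {
  finalStatus: DeployStatus;
  polls: number;
  gaveUp: boolean; // true when the poll cap ran out while still pending
}

// Poll until the status is no longer "pending", or until maxPolls is reached.
function pollUntilDone(checkStatus: () => DeployStatus, maxPolls: number): PollResult {
  let status: DeployStatus = "pending";
  let polls = 0;
  while (polls < maxPolls) {
    status = checkStatus();
    polls += 1;
    if (status !== "pending") {
      // Binary, easy-to-verify goal: the loop ends itself here.
      return { finalStatus: status, polls, gaveUp: false };
    }
  }
  return { finalStatus: status, polls, gaveUp: true };
}

// Fake deploy that reports success on the third check.
let checks = 0;
const fakeCheck = (): DeployStatus => {
  checks += 1;
  return checks >= 3 ? "success" : "pending";
};

// 6 polls at 5-minute spacing would cover 30 minutes.
const result = pollUntilDone(fakeCheck, 6);
console.log(result.finalStatus, result.polls); // success 3
```

The loop body does almost nothing per iteration, which is what keeps each poll cheap.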
2. Babysitting a long process
I tell Ralph: "watch the tests, if something fails, fix-up loop, max 3 iterations". After 3 failed attempts it stops and calls me.
Works because: clear termination, bounded retry, the model doesn't try to "magically fix" forever.
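Bounded retry is easy to state precisely. A sketch under stated assumptions: `runTests` and `attemptFix` are hypothetical callbacks standing in for whatever the agent actually does, and the cap of 3 comes from the prompt above.

```typescript
interface FixUpOutcome {
  passed: boolean;
  fixAttempts: number;
  escalate: boolean; // true means: stop and call the human
}

// Run the tests; on failure try a fix, but never more than maxAttempts times.
function boundedFixUp(
  runTests: () => boolean,
  attemptFix: () => void,
  maxAttempts: number
): FixUpOutcome {
  let fixAttempts = 0;
  while (true) {
    if (runTests()) {
      return { passed: true, fixAttempts, escalate: false };
    }
    if (fixAttempts >= maxAttempts) {
      // Out of retries: hand control back instead of trying forever.
      return { passed: false, fixAttempts, escalate: true };
    }
    attemptFix();
    fixAttempts += 1;
  }
}

// Fake scenario: the tests go green after the second fix.
let fixesApplied = 0;
const outcome = boundedFixUp(
  () => fixesApplied >= 2,
  () => { fixesApplied += 1; },
  3
);
console.log(outcome.passed, outcome.fixAttempts, outcome.escalate); // true 2 false
```

The important branch is the escalation: when the third fix still leaves the tests red, the loop returns instead of starting a fourth attempt.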
What Ralph does NOT do
Three things where Ralph regularly fails:
1. "Write code until it's good." No clear goal = infinite loop. Token cost grows linearly, quality doesn't.
2. Open-ended "what could you do" exploration. The model drifts across different tracks, each iteration starts elsewhere. No value.
3. Anything that needs creative judgment. "Write a better version of this post." What does "better" mean? Without external feedback the model orbits.
The rule I use
Before turning on Ralph or a background agent I ask 3 questions:
- Is there a clear stop condition? ("compiles" yes, "looks good" no)
- Is each iteration cheap? (< $0.50 per iter, < 30 sec)
- Would doing this myself take more than 15 min? If not, do it yourself, less overhead.
If any answer is "no", don't turn it on.
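The checklist can be encoded as a single predicate. A minimal sketch using the thresholds from the list above; the `Task` shape and the example numbers are made up for illustration and are not measurements.

```typescript
interface Task {
  hasClearStopCondition: boolean; // "compiles" yes, "looks good" no
  costPerIterationUsd: number;
  secondsPerIteration: number;
  manualMinutes: number; // rough estimate for doing it by hand
}

// All three checks must hold before handing the task to a loop or background agent.
function shouldTurnItOn(task: Task): boolean {
  const cheapIteration =
    task.costPerIterationUsd < 0.5 && task.secondsPerIteration < 30;
  const slowerByHand = task.manualMinutes > 15;
  return task.hasClearStopCondition && cheapIteration && slowerByHand;
}

// Illustrative inputs only.
const deployPolling: Task = {
  hasClearStopCondition: true,
  costPerIterationUsd: 0.03,
  secondsPerIteration: 10,
  manualMinutes: 30,
};
const improveTheCode: Task = {
  hasClearStopCondition: false,
  costPerIterationUsd: 0.4,
  secondsPerIteration: 20,
  manualMinutes: 60,
};
const quickRename: Task = {
  hasClearStopCondition: true,
  costPerIterationUsd: 0.1,
  secondsPerIteration: 10,
  manualMinutes: 5,
};

console.log(shouldTurnItOn(deployPolling));  // true
console.log(shouldTurnItOn(improveTheCode)); // false
console.log(shouldTurnItOn(quickRename));    // false
```

The second case fails on the stop condition alone, and the third fails only because doing it by hand is faster than setting up the agent.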
Token cost: the real bill
From my billing:
| Use case | Time | Cost |
|---|---|---|
| 1 background explore | 2 min | $0.30 |
| 3 parallel explores | 3 min wall | $0.90 ($0.30 × 3) |
| Ralph polling 5m, 30 min total | 30 min | ~$0.20 |
| Ralph "fix tests" 3 iters | 8 min | ~$1.50 |
| Ralph open-ended (no stop condition) | 60 min | $15+ ❌ |
The last row is my mistake from March. I turned Ralph on to "improve the code", spawned a child agent for 60 minutes, and got a $20 bill for nothing. Hence the stop-condition rule.
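One way to keep an open-ended run from growing like that is a hard spend cap alongside the goal check. A sketch only: `runIteration` is a hypothetical callback that reports what one pass cost and whether the goal was met, and the $0.50 per pass and $2 budget are invented figures, not numbers from the table.

```typescript
interface LoopReport {
  iterations: number;
  spentUsd: number;
  stoppedBy: "goal-met" | "budget";
}

// Keep iterating until the goal is met or the budget is used up.
// The check runs before each pass, so spend can overshoot by at most one pass.
function runWithBudget(
  runIteration: () => { costUsd: number; goalMet: boolean },
  budgetUsd: number
): LoopReport {
  let spentUsd = 0;
  let iterations = 0;
  while (spentUsd < budgetUsd) {
    const step = runIteration();
    spentUsd += step.costUsd;
    iterations += 1;
    if (step.goalMet) {
      return { iterations, spentUsd, stoppedBy: "goal-met" };
    }
  }
  return { iterations, spentUsd, stoppedBy: "budget" };
}

// An open-ended "improve the code" loop never reports goalMet.
const openEnded = runWithBudget(() => ({ costUsd: 0.5, goalMet: false }), 2);
console.log(openEnded.iterations, openEnded.spentUsd, openEnded.stoppedBy); // 4 2 budget
```

A budget cap is a backstop, not a replacement for a real stop condition: it limits the damage, but the four passes it allowed here still produced nothing.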
Technical setup
Spawning a background agent:
Agent({
  description: "Audit deps for CVE",
  subagent_type: "general-purpose",
  prompt: "Run npm audit and pip-audit across ~/projects/*. ...",
  run_in_background: true
})
Ralph loop via /loop:
/loop /check-deploy-status
# or dynamic:
/loop
> Check every few minutes whether CI passed on PR #123. Stop when green.
Stopping it: I can always interrupt with Ctrl+C in the main session or via /loop-stop.
TL;DR
Background agents: great for parallelizing independent work. Each with a clear goal, bounded scope.
Ralph loop: great for polling and bounded fix-ups. Bad for creative and open-ended work.
Cost rule: stop condition + cheap iteration + can't-do-it-myself-faster. 3/3, turn it on. Less, do it yourself.
Agent autonomy is a continuum, not binary. Less autonomy = fewer model mistakes but more of your time. More autonomy = potentially faster but the risk of token spend grows. Choose deliberately, measure cost, have a stop condition.