Background agents and Ralph loops: when they pay off, when they burn tokens

Background agents (running in parallel with my session) and Ralph loops (autonomous loops that pace themselves from a prompt) are two tools that can deliver 10x productivity. Or 10x cost. The line between the two is thin; here is where I see it.

Two autonomy modes

Background agent: you spawn an Agent with run_in_background: true. It returns asynchronously while you do something else in the main session. Classic use cases: research, long builds, batch generation.

Ralph loop: the agent decides when to come back, using ScheduleWakeup or /loop. It keeps looping toward a stated goal until that goal is met or you say "stop".

The first is "fire and forget", the second is "fire and live with it".

Background agent: three real cases

1. Parallel research

I have 3 questions about the codebase. I spawn 3 explore agents in parallel, each looking for something different. Meanwhile I write the plan for the main task.

// Pseudocode, runs in the main session
spawn Agent({ description: "Find auth middleware", subagent_type: "Explore", run_in_background: true })
spawn Agent({ description: "Map RBAC roles", subagent_type: "Explore", run_in_background: true })
spawn Agent({ description: "Audit token storage", subagent_type: "Explore", run_in_background: true })
// All 3 return in about 3 minutes; I wait 3 minutes instead of 9

Win: wall-clock time is parallelized. Cost: roughly 3× the tokens, because each agent loads its own context.
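The trade-off is easy to put into numbers. A minimal sketch, assuming the same agents are used either way and that each one's duration and cost are known up front (the function name is mine, and the figures are illustrative):

```python
def parallel_vs_serial(durations_min, cost_per_agent_usd):
    """Compare running agents one after another with running them in parallel.

    Wall time: serial is the sum of the durations, parallel is the slowest agent.
    Cost: identical in both cases, because every agent loads its own context.
    """
    serial_wall = sum(durations_min)
    parallel_wall = max(durations_min)
    total_cost = cost_per_agent_usd * len(durations_min)
    return serial_wall, parallel_wall, total_cost


# Three explore agents, 3 minutes each, $0.30 per agent.
serial, parallel, cost = parallel_vs_serial([3, 3, 3], 0.30)
print(f"serial {serial} min, parallel {parallel} min, cost ${cost:.2f}")
# serial 9 min, parallel 3 min, cost $0.90
```

Parallelism buys wall-clock time only; the bill is the same sum either way.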

2. Waiting on something slow

A build takes 8 minutes. I spawn an agent that "watches the build and returns a verdict". Meanwhile I write more changes in the main session. When the build ends, I get a notification with the result.

Key: the agent doesn't sit in a sleep loop. It uses the Monitor tool, which streams events, so no tokens are burned while it is idle.

3. Batch generation

I need 50 product descriptions for a shop. A background agent generates all 50 in a loop while I build the shop UI in the main session. When it's done, I merge.

Trap: quality degrades on long loops. After ~30 items, repetitions creep in. Rule: at most 30 items per batch; split anything larger across separate sessions.
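Splitting the work list before handing it off is a few lines. A sketch, assuming the items sit in a plain list (the helper name and the placeholder product names are made up for illustration):

```python
def split_into_batches(items, max_batch_size=30):
    """Yield consecutive slices of at most max_batch_size items."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(items), max_batch_size):
        yield items[start:start + max_batch_size]


# 50 placeholder product names, split so that no batch exceeds 30 items.
products = [f"product-{i}" for i in range(1, 51)]
batches = list(split_into_batches(products))
print([len(batch) for batch in batches])  # [30, 20]
```

Each resulting batch would then go to its own session.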

Ralph loop: two cases that work

1. Status polling

"Every 5 minutes check deploy status, ping me when done." That's a loop started with /loop 5m. It stops itself once the deploy reports success.

Works because: the goal is binary, easy to verify, each iteration is cheap.
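The shape of such a loop, sketched in plain Python rather than as a /loop prompt. The max_polls cap is my own addition as a safety net, and check() stands in for whatever actually reads the deploy status:

```python
import time


def poll_until(check, interval_s, max_polls, sleep=time.sleep):
    """Call check() every interval_s seconds until it returns True.

    Returns the number of polls used, or None if max_polls ran out first.
    """
    for attempt in range(1, max_polls + 1):
        if check():
            return attempt
        if attempt < max_polls:
            sleep(interval_s)
    return None


# Simulated deploy that succeeds on the third check; sleeping is stubbed out.
statuses = iter(["pending", "pending", "success"])
polls = poll_until(lambda: next(statuses) == "success", 300, 12, sleep=lambda s: None)
print(polls)  # 3
```

The stop condition is a single boolean, which is exactly what makes this kind of loop safe to leave alone.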

2. Babysitting a long process

I tell Ralph: "watch the tests; if something fails, run a fix-up loop, max 3 iterations". After 3 failed attempts it stops and calls me.

Works because: clear termination, bounded retry, the model doesn't try to "magically fix" forever.
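The bounded retry looks roughly like this. It is a sketch of the control flow only: run_tests and attempt_fix are hypothetical callables standing in for the agent's actual work.

```python
def bounded_fix_loop(run_tests, attempt_fix, max_iterations=3):
    """Run tests; on failure apply a fix and re-run, at most max_iterations times.

    Returns ("passed", fixes_used) or ("escalate", fixes_used).
    """
    fixes_used = 0
    while True:
        if run_tests():
            return "passed", fixes_used
        if fixes_used == max_iterations:
            # Out of attempts: stop and hand the problem back to a human.
            return "escalate", fixes_used
        attempt_fix()
        fixes_used += 1


# Stub: the tests start passing once two fixes have been applied.
state = {"fixes": 0}


def fake_tests():
    return state["fixes"] >= 2


def fake_fix():
    state["fixes"] += 1


print(bounded_fix_loop(fake_tests, fake_fix))         # ('passed', 2)
print(bounded_fix_loop(lambda: False, lambda: None))  # ('escalate', 3)
```

The second call shows the important path: when nothing helps, the loop ends after three attempts instead of running forever.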

What Ralph does NOT do

Three things where Ralph regularly fails:

1. "Write code until it's good." No clear goal = infinite loop. Token cost grows linearly, quality doesn't.

2. Open-ended "what could you do" exploration. The model drifts across different tracks, each iteration starts elsewhere. No value.

3. Anything that needs creative judgment. "Write a better version of this post." What does "better" mean? Without external feedback the model orbits.

The rule I use

Before turning on Ralph or a background agent I ask 3 questions:

Is there a clear stop condition? ("compiles" yes, "looks good" no)
Is each iteration cheap? (< $0.50 per iteration, < 30 sec)
Would doing it myself take longer than 15 minutes? If not, I do it myself; less overhead.

If any answer is "no", I don't turn it on.
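The three questions reduce to one boolean check: delegate only when there is a stop condition, each iteration is cheap, and doing the job by hand would take longer than 15 minutes. A sketch using the thresholds above (the function and parameter names are mine):

```python
def should_delegate(has_stop_condition, cost_per_iter_usd, secs_per_iter, manual_minutes):
    """Return True only when all three checks pass."""
    cheap_iteration = cost_per_iter_usd < 0.50 and secs_per_iter < 30
    slower_by_hand = manual_minutes > 15
    return has_stop_condition and cheap_iteration and slower_by_hand


# Deploy polling: binary goal, cheap checks, would eat 30 minutes of my attention.
print(should_delegate(True, 0.01, 5, 30))    # True
# "Improve the code": no stop condition, so nothing else matters.
print(should_delegate(False, 0.25, 20, 60))  # False
# A 10-minute manual fix: faster to do it myself.
print(should_delegate(True, 0.10, 10, 10))   # False
```

The per-iteration figures in the example calls are invented; only the thresholds come from the rule.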

Token cost: the real bill

From my billing:

Use case	Time	Cost
1 background explore	2 min	$0.30
3 parallel explores	3 min wall	$0.90 ($0.30 × 3)
Ralph polling 5m, 30 min total	30 min	~$0.20
Ralph "fix tests" 3 iters	8 min	~$1.50
Ralph open-ended (no stop condition)	60 min	$15+ ❌

The last row is my mistake from March. I turned Ralph on to "improve the code", spawned a child for 60 minutes, and got a $20 bill for nothing. Hence the stop-condition rule.
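One way to cap the damage when a stop condition turns out to be missing is a hard spending limit around the loop. A sketch, assuming each iteration can report its own cost; step, budget_usd, and the $0.25 figure are hypothetical, and this is not a built-in feature of /loop:

```python
def run_with_budget(step, budget_usd, max_iterations=1000):
    """Call step() until it reports done or cumulative spend reaches budget_usd.

    step() must return (done, cost_usd) for the iteration it just ran.
    Returns (reason, iterations, spent_usd).
    """
    spent = 0.0
    for iteration in range(1, max_iterations + 1):
        done, cost = step()
        spent += cost
        if done:
            return "done", iteration, spent
        if spent >= budget_usd:
            return "budget_exhausted", iteration, spent
    return "max_iterations", max_iterations, spent


# An open-ended task never reports done; assume $0.25 per iteration.
reason, iterations, spent = run_with_budget(lambda: (False, 0.25), budget_usd=2.00)
print(reason, iterations, f"${spent:.2f}")  # budget_exhausted 8 $2.00
```

A cap like this does not make an open-ended task useful, it only bounds what the mistake costs.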

Technical setup

Spawning a background agent:

Agent({
  description: "Audit deps for CVE",
  subagent_type: "general-purpose",
  prompt: "Run npm audit and pip-audit across ~/projects/*. ...",
  run_in_background: true
})

Ralph loop via /loop:

/loop /check-deploy-status
# or dynamic:
/loop
> Check every few minutes whether CI passed on PR #123. Stop when green.

Stopping: I can always interrupt with Ctrl+C in the main session or via /loop-stop.

TL;DR

Background agents: great for parallelizing independent work. Give each one a clear goal and a bounded scope.

Ralph loop: great for polling and bounded fix-ups. Bad for creative work and open-ended tasks.

Cost rule: stop condition + cheap iteration + can't-do-it-myself-faster. 3/3, turn it on. Anything less, do it yourself.

Agent autonomy is a continuum, not a binary. Less autonomy = fewer model mistakes but more of your time. More autonomy = potentially faster, but the risk of runaway token spend grows. Choose deliberately, measure cost, have a stop condition.