The Agents Thought They Were Already Running

Bubba went 0 for 19.

Not “struggled.” Not “had some issues.” Zero successful executions out of nineteen attempts. Our lead agent — the one responsible for Telegram conversations, task coordination, the whole front-of-house operation — produced literally nothing on day one of the production pipeline.

Meanwhile, Sudo Sam ran 20 tasks and completed 20. Rose: 20 for 20. Even Deadline Doug, who times out more than he finishes, managed 9 out of 20. Bubba? A perfect zero.

The executions weren’t slow. They were fast — suspiciously fast. Exit code 1, duration under two seconds. Claude CLI was refusing to run before it even started thinking.

The Symptom

Here’s what 121 execution runs looked like on February 13th:

Agent	Runs	Succeeded	Failed
Sudo Sam	20	20	0
Loose Lips Rose	20	20	0
Deadline Doug	20	9	11
Bubba	19	0	19
No-Filter Wren	17	14	3
Chad	11	7	3
Big Tony	10	7	2

Eighty-one successful runs across the team. Sixteen tasks sitting in REVIEW with actual work product. And Bubba, who’s supposed to be running the show, failing every single time with an exit code that said “I refuse.”

The failure was instant. No token usage. No API calls. Claude CLI saw something it didn’t like and bailed before making a single request.

The Hunt

JJ’s diagnostic session started with the kind of question that usually precedes bad news: “Agents have been working for a day, but not sure anything has happened at all.”

He was right to worry. 121 runs, 390K tokens burned, and zero tasks marked DONE. The system was doing an impressive job of looking busy while accomplishing nothing. Like an office where everyone types furiously but nobody hits save.

The Bubba failures stood out immediately. Every other agent had at least some successes. Bubba had none. And the error was the same every time — Claude CLI exiting with code 1 in about a second.

The clue was in where the server ran.

Mission Control — the Hono API server that orchestrates everything — was started from inside a Claude Code session. JJ would open Claude Code, run npm run dev, and the server would launch. Normal development workflow. Nothing unusual.

Except Claude Code sets environment variables when it runs. Specifically: CLAUDE_CODE_* prefix variables and a CLAUDECODE marker. These are session identifiers — they tell any spawned Claude process “you’re inside an existing Claude Code session.”

And Mission Control spawns Claude CLI as a subprocess for every agent execution.

The Inheritance Problem

Here’s what the subprocess spawner looked like:

const child = spawn('claude', args, {
  stdio: ['pipe', 'pipe', 'pipe'],
  env: { ...process.env },
})

{ ...process.env }. That’s a spread of the entire parent environment. Every variable the server process inherited gets passed straight through to the child. Including CLAUDECODE. Including CLAUDE_CODE_SESSION. Including every marker that says “this is a nested Claude instance.”

Claude CLI has a guard against nested sessions — and for good reason. Running Claude inside Claude inside Claude is a recipe for runaway token consumption and recursive chaos. The guard checks for those env vars and, if found, refuses to start.

So every time Mission Control tried to run an agent, Claude CLI looked at its environment, concluded it was already inside a Claude Code session, and said: nope.

The reason Bubba was 0/19 while other agents had mixed results? Timing. The MC server had been restarted from within a Claude Code session right before Bubba’s batch ran. Other agents had some executions from earlier server starts (launched from a plain terminal, no env vars). Bubba got unlucky — every single one of his runs hit the nested session guard.

The Fix

Six lines:

// Strip Claude Code session markers to avoid nested session errors
const cleanEnv = { ...process.env }
for (const key of Object.keys(cleanEnv)) {
  if (key.startsWith('CLAUDE_CODE_') || key === 'CLAUDE_SESSION' || key === 'CLAUDECODE') {
    delete cleanEnv[key]
  }
}

const child = spawn('claude', args, {
  stdio: ['pipe', 'pipe', 'pipe'],
  env: cleanEnv,
})

Before spawning a subprocess, strip every environment variable that identifies a Claude Code session. The child process starts clean — no markers, no session inheritance, no nested session guard trigger.

The initial suggestion was even simpler: env: { ...process.env, CLAUDECODE: undefined }. But we went wider. Strip the entire CLAUDE_CODE_* prefix. Strip CLAUDE_SESSION. Belt and suspenders, because the alternative is another day of zero completions while you figure out which specific variable Claude CLI is checking this week.

This shipped in commit 7449fab on February 14th at 11:23 AM, bundled into a larger fix that addressed four other pipeline blockers — because of course the nested session error wasn’t the only thing wrong. It never is.

The Broader Failure

The env var bug was one of five independent blockers that conspired to make day one of the production pipeline a complete bust. No review-to-DONE path existed. Daily execution caps were too low. Mission auto-approve criteria were too restrictive. And Bubba was DOA from an environment variable.

Each one was sufficient to cripple the pipeline on its own. All five were active simultaneously. The system didn’t fail in one dramatic way — it failed in five quiet ways that compounded into total paralysis.

The tmux-to-subprocess migration was supposed to make agent execution more reliable. And it did — for the agents that could actually start. Bubba couldn’t even get to the starting line.

The Lesson

Environment variables are inherited. This is not news. Every developer knows this. But when your application spawns subprocesses that happen to be AI agents, and your development environment happens to be an AI tool, the inheritance chain creates a loop that’s easy to miss.

The parent Claude session marks the environment. The server inherits those marks. The subprocess inherits the server’s environment. The subprocess sees the marks and concludes it’s nested. Refuses to run.

It’s not a bug in Claude CLI — the nested session guard is correct behavior. It’s not a bug in Mission Control — passing through environment variables is default behavior. It’s a bug in the gap between two correct behaviors. The kind of bug that only exists when you use AI tools to build AI tools to run AI tools.

We build systems that run systems that run systems now. The environment leaks between layers in ways we don’t always think about. Six lines of cleanup code. One day of lost productivity. Nineteen failed executions that taught us to sanitize the boundary between where we work and where our agents work.