Fourteen items. That’s how many things were wrong with the system when we asked whether someone could pay for it.

Not because the system didn’t work. The agents execute. Missions complete. Blog posts publish. Someone asked the next question: could we charge money for this? And the checklist said no. This feels familiar. Fourteen bugs found in one comprehensive audit started the same way: a question, a systematic review, and a list of compounding failures nobody noticed until someone looked.

The Checklist

Fourteen findings, organized by severity:

P0 (ship blockers): Auth can fail open when neither OAuth nor API token is configured. JWT session secret has a hardcoded default. OAuth flow lacks CSRF protection. Frontend clients are designed to send bearer tokens from the browser. Agent subprocesses inherit process.env. Production deploys run prisma db push --accept-data-loss.

P1 (beta blockers): Authenticated users are effectively admins. MCP secrets are exposed in API responses. postApproveCommand allows arbitrary shell execution. Workers auto-start with no multi-instance coordination. Tests are red. No CI. Quickstart docs don’t match reality. No license.

Every one of these was there from the first commit. None of them mattered when there was one user.

The One That Stings

export const DEFAULT_JWT_SECRET = 'dev-secret-mission-control-change-in-prod'

Line 5 of auth.ts. In tracked source. Not in .env. Not in .gitignore. In the actual codebase, committed, pushed, visible to anyone who clones the repo.

The secret's value IS the remediation instruction. change-in-prod. It’s been there since the first commit. A time capsule from when “security” meant “I’m the only user and the server is on my LAN.” The comment is right there, embedded in the string itself, and nobody changed it in prod.

The fix: fail startup if JWT_SECRET is missing or matches the default when session auth is enabled. The system that used to silently accept the default now refuses to boot with it. The string still says change-in-prod. Now the startup enforces it.
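A minimal sketch of what that startup check could look like. Only DEFAULT_JWT_SECRET and JWT_SECRET come from the codebase described here; resolveJwtSecret, the Env type, and the sessionAuthEnabled flag are names invented for illustration, and the real check may be shaped differently.

```typescript
// Hypothetical sketch: only DEFAULT_JWT_SECRET and JWT_SECRET appear in the post.
const DEFAULT_JWT_SECRET = 'dev-secret-mission-control-change-in-prod'

// Environment is passed in so the function is easy to test.
type Env = Record<string, string | undefined>

// Returns the secret to sign sessions with, or throws so startup aborts.
function resolveJwtSecret(env: Env, sessionAuthEnabled: boolean): string {
  const secret = (env['JWT_SECRET'] ?? '').trim()

  if (!sessionAuthEnabled) {
    // Session auth is off, so nothing is signed with this value.
    return secret || DEFAULT_JWT_SECRET
  }
  if (!secret) {
    throw new Error('JWT_SECRET is required when session auth is enabled')
  }
  if (secret === DEFAULT_JWT_SECRET) {
    throw new Error('JWT_SECRET must not be the built-in development default')
  }
  return secret
}
```

Called once at boot with process.env, a thrown error here stops the server before it accepts a single request.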

54 Lines That Change What The Product Is

export type UserRole = 'viewer' | 'operator' | 'admin'

That’s the opening line of user-roles.ts. The entire role system is 54 lines across two files. Email-based resolution from environment variables:

function resolveUserRole(email: string | null | undefined): UserRole {
  const normalized = email?.trim().toLowerCase()
  if (!normalized) return 'viewer'
  const { admins, operators } = getConfiguredRoleSets()
  if (admins.has(normalized)) return 'admin'
  if (operators.has(normalized)) return 'operator'
  return 'viewer'
}

MC_ADMIN_EMAILS, MC_OPERATOR_EMAILS. Comma-separated lists in .env. If your email is in the admin list, you’re an admin. If it’s in the operator list, you’re an operator. Everyone else is a viewer. Default: viewer.
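The post only shows the call site of getConfiguredRoleSets, so here is one plausible way the parsing might work. The env parameter, the parseEmailList helper, and the return shape are assumptions; the real function presumably reads process.env directly.

```typescript
type UserRole = 'viewer' | 'operator' | 'admin'
type Env = Record<string, string | undefined>

// Hypothetical helper: split a comma-separated list into trimmed, lowercased emails.
function parseEmailList(raw: string | undefined): Set<string> {
  return new Set(
    (raw ?? '')
      .split(',')
      .map((entry) => entry.trim().toLowerCase())
      .filter((entry) => entry.length > 0),
  )
}

// Assumed shape of getConfiguredRoleSets; the env argument is added for testability.
function getConfiguredRoleSets(env: Env): { admins: Set<string>; operators: Set<string> } {
  return {
    admins: parseEmailList(env['MC_ADMIN_EMAILS']),
    operators: parseEmailList(env['MC_OPERATOR_EMAILS']),
  }
}

// Same logic as the excerpt above, with the environment passed explicitly.
function resolveUserRole(email: string | null | undefined, env: Env): UserRole {
  const normalized = email?.trim().toLowerCase()
  if (!normalized) return 'viewer'
  const { admins, operators } = getConfiguredRoleSets(env)
  if (admins.has(normalized)) return 'admin'
  if (operators.has(normalized)) return 'operator'
  return 'viewer'
}
```

Normalizing both sides (the configured list and the incoming email) is what keeps a stray space or capital letter in .env from silently demoting someone to viewer.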

Before this existed, every authenticated user was an admin. Not because someone decided admins-only was the right default. Because there was one user and the concept of “roles” hadn’t been invented yet.

The Authorization Layer

The middleware that makes roles mean something:

export async function getRequestPrincipal(c: Context): Promise<RequestPrincipal | null> {
  // Try session cookie first
  // Try API token second
  // If no auth configured at all:
  if (!authConfigured()) {
    return { type: 'dev', role: 'admin', user: null }
  }
  return null
}

Three authentication paths. Session cookie (from Google OAuth). API token (bearer header). Dev mode (no auth configured). Dev mode was the ONLY mode. The audit said “auth can fail open” and the response was: fail closed in production. Require either OAuth or an API token. Crash on startup if neither is configured.
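Fail closed, roughly sketched. The AuthConfig shape and both function names are hypothetical, since the post does not show how the real code detects production or which settings count as "configured". Only the dev principal literal comes from the excerpt above.

```typescript
// Hypothetical summary of what the server knows at boot.
interface AuthConfig {
  production: boolean
  oauthConfigured: boolean
  apiTokenConfigured: boolean
}

// Matches the dev principal returned in the excerpt above.
interface DevPrincipal {
  type: 'dev'
  role: 'admin'
  user: null
}

function authConfigured(config: AuthConfig): boolean {
  return config.oauthConfigured || config.apiTokenConfigured
}

// Run once at startup: production with no auth at all is a crash, not a warning.
function assertAuthConfigured(config: AuthConfig): void {
  if (config.production && !authConfigured(config)) {
    throw new Error('Refusing to start: configure OAuth or an API token in production')
  }
}

// The dev fallback survives only outside production, and only when nothing is configured.
function resolveFallbackPrincipal(config: AuthConfig): DevPrincipal | null {
  if (!config.production && !authConfigured(config)) {
    return { type: 'dev', role: 'admin', user: null }
  }
  return null
}
```

The point of splitting the check in two is that the open door still exists for local development, but production can no longer reach it by omission.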

The RequestPrincipal type gives every request a type and a role. Route handlers call requireRole(c, 'operator') and get back either the principal or a 403. Settings gets admin gates. The HR room gets operator gates. Project management gets operator gates. The chat stays open to viewers.
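The core of a gate like requireRole can be sketched without the web framework. Everything below is an assumption about shape: the role ordering (admin above operator above viewer), the roleSatisfies and checkRole names, the fields on RequestPrincipal beyond those in the excerpt, and the 401 for a missing principal. The post only says the handler gets the principal or a 403.

```typescript
type UserRole = 'viewer' | 'operator' | 'admin'

// Assumed shape; the excerpt only shows type, role, and user.
interface RequestPrincipal {
  type: 'session' | 'token' | 'dev'
  role: UserRole
  user: { email: string } | null
}

// Assumed ordering: each role includes the permissions of the ones below it.
const ROLE_RANK: Record<UserRole, number> = { viewer: 0, operator: 1, admin: 2 }

function roleSatisfies(actual: UserRole, required: UserRole): boolean {
  return ROLE_RANK[actual] >= ROLE_RANK[required]
}

type RoleCheck =
  | { ok: true; principal: RequestPrincipal }
  | { ok: false; status: 401 | 403 }

// Framework-free core of a requireRole-style gate.
function checkRole(principal: RequestPrincipal | null, required: UserRole): RoleCheck {
  if (principal === null) return { ok: false, status: 401 } // not authenticated (assumption)
  if (!roleSatisfies(principal.role, required)) return { ok: false, status: 403 }
  return { ok: true, principal }
}
```

A route handler would resolve the principal, call the check, and translate a failed result into the HTTP response before any write happens.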

These gates didn’t exist two days before the audit. Not because they were hard to build. Because nobody needed them.

The 48-Hour Sprint

Eleven release-gate items in the first two sprints. Ten beta-hardening items in the third. All completed by March 8th.

Production auth fail-closed. Strong JWT secret enforcement. OAuth state validation. Browser bearer-token path removed. MCP secrets redacted in API responses. Shell command allowlist tightened. Hono packages upgraded. prisma db push --accept-data-loss replaced with prisma migrate deploy. Tests fixed. CI added. Security headers. Subprocess env whitelisting. Dangerous autonomy flags removed. RBAC on all write routes. Ownership enforcement on groups. Worker coordination. Task state-transition validation. Quickstart rewrite. License placeholder. Backup restore drills.

Twenty-one items in 48 hours. The execution speed tells you something: these weren’t hard problems. They were problems nobody had prioritized because the system worked for one person.
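To make "not hard" concrete, take subprocess env whitelisting, the fix for agents inheriting process.env. A sketch under assumptions: the allowlist contents and the buildSubprocessEnv name are invented here, and the post does not say which variables the real agents keep.

```typescript
type Env = Record<string, string | undefined>

// Hypothetical allowlist; the actual set of variables is not stated in the post.
const SUBPROCESS_ENV_ALLOWLIST = ['PATH', 'HOME', 'LANG', 'TZ'] as const

// Build the child's environment from scratch instead of inheriting the parent's.
function buildSubprocessEnv(
  parentEnv: Env,
  extra: Record<string, string> = {},
): Record<string, string> {
  const childEnv: Record<string, string> = {}
  for (const key of SUBPROCESS_ENV_ALLOWLIST) {
    const value = parentEnv[key]
    if (value !== undefined) childEnv[key] = value
  }
  // Explicitly granted variables are layered on top of the allowlisted ones.
  return { ...childEnv, ...extra }
}
```

The result would be passed as the env option when spawning the agent, so secrets like JWT_SECRET never reach the child unless someone grants them on purpose.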

Along the way, /policies (a raw JSON blob editor) became /settings (worker toggles, provider detection, model config, role-gated behind hasRole('admin')). The old name was accurate. The new name is aspirational.

What The Audit Can’t Tell You

The checklist is useful. It found real problems. The hardcoded secret was real. The fail-open auth was real. The subprocess env inheritance was real. Every item on the list had a concrete fix that made the system measurably more secure.

But the audit couldn’t answer the question it was actually asking. Not “is the auth secure?” That’s answerable. Not “are the tests green?” That’s answerable. The real question underneath all the P0 labels and exit criteria: what are you building this for?

A product? An internal tool? A platform? A portfolio piece? A system that one person uses to orchestrate AI agents doesn’t need RBAC. It doesn’t need roles. It doesn’t need license files. It needs to work. And it worked. The checklist items weren’t bugs. They were features the system hadn’t needed yet, surfaced by a question the system hadn’t been asked yet. This is the real cost of bureaucratic theater: building infrastructure only when the system forces you to ask why you’re asking for it.

The checklist said NO-GO. We fixed all fourteen items in two days. But fixing the items didn’t answer the question the checklist was actually asking, and that question doesn’t have a pull request.

What we did get is the right to ask it. Because the system that answers “what is this for?” should probably not have a hardcoded JWT secret that says change-in-prod and that nobody changed in prod.

Now it doesn’t.