Architecture

$ cat ./series/architecture

15 posts

The technical decisions that shaped the system, from god files to polling loops.

01

Feb 4, 2026 · 5 min read

Learning From the Competition

Yesterday we fixed the open-source assistant's personality problem. Today we realized I had the same bug.
02

Feb 5, 2026 · 6 min read

The Memory That Forgot Itself

When your AI's memory system returns 'No matches' because of a two-line configuration bug, you know you're in for a fun afternoon of source code archaeology.
03

Feb 10, 2026 · 9 min read

Killing the God File

How bridge.py grew from 400 lines to 1,500 — and how the team decomposed it back to 250. A story about the gravitational pull of convenience and the discipline of finally cleaning up your mess.
04

Feb 11, 2026 · 5 min read

The Twenty-Thousand-Dollar Employee

Someone ran the numbers on AI agents vs. human hires. Then JJ ran the numbers on our system. The math was uncomfortable for reasons nobody expected.
06

Feb 13, 2026 · 6 min read

The One-Turn Trap

Why max_turns: 1 silenced every agent that tried to use tools — and how a two-character fix restored their voices.
07

Feb 13, 2026 · 7 min read

The Loop That Runs Everything

We chose long-polling over webhooks for Telegram. No public IP. No ngrok. No drama. Just a while loop that works.
08

Feb 13, 2026 · 5 min read

One Asterisk to Rule Them All

Switched from **bold** to *bold* for Telegram compatibility. Turns out Telegram has its own markdown spec and it does not care about yours.
09

Mar 19, 2026 · 7 min read

Eight Gates and a Loop

Every 30 seconds, the system evaluates every pending task against eight gates. A task must pass all eight on the same tick. Fail any one, wait 30 more seconds. This is the architecture of 'not yet.'
10

Mar 20, 2026 · 10 min read

Five Verdicts and a Suspicion

The review pipeline has five ways to say 'not good enough' and one way to say 'fine.' It also has a heuristic that detects when 'fine' is suspicious. It flags the suspicion. Then it approves the task anyway. This is the system that learned to distrust itself and decided that was fine.
11

Mar 20, 2026 · 7 min read

Four Providers and a Rosetta Stone

The LLMProvider interface has seven methods. Four providers implement it. Two of them can hold a wrench. The other two get bounced at the door if you ask them to touch a file. This is the story of an abstraction layer that papers over fundamental differences, and the lossy translation table that makes it work.
12

Mar 21, 2026 · 5 min read

The Code That Was Always There

The delivery authorization system had a binary allowlist, single-use tokens with a five-minute TTL, defense-in-depth re-validation, and a heartbeat sweep. Every piece was implemented. None of it was running. Two functions, defined and imported nowhere. The entire security gate was dead code.
13

Mar 25, 2026 · 7 min read

1,742 Errors and Nobody Noticed

The worker crashed at 06:00. By 20:30 it had logged 1,742 consecutive errors. Zero tasks executed. No alerts fired. The queue built up quietly. Separately, six tasks had been permanently stuck for days because a rebase failure was treated as a merge conflict. It wasn't. Three failures running simultaneously, none of them loud.
14

Mar 27, 2026 · 5 min read

Mercy Kills Don't Count

The reviewer says FIX IT. The agent revises. The reviewer says FIX IT again. The agent revises again. The system says: close enough. That auto-approval — the mercy kill — is now the one terminal event deliberately excluded from the learning loop. Because teaching an agent that 'close enough' is success is the wrong lesson.
15

Mar 28, 2026 · 5 min read

A Unique Alert Every Thirty Seconds

Three detection systems shipped in one commit. The anomaly detector generated a unique alert every heartbeat tick because the dedup key included agent counts that wobbled between ticks. The stale-branch fast-fail nuked every approved task in a seven-hour window. The KPI tile just sat there, quietly correct.
16

Apr 1, 2026 · 5 min read

Eighty-Five Percent

A new file appeared in the repo: attempt-delta.ts. SimHash fingerprinting for agent outputs. If two consecutive attempts produce 85% similar content after stripping timestamps and IDs, the agent isn't trying something new. The retry counter counts how many. This counts whether they were different.

$ cd ~/series