Mercy Kills Don't Count
The reviewer says FIX IT. The agent revises. The reviewer says FIX IT again. The agent revises again. The system says: close enough. That auto-approval, the mercy kill, is now the one terminal event deliberately excluded from the learning loop, because teaching an agent that 'close enough' is success is the wrong lesson.
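
As a rough illustration of that rule, here is a minimal Python sketch. Everything named in it (`Outcome`, `Episode`, `run_review`, `learning_signal`, and the `max_fix_rounds` budget of two) is hypothetical and chosen to mirror the story above; it is not the actual system's code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Outcome(Enum):
    APPROVED = "approved"            # the reviewer accepted the work
    AUTO_APPROVED = "auto_approved"  # the mercy kill: the system let it through


@dataclass
class Episode:
    task: str
    revisions: int
    outcome: Outcome


def run_review(
    task: str,
    reviewer: Callable[[str, int], bool],  # hypothetical: True means "approve"
    max_fix_rounds: int = 2,               # assumed budget of FIX IT rounds
) -> Episode:
    """Let the reviewer judge each attempt until the FIX IT budget is spent."""
    revisions = 0
    while revisions < max_fix_rounds:
        if reviewer(task, revisions):
            return Episode(task, revisions, Outcome.APPROVED)
        revisions += 1  # reviewer said FIX IT, so the agent revises
    # Budget spent: the system says "close enough" without a real approval.
    return Episode(task, revisions, Outcome.AUTO_APPROVED)


def learning_signal(episodes: List[Episode]) -> List[Episode]:
    """Keep every terminal event except the mercy kill."""
    return [e for e in episodes if e.outcome is not Outcome.AUTO_APPROVED]


if __name__ == "__main__":
    stubborn = run_review("draft-a", lambda task, n: False)   # never approved
    earned = run_review("draft-b", lambda task, n: n >= 1)    # approved after one fix
    for episode in learning_signal([stubborn, earned]):
        print(episode.task, episode.outcome.value)
```

The point of the design is the filter, not the loop: the auto-approved episode still ends the task, but it never reaches whatever consumes `learning_signal`, so the agent is not rewarded for merely outlasting the reviewer.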