Writing · Updated weekly · 24 posts

Field notes from production.

Teardowns of systems we've built. Opinions on where the AI consulting space is wrong. The occasional rant. No "what is an agent" listicles.

eval/v3 · PASS 94.2% · "It works on my prompt": famous last words · 600 evals · regressions caught: 14 · time to bug: 3 min
Teardown Engineering

Why your "AI prototype" feels broken in production — and what we did about it.

A 4,200-word walk-through of the eval suite we built for a fintech client: 600 cases, weighted toward the edge cases, run on every commit. It's the unsexy part of AI work that nobody writes about because it isn't a model launch.

14 May 2026 · 12 min read
"We don't sell platforms"
Opinion

The "AI platform" is the new "blockchain platform". Don't build on it.

7 May · 6 min
if (confidence < 0.8) { escalate(); }
Engineering

How we design agent confidence thresholds (and why one number isn't enough).

28 Apr · 9 min
FICA · POPIA · PEP · SoF
Industry · Fintech

AI for fintech compliance: what's possible, what's regulator-suicide.

21 Apr · 14 min
Teardown

From 18 hours to 12 minutes: an affiliate lead pipeline, in 12 commits.

14 Apr · 11 min
RAG ≠ search.
Search ≠ retrieval.
Engineering

Most RAG systems are just slower search. Here's the fix.

7 Apr · 8 min
$ npm i @your-life-savings
Opinion

Why we still write our own glue, even when there's a SaaS for it.

31 Mar · 5 min
SaaS ops · 41% auto-resolved
Teardown

A support agent that handles 41% of tickets — and knows when to shut up.

24 Mar · 10 min
"Just one more model"
Opinion

Don't add a model to your stack just because it scored 2 points higher.

17 Mar · 6 min
SELECT * FROM honesty
Industry · SaaS

AI for B2B SaaS: stop replacing humans, start replacing tickets.

10 Mar · 7 min
Newsletter

New posts. No noise.

Twice a month, max. Long-form teardowns and the occasional opinion. Unsubscribe with one click — we won't act offended.