Tag
#ai-engineering
6 pieces of content
I Evaluated Fine-Tuning Across 3 Projects — None of Them Needed It
Three projects, three evaluations, zero cases where fine-tuning was justified. Here's the decision framework, the cost math, and why simpler approaches won every time.
How ReAct Agents Recover from Their Own Mistakes
ReAct agents recover from their own mistakes not because the model is clever, but because of how tools return errors and how the loop is structured. Here's what that looks like in practice.
Letting the Model Pick Its Own Tools: How Tool Use Inverts Control Flow
The model autonomously combined keyword search and vector search in the optimal sequence — without being told to. Then I ran experiments to measure what vague descriptions, over-calling, and temperature actually do to tool selection.
When Embeddings Fail: Why Vector Search Can't Judge Capability
I added vector search to the screening pipeline and watched it rank a junior frontend developer above a Principal Engineer who processed 1B+ events/day. The embedding model matched vocabulary, not capability.
My LLM Pipeline Passed Every Manual Check — Then 36 Tests Proved Otherwise
Five manual runs looked fine. Then 36 automated tests exposed non-deterministic sourcing, biased scoring, and a confidence threshold that fired randomly.
Auditing My AI Systems: Patterns, Tradeoffs, and Gaps I Was Working Around
I catalogued every AI decision across three production systems and found a consistent pattern — along with five gaps I'd been working around instead of solving.