If you’re a premium subscriber, add the private feed to your podcast app at add.lennysreads.com.
In this episode, we dive into the fast-emerging discipline of AI evaluation with Hamel Husain and Shreya Shankar, creators of AI Evals for Engineers & PMs, the #1 highest-grossing course on Maven.
After training 2,000+ PMs and engineers across 500+ companies, Hamel and Shreya reveal the complete playbook for building evaluations that actually improve your AI product: moving beyond vanity dashboards to a system that drives continuous improvement.
In this episode, you’ll learn:
Why most AI eval dashboards fail to deliver real product improvements
How to use error analysis to uncover your product’s most critical failure modes
The role of a “principal domain expert” in setting a consistent quality bar
Techniques for transforming messy error notes into a clean taxonomy of failures
When to use code-based checks vs. LLM-as-a-judge evaluators
How to build trust in your eva…