đď¸ How I AI: GLM-5.2 review & How Gusto built a new product line with Claude Code
Your weekly listens from How I AI, part of the Lennyâs Podcast Network
GLM-5.2: why Iâm replacing Opus in Claude Code with this new model
Listen now on YouTube ⢠Spotify ⢠Apple Podcasts
Brought to you by:
MercuryâRadically different banking, loved by over 300K entrepreneurs
Claire tests GLM-5.2, the new open-weight model from Z.ai, inside her actual ChatPRD codebase. She runs it through codebase audits, UI redesigns, and a 45-minute autonomous bug-hunting task in Cursor and Claude Code, and breaks down where it surprised her, where it struggled, and why it may be good enough to replace Opus for some coding workflows.
Biggest takeaways:
Open-weight models are no longer a hobbyist curiosityâthey are production-grade alternatives. GLM-5.2, built by Beijing-based Z.ai, benchmarks near Claude Opus 4.8 and above GPT-5.5 on SWE Bench Pro, with a million-token context window and full support for reasoning mode, function calling, structured output, and context caching. The decision is no longer about capability ceilings but, instead, about cost, control, and vendor dependency. Claireâs live testing confirmed it: this is not a toy.
Self-hosting changes the vendor power dynamic in ways that matter at scale. Open-weight means the trained model weights are publicly available, letting teams run inference on their own hardware, fine-tune on proprietary data, and route around any single providerâs API terms. When frontier labs change pricing or policy, teams using open-weight models can switch inference providers without touching a line of application code. The key: youâre not locked in.
Getting GLM-5.2 running in Cursor took 30 minutes, and Claire documented the undocumented part. Route your API key through Open Router, override the OpenAI base URL in Cursorâs settings to
openrouter.ai/api/v1/cursor(the/cursorsuffix isnât documented anywhere), and addz-ai/glm-5.2as a custom model. Claude Code requires two environment variable changes and one edit toclaude/settings.json. Total time: under an hour, once you have the exact strings.The 45-minute autonomous task revealed both the ceiling and the floor. Claire gave GLM-5.2 a single prompt inside Claude Code: pull the last 72 hours of Sentry errors and Vercel logs, then build a prioritized bug-fix plan. Over 45 minutes, it ran MCP tool calls, authenticated into external services, and produced a dark-mode engineering canvas with 20 Sentry errors, five Vercel log signals, and 14 planned fixes, including two P0s Claire hadnât spotted through normal monitoring. The model surfaced signal-to-noise issues in their error pipeline that werenât showing up elsewhere.
It hit a wall with React, then recovered. During the long-running task, GLM-5.2 struggled with TypeScript compilation errors before eventually producing clean React output. Claireâs read: HTML and CSS generation is reliable; React under agentic, multi-step pressure is shakier. For teams whose codebase is primarily React (she estimates it covers 98% of her own use), this is the friction point to test before committing the model to critical paths.
The cost math is striking: $3.36 for 6 million tokens, including the full 45-minute agentic session. A 72% cache rate helped, but even at full price, open-weight inference through Open Router sits well below Opus or GPT-5.5 rates for equivalent coding capability. For agents accumulating long context windows over extended sessions (the exact workload where frontier model costs compound fastest), open-weight alternatives offer a structurally different cost curve.
Claireâs recommendation: put GLM-5.2 in rotation, not in the spotlight. Sheâs keeping it in Cursor for frontend and design work, and in Claude Code for long-running agentic tasks, alongside closed frontier models rather than as a replacement. The constraint sheâs watching: can it handle her React-heavy workload at the same consistency she gets from Composer? If it can, the cost-and-control argument gets much harder to ignore.
Blog and detailed workflow walkthroughs from this episode:
GLM 5.2: A Live Review of an Opus-Level Open-Weights Model: https://www.chatprd.ai/how-i-ai/glm-5-2-review-open-weights-model
âł How to Deploy an Autonomous AI Agent for Bug Triage and Prioritization: https://www.chatprd.ai/how-i-ai/workflows/how-to-deploy-an-autonomous-ai-agent-for-bug-triage-and-prioritization
âł How to Perform an AI-Powered Codebase Audit and Architecture Visualization: https://www.chatprd.ai/how-i-ai/workflows/how-to-perform-an-ai-powered-codebase-audit-and-architecture-visualization
âł How to Configure the Open-Weight GLM 5.2 Model in Cursor: https://www.chatprd.ai/how-i-ai/workflows/how-to-configure-the-open-weight-glm-5-2-model-in-cursor
No Figma. No Jira. No docs. How Gusto built a new product line with Claude Code | Eddie Kim (CTO)
Listen now on YouTube ⢠Spotify ⢠Apple Podcasts
Brought to you by:
Magic PatternsâPrototypes that look like your product
Jira Product DiscoveryâPrioritize with insights, build with confidence
Eddie Kim is the co-founder and CTO of Gusto. In this episode, he shares how a five-person team used Claude Code, a permanent Zoom room, and almost none of the usual product processâno PM, no Figma, no Jira, no long specsâto build Gusto Cofounder from scratch in just 10 weeks.
Biggest takeaways:
A five-person team with no process can outship a large team with full process, if AI handles the engineering. Eddieâs product launched at Gustoâs tier-one level after 10 weeks, starting from zero code. The constraint wasnât a liabilityâit was the design. When AI does the building, coordination overhead doesnât scale the engineering; it just slows it down. The key: strip process to what the team actually needs, then let AI fill the gap.
âZero code to tier-one launchâ is now a viable founding path. The team reached a production milestone at Gusto without a line of pre-existing code. This flips the assumption that early teams spend months on infrastructure before shipping anything real. With Claude Code as the primary builder, the initial sprint becomes about direction and judgment, not typing. It compresses the time between idea validation and real user contact from months to weeks.
No meetings, no Jira, no text threads. It shipped anyway. The team had no standup cadence, no ticket system, no async thread to resolve blockers. What replaced all of that: shared context held inside the AI loop. When the model carries state and the team is small and aligned, human coordination overhead becomes optional.
The technical stack for a production AI agent is shockingly minimal. The entire agent loop ran on Cloudflare Workers with the Vercel AI SDK. Nothing else. No proprietary orchestration layer, no third-party agent framework. Everything else was built in-house. Teams often over-architect before theyâve proven anything; Eddieâs stack is evidence that infrastructure minimalism accelerates the path to learning what the agent actually needs to do.
Building agents is not as complicated as the community makes it sound. An agent is an AI SDK running somewhere in the cloud, able to look up files and call tools. Thatâs the full definition. The complexity people fear (state management, orchestration, reliability) is solvable with the same judgment calls any backend system requires. Eddieâs team shipped one at production quality in 10 weeks without specialist AI infrastructure experience.
The âpermanent Zoomâ model of AI development changes how teams think about context. Claude Code running in a persistent loop means the model has continuous access to the codebaseâs current state. Thatâs closer to having an engineer who never closes their laptop than a chat interface you query on demand. For small teams, this is the equivalent of a senior engineer who is always available, always current, and never needs onboarding after a break.
The lesson for founding teams isnât âuse Claude Code.â Itâs âdesign your process for AI as a team member.â Most early teams graft AI tools onto a human-scaled workflow: standups, tickets, PRs reviewed by three people. Eddieâs team treated the AI as a primary contributor from day one and built their coordination model around that assumption. The result: a workflow that gets faster as the AI improves, not one that merely offloads tasks to it.
Blog and detailed workflow walkthroughs from this episode:
How Gusto Built a New Product Line in 10 Weeks with Claude Code, No Jira, and No Docs: https://www.chatprd.ai/how-i-ai/how-gusto-built-a-new-product-line-in-10-weeks-with-claude-code-no-jira-and-no-docs
âł How to Build a New AI Product in 10 Weeks Using the âNo-Processâ Method: https://www.chatprd.ai/how-i-ai/workflows/how-to-build-a-new-ai-product-in-10-weeks-using-the-no-process-method
âł How to Fix Bugs Using an AI-Powered Test-Driven Development (TDD) Workflow: https://www.chatprd.ai/how-i-ai/workflows/how-to-fix-bugs-using-an-ai-powered-test-driven-development-tdd-workflow
If youâre enjoying these episodes, reply and let me know what youâd love to learn more about: AI workflows, hiring, growth, product strategyâanything.
Catch you next week,
Lenny
P.S. Want every new episode delivered the moment it drops? Hit âFollowâ on your favorite podcast app.





