🎙️ How I AI: Codex Goals explained & Claude Opus 4.8 review & Building an iPhone app with zero technical skills

Your weekly listens from How I AI, part of the Lenny’s Podcast Network

Lenny Rachitsky

Jun 01, 2026

Building an iPhone app with zero technical skills | Bryce Rattner Keithley

Listen now on
YouTube • Spotify • Apple Podcasts

Brought to you by:
WorkOS—Make your app enterprise-ready today
Metaview—The agentic recruiting platform for winning teams

Bryce Rattner Keithley spent her career in talent and recruiting and had never written a line of code. Then she used AI to build Daily Hundred, a fitness app with custom AI-generated videos of animals doing exercises, and shipped it to the App Store. In this episode, Bryce shares the exact workflow she used with Replit, Claude, Gemini, Higgsfield, and Kling; why being non-technical became an advantage; and what her journey reveals about how AI is changing who gets to build software.

Biggest takeaways:

You can build and ship a production iPhone app with zero technical background. Bryce spent her entire career in talent and recruiting, had never written code, and still managed to build Daily Hundred—a fitness app with custom AI-generated videos—and get it approved in the App Store. The entire process took a few months of weekend work.
The workflow that worked: Claude as architect, Claude Code as engineer, Terminal as executor. Bryce used regular Claude as her “friend in the cockpit” to plan what to do and how to approach problems. Claude would tell her when to use Claude Code to write actual code. She’d bring the code back to Claude for confirmation, then Claude would tell her what to paste into Terminal. This three-step dance—plan, execute, deploy—let her ship production code without having to know exactly how it all worked.
Screenshots and iteration are your best debugging tools. When AI wasn’t understanding what Bryce wanted, she’d either get more literal in her descriptions, completely restart the prompt (not just edit it), or send screenshots showing what she was seeing. Sometimes she’d even draw what she wanted or photograph her own starting position to give the AI a visual reference. The key was trying different approaches rather than getting stuck in one failed pattern.
The role of technical expertise is fundamentally changing. Bryce observed that engineers who come into technical interviews focused only on finding a working solution as fast as possible are missing the point: “the robots can find a working solution faster than they can.” The human role has shifted to something broader: understanding the full suite of tools, knowing when to use AI versus when to step in personally, and bringing taste and judgment to the process. What got people here won’t get them there.
Hiring for adaptability and openness matters more than ever. In Bryce’s view, people who get territorial about what they used to do or what other people used to do will struggle with relevance. The winners will be those with “the humility and the curiosity to work with others in ways that you haven’t before” and who recognize that “people can contribute in ways that they haven’t before.” The best idea should win, regardless of where it comes from.

Blog & detailed workflow walkthroughs from this episode:

How I AI: Bryce Rattner Keithley’s No-Code Playbook for Building a Fitness App with Replit, Gemini, and Claude: https://www.chatprd.ai/how-i-ai/bryce-rattner-keithleys-no-code-ios-fitness-app-with-replit-gemini-and-claude

↳ Navigate the App Store Submission Process with Claude as a Technical Co-pilot: https://www.chatprd.ai/how-i-ai/workflows/navigate-the-app-store-submission-process-with-claude-as-a-technical-co-pilot

↳ Create Custom AI-Generated Animated Workout Videos with Gemini and Higgsfield: https://www.chatprd.ai/how-i-ai/workflows/create-custom-ai-generated-animated-workout-videos-with-gemini-and-higgsfield

↳ Build a Minimum Viable Product App with Replit Using No-Code ‘Vibe Coding’: https://www.chatprd.ai/how-i-ai/workflows/build-a-minimum-viable-product-app-with-replit-using-no-code-vibe-coding

The Codex feature that works while you sleep

Listen now on
YouTube • Spotify • Apple Podcasts

Brought to you by:
Mercury—Radically different banking loved by over 300K entrepreneurs

Claire Vo breaks down one of her favorite Codex features: /goal. In this solo episode, she shows how Goals turn AI from a tool you have to constantly babysit into an agent that can work for hours on multi-step tasks. She walks through real examples, including eliminating Sentry errors, cleaning nearly 4,000 emails, and organizing Linear tasks, and shares the six-part framework to write Goals that actually run.

Biggest takeaways:

Goals enable AI to work autonomously for hours without supervision. Claire ran a goal in Codex that worked for five hours and 45 minutes—the longest she’s ever had an AI agent run successfully. Unlike standard prompts that require turn-by-turn interaction, Goals create a loop where the AI works, verifies, checks, and continues until it hits the defined outcome.
The difference between a prompt and a Goal is fundamental. A prompt is an instruction of what to do (“Rewrite this code”). A Goal is a description of what a good outcome looks like and how to get there (“Reduce P95 checkout latency below a defined threshold while keeping the correctness suite green”).
Claire eliminated hundreds of error logs by pointing Goals at her Sentry data. She gave Codex access to every trace of invalid operations, then set a goal: categorize each issue, fix it, then replay all historical examples until every error is solved. The result: zero errors remaining, and instead of bandaid fixes scattered throughout the code, she got a systematic, intelligent framework.
Goals work incredibly well for non-technical tasks. Claire cleaned 3,900 emails down to 68 in under four hours by setting a simple goal: categorize all emails, unsubscribe from unnecessary ones, and clean up the inbox. The AI read every email, created labels, clicked unsubscribe links, and left her with only the emails requiring judgment.
Strong Goals have six key components: outcome (what should be true when done), verification (how to test it), constraints (what can’t regress), boundaries (what tools and files to use), iteration policy (how to decide what to try next), and stopping conditions (when to ask for help). Product managers who’ve written good OKRs will recognize this framework immediately.
Working with Goals feels like managing a colleague, not babysitting a tool. You assign a task, the AI goes away for the time required (whether that’s 30 minutes or five hours), and comes back with completed work for you to review. Claire found herself “twiddling her thumbs” because so much of the work was now handled autonomously.
Goals aren’t token-cheap, but they’re worth it. Claire’s email cleanup used about 6 million tokens over four hours. But the alternative—manually categorizing thousands of emails or chasing down hundreds of error logs—would take far longer and be far more tedious.
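
To make the six-part framework and the work-verify-continue loop concrete, here is a minimal Python sketch. It is not Codex's actual API or Claire's implementation: `GoalSpec`, `render`, and `run_goal` are hypothetical names invented for illustration, and every field value other than the latency outcome quoted above is an assumed example. It only shows how the six components might be written down as text and how a loop can keep working until a check passes or a step budget runs out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GoalSpec:
    """Hypothetical container for the six parts of a Goal (not a Codex type)."""
    outcome: str                    # what should be true when done
    verification: str               # how to test it
    constraints: list[str]          # what can't regress
    boundaries: list[str]           # what tools and files to use
    iteration_policy: str           # how to decide what to try next
    stopping_conditions: list[str]  # when to ask for help

    def render(self) -> str:
        """Flatten the spec into plain text that could be pasted as a goal."""
        lines = [
            f"Outcome: {self.outcome}",
            f"Verification: {self.verification}",
            "Constraints: " + "; ".join(self.constraints),
            "Boundaries: " + "; ".join(self.boundaries),
            f"Iteration policy: {self.iteration_policy}",
            "Stop and ask for help when: " + "; ".join(self.stopping_conditions),
        ]
        return "\n".join(lines)


def run_goal(work_step: Callable[[], None],
             is_done: Callable[[], bool],
             max_steps: int) -> tuple[bool, int]:
    """Toy work-verify-continue loop. Returns (outcome reached, steps used)."""
    for step in range(1, max_steps + 1):
        work_step()       # do one unit of work
        if is_done():     # verify against the defined outcome
            return True, step
    return False, max_steps  # budget exhausted: hand back to the human


# Assumed example values; only the outcome wording comes from the episode notes.
goal = GoalSpec(
    outcome="P95 checkout latency is below the defined threshold",
    verification="Run the latency benchmark and the correctness suite",
    constraints=["Correctness suite stays green"],
    boundaries=["Only change files in the checkout service"],
    iteration_policy="Profile first, then fix the largest hotspot",
    stopping_conditions=["Three attempts in a row show no improvement"],
)

# Simulated work: each step fixes one of three remaining errors.
remaining = {"errors": 3}


def fix_one() -> None:
    remaining["errors"] -= 1


reached, steps = run_goal(fix_one, lambda: remaining["errors"] == 0, max_steps=10)
print(goal.render())
print(f"reached={reached} after {steps} steps")
```

The step budget plays the role of a stopping condition: if the check never passes, the loop returns control instead of running forever, which matches the idea of the agent coming back to ask for help.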

Claude Opus 4.8 is here. Is it as good as they say?

Listen now on
YouTube • Spotify • Apple Podcasts

Claire put Anthropic’s new Opus 4.8 model through real coding, design, and strategy tests across Claude Code and Claude Cowork. She shares where the model shines, where it breaks down, how it compares to Opus 4.7, and what builders should know before using it in production.

Biggest takeaways:

The voice and ergonomics are excellent. Opus 4.8 is easy to read, doesn’t have “slop tells,” is token-efficient, and feels conversational without being annoying. It talks enough but not too much, and with fast mode enabled, the experience is snappy. The writing quality is strong and the model follows instructions well.
Anthropic is shipping new features alongside Opus 4.8 that expand agentic capabilities. Claude Code now has dynamic workflows that let you spin off hundreds of parallel sub-agents. Both Claude.ai and Cowork now offer effort control from low to max, giving users more control over how deeply the model thinks through problems.
Use Opus 4.8 for greenfield prototypes and design work, but test carefully for production codebases. The model excels at one-shot features, has improved design aesthetics (no more italicized emphasis words), and is good at tool use. But for existing codebases, edge cases, and strategy work requiring numerical analysis, you’ll need careful prompting and should double-check anywhere the model expresses high confidence.
The model hallucinates when it gets stuck, which is a significant regression. Claire experienced straight-up hallucinations multiple times, something she hadn’t seen in a very long time with modern models. When debugging, Opus 4.8 would make up explanations based on hypotheses rather than actual data. Only when asked to verify its work would it admit things like “No, I didn’t search GitHub” or “No, I didn’t actually validate that bug.”
Opus 4.8 struggles to orient itself in existing codebases. When Claire asked it to rebase branches and fix conflicts in her production codebase, it required cycle after cycle of fixes because it kept shipping edge-case bugs. The model couldn’t understand the elevation at which it should be operating or how to properly insert itself into existing code.
The model isn’t ambitious enough for truly agentic work. Claire asked it to suggest fun things to build that would impress a 9-year-old, pushing it to explore the edges of agentic coding. While it shipped working code, the results were serviceable but not impressive—not the 10x agentic coding experience she expected from a state-of-the-art model.
For business strategy work, Opus 4.7 significantly outperforms Opus 4.8. Claire tested both models on the same strategy prompt, giving them access to three months of business context. Opus 4.7 delivered numbers-anchored, structured analysis rooted in real data. Opus 4.8 was hand-wavy, over-rotated on small data points, and had a harder time discovering relevant information.

Blog & detailed workflow walkthroughs from this episode:

How I AI: My First Impressions of Claude Opus 4.8 – Coding, Strategy, and Where It Shines: https://www.chatprd.ai/how-i-ai/claude-opus-4-8-review

↳ Use Claude Opus 4.8’s Creativity to Generate a Playable Game: https://www.chatprd.ai/how-i-ai/workflows/use-claude-opus-4-8-s-creativity-to-generate-a-playable-game

↳ Generate a Data-Driven Business Strategy with Claude Opus 4.7: https://www.chatprd.ai/how-i-ai/workflows/generate-a-data-driven-business-strategy-with-claude-opus-4-7

↳ Build a Greenfield Prototype with a Single Prompt Using Claude Opus 4.8: https://www.chatprd.ai/how-i-ai/workflows/build-a-greenfield-prototype-with-a-single-prompt-using-claude-opus-4-8

If you’re enjoying these episodes, reply and let me know what you’d love to learn more about: AI workflows, hiring, growth, product strategy—anything.

Catch you next week,
Lenny

P.S. Want every new episode delivered the moment it drops? Hit “Follow” on your favorite podcast app.

Discussion about this post

I also found Opus 4.8 hallucinating terribly.

I asked both Opus 4.8 and GPT 5.5 to search for specific listings and give me their review of findings. And Opus made conclusions and recommendations based on hallucinations and outdated information.

And when I followed up on this issue, it tried to gaslight me by saying that GPT was incorrect and the original info it hallucinated really did exist. I pushed back a third time and it capitulated, but claimed that the listing information is dynamic and these listings are published then taken down often, so Opus was actually right overall, but in this specific instance it looks like I was correct, as right now it's not live. 🫠

It's like working with a stubborn arrogant coworker who always has to be right and get the last word in. 😆

It's been a very long time since I've experienced anything this bad too.

Mark S. Carroll

What I appreciate here is that this does not just report on AI tools. It quietly shows the shift from tool use to workflow design.

The most interesting pattern across all three segments is not “AI can do impressive things.” It is that the real leverage starts showing up when people stop treating AI like a chatbot and start treating it more like an operator, a collaborator, or a system with boundaries, goals, and handoff points.

That is the bigger story to me. Not just better outputs. Better orchestration.
