Chat for Verbs, Menu for Nouns — Essential Product Management Skills for AI Products
Chat as a UI element isn't new. We've had support chatbots, IM clients, and customer-service flows for decades. What's new in this cycle is chat as the primary surface for serious work — with a language model doing the routing and a harness orchestrating the work underneath.
That novelty is creating a quiet argument in product teams everywhere. I've been sitting in product reviews for most of the last two years, and the same tension keeps surfacing: should the next surface we ship be a chat interface, or a menu? Which one do users actually want? Which one will they actually use?
I want to suggest the question is misframed.
The argument worth having isn't chat versus menu. It's which interface, for which task, with what guarantee of correctness, and on whose surface. Four parts. Most of the product debates I see only touch two of them.
The trap most discussions fall into
We've all met the two failure modes.
Pure chat — a ChatGPT-style blank text field with a cursor. Maximum flexibility, minimum discoverability. Power users thrive. Most everyone else stares at the box and asks "what can you do?"
Pure menu — Salesforce, SAP, NetSuite. Total control, brutal cognitive load. Implementation timelines measured in months. User training as a service line. Even power users complain.
Both have real costs. Both have coexisted for twenty years because each is right for some workflows and wrong for others. The shift in this cycle isn't that one approach wins — it's that PMs now face four interlocking design questions where they used to face two. I think the products that work in the next decade will be the ones whose builders answered all four deliberately.
1. The Surface PMs Sometimes Forget Exists
The first decision in AI-era product design isn't "chat or menu." It's whether a dedicated UI is necessary at all.
For a large class of workflows, the right product has no app for the user to open. The user already lives in their inbox, their spreadsheet, their Teams channel, their WhatsApp thread. The agent shows up there. An email arrives, the agent reads it, takes action, replies, escalates when unsure. The user never logs into anything new.
This isn't a chat interface. It isn't a menu interface. It's no interface at all, in the conventional product sense. The harness lives where the user already lives.
Two observations follow.
The first: the best agent UI is sometimes no agent UI at all. The surface is the channel the user is already on — email, Teams, the shared spreadsheet, the project channel. Adding "log into our portal" is friction the AI era doesn't require for a meaningful set of jobs.
The second, and more interesting one to me: this isn't a workflow tool. It's embedded information architecture. The harness becomes the IA of channels you don't own — your customer's inbox, your operator's spreadsheet, the team's Slack. Designing for it is fundamentally different from designing for surfaces you control.
PMs trained on app design — onboarding flows, dashboards, settings pages — often miss this option. We reach for "the app we're building" before we've asked whether the app should exist.
The better question, I'd argue, isn't "what should our app look like?" It's "where does the user already work, and what's the smallest possible intrusion into that workspace?"
2. Chat for Verbs, Menu for Nouns
When a dedicated UI is needed, the second decision is which work the chat does and which work the menu does. This is where I think the deepest design mistake hides.
The asymmetry is this:
People articulate actions clearly. They know what they want to do. "Send this to Jane." "Approve the discount." "Mark these as resolved." "Reply with X." The language is verbal, imperative, instrumental. Chat is a good surface for verbs because the user can express the verb directly.
People do not articulate views clearly. They don't know what they want to see. They don't know what reports exist, what fields are tracked, what slices of the data are available. The language for nouns is unsteady: "show me the thing about… revenue? Last quarter? Maybe by region? Do we have that?" Menu is a good surface for nouns because the user needs help discovering what can be shown.
Pure chat struggles on the noun side. Asking a user staring at a blank chat box to articulate a dashboard is asking them to do schema work they're not trained for.
Pure menu struggles on the verb side. Encoding every possible action as a button produces interfaces with thousands of buttons. The user spends the day hunting for the right one. That cognitive load is the product's bill, not its bonus.
The synthesis I keep returning to:
Menu for nouns. Chat for verbs. The menu surfaces the available views — entities, reports, dashboards, lists. Once the user is on the right surface, the chat operates on the data shown. They see the customer list and say "email the top three." They see the invoice and say "approve and route to finance."
The menu solves discovery. The chat solves operation. Each does what the other can't.
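One way to picture the split, as a minimal sketch: the menu owns a fixed registry of views (the nouns), and a chat command is only ever resolved against the view the user currently has open. The `VIEWS` registry, the `Action` type, the sample customer data, and the naive keyword parsing are all hypothetical stand-ins for illustration. A real product would route the verb through a model and a harness rather than splitting on whitespace.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical view registry: the "menu" side. Users pick from this list
# instead of having to describe a view in words.
VIEWS = {
    "customers": [
        {"name": "Acme", "revenue": 120},
        {"name": "Globex", "revenue": 340},
        {"name": "Initech", "revenue": 90},
        {"name": "Umbrella", "revenue": 210},
    ],
}

NUMBER_WORDS = {"one": 1, "two": 2, "three": 3}


@dataclass
class Action:
    verb: str
    targets: List[str]


def menu() -> List[str]:
    """Discovery: list the nouns the product can show."""
    return sorted(VIEWS)


def chat(view: str, command: str) -> Action:
    """Operation: resolve a verb against the rows of the open view only."""
    rows = VIEWS[view]
    words = command.lower().split()
    verb = words[0]  # toy parsing: assume the first word is the verb
    if "top" in words:
        # "top three" narrows the visible rows by revenue, highest first.
        n = NUMBER_WORDS[words[words.index("top") + 1]]
        rows = sorted(rows, key=lambda r: r["revenue"], reverse=True)[:n]
    return Action(verb, [r["name"] for r in rows])


# The user finds the view through the menu, then acts on it through chat.
available = menu()
action = chat("customers", "email the top three")
```

The point of the shape is that `chat` never has to guess which data the user means: the menu has already pinned the noun, so the command only has to carry the verb and a selection.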
This isn't a compromise between two failure modes. It's the design that respects how human cognition actually works. People don't know what they want to see. They know what they want to do.
3. The Accuracy Problem
Here is the question I think almost no PM is honestly answering: how accurate does the chat have to be, and how do you guarantee that accuracy?
In a menu or form interface, correctness is guaranteed by construction. The Save button saves. The form validates inputs before submitting. The dropdown contains only valid options. The state machine refuses invalid transitions. The product is correct because the surface was designed to be correct. There is no inference layer between intent and effect.
In a chat interface, every action travels through inference. The model interprets intent, picks a tool, fills parameters, executes. Each step has a failure probability, and those probabilities compound across the chain. That compounded error rate is the product's correctness problem.
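The compounding can be made concrete with a small worked sketch. It assumes each step fails independently, which real pipelines do not guarantee, and the per-step success rates are made-up numbers chosen for illustration, not measurements of any model.

```python
import math


def end_to_end_success(step_success_rates):
    """Probability that every step succeeds, assuming independent steps."""
    return math.prod(step_success_rates)


def required_per_step(target_success, n_steps):
    """Per-step success rate needed to hit a target, if all steps are equal."""
    return target_success ** (1 / n_steps)


# Hypothetical per-step success rates for a single chat turn.
steps = {
    "interpret intent": 0.99,
    "pick tool": 0.99,
    "fill parameters": 0.99,
    "execute": 0.99,
}

success = end_to_end_success(steps.values())
print(f"end-to-end success: {success:.4f}")
print(f"end-to-end failure: {1 - success:.4f}")

# Working backwards from a correctness budget of one failure per thousand turns.
needed = required_per_step(0.999, len(steps))
print(f"per-step success needed: {needed:.5f}")
```

Under these assumed numbers, four steps at 99% each succeed together only about 96% of the time, so roughly 4 turns in 100 go wrong. Holding the whole turn to a one-in-a-thousand failure budget would require each of the four steps to succeed about 99.975% of the time.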
This is the part of AI product design I see PMs avoid, because it feels like an AI-engineering problem rather than a product problem. I'd push back gently. The correctness budget is a product specification, not a model property. If your product is "approve invoices via chat" and the model approves the wrong invoice once per thousand turns, you have a one-in-a-thousand failure rate that no amount of UI polish recovers.
Accuracy in chat isn't a feature you add. It's a property of the harness architecture surrounding the model.
The PMs who understand this will ship chat-driven products that hold up. The ones who don't will ship products that demo beautifully and break under load. I've watched both happen — the demo-day winners are not always the year-two winners.
That leads to the fourth question — the one I think is most worth your time.
4. What the Leak Revealed
In March 2026, Anthropic shipped Claude Code v2.1.88 to npm with a source-map file accidentally attached. For about a day, before the takedowns, the harness internals were publicly readable. The interesting thing about the leak — to me at least — wasn't the security story. It was that the artifact functioned as a design document. The most coherent public statement I've yet seen of how to build correctness into a chat-driven product.
The patterns, restated as the PM-design language they actually are:
Skills. Each skill is a folder containing a description, an instruction set, and a tool list. The description is the only thing the model reads at routing time. The description is the menu the model sees. Skills aren't a function-call abstraction. They're named, bounded, invocable capabilities, written so the model knows which one to pick.
Hooks. PreToolUse, PostToolUse, Stop. The harness fires events around every action. Hooks can intercept, validate, block. This is the validation layer that makes chat correctness possible. The form had its onSubmit handler. The harness has its PreToolUse. Same role, different surface.
Sub-agents. A parent agent can spawn a child with a restricted tool set and an isolated context. The child does one thing, returns a summary, exits. This is task-scoped permissions, replacing role-based access control. Authority is no longer about who the user is. It's about which sub-agent has which tools.
Memory tiers. HOT (always loaded), WARM (loaded on demand), COLD (archived, searchable). The user profile and the settings page collapsed into one context-discipline model. What's always remembered. What's recalled when relevant. What's filed away.
TodoWrite. A tool the agent uses to write progress into a visible list. Ambient transparency — the user sees what's being worked on without having to ask. The progress bar of the chat era.
<system-reminder> injection. The harness can insert reminders mid-conversation — "you haven't used TodoWrite recently," "context is at 80%." Behavioral nudges without user action. Quiet steering.
Compaction. When context fills, a summarization pass runs and compresses earlier turns. The user never sees "session expired." The conversation degrades gracefully rather than failing.
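Two of these patterns, hooks and sub-agents, can be sketched in a few lines. This is a toy harness written for this essay, not Claude Code's implementation: the `Harness` class, the `pre_tool_use` hook list, the invoice tools, the approval limit, and the convention that a hook returns a reason string to block are all hypothetical. It only illustrates that validation runs before an inferred action takes effect, and that a child agent sees a restricted tool set.

```python
from typing import Callable, Dict, List, Optional

Tool = Callable[..., str]
# Assumed convention: a hook returns None to allow the call,
# or a string reason to block it.
Hook = Callable[[str, dict], Optional[str]]


class Harness:
    def __init__(self, tools: Dict[str, Tool], pre_tool_use: List[Hook]):
        self.tools = tools
        self.pre_tool_use = pre_tool_use

    def call(self, tool_name: str, args: dict) -> str:
        """Run a tool the model asked for, after scope and hook checks."""
        if tool_name not in self.tools:
            return f"blocked: {tool_name} is not in this agent's tool set"
        for hook in self.pre_tool_use:
            reason = hook(tool_name, args)
            if reason is not None:
                return f"blocked: {reason}"
        return self.tools[tool_name](**args)

    def spawn(self, allowed: List[str]) -> "Harness":
        """Child agent: same hooks, but only the named tools."""
        scoped = {n: t for n, t in self.tools.items() if n in allowed}
        return Harness(scoped, self.pre_tool_use)


# Hypothetical data and tools.
OPEN_INVOICES = {"INV-1": 400, "INV-2": 12000}
CHAT_APPROVAL_LIMIT = 5000


def approve_invoice(invoice_id: str) -> str:
    return f"approved {invoice_id}"


def read_invoice(invoice_id: str) -> str:
    return f"{invoice_id}: {OPEN_INVOICES[invoice_id]}"


def invoice_guard(tool_name: str, args: dict) -> Optional[str]:
    """Plays the role a form validator used to play, before the action runs."""
    if tool_name != "approve_invoice":
        return None
    invoice_id = args.get("invoice_id")
    if invoice_id not in OPEN_INVOICES:
        return f"unknown invoice {invoice_id}"
    if OPEN_INVOICES[invoice_id] > CHAT_APPROVAL_LIMIT:
        return f"{invoice_id} exceeds the chat approval limit"
    return None


parent = Harness(
    {"approve_invoice": approve_invoice, "read_invoice": read_invoice},
    pre_tool_use=[invoice_guard],
)
reader = parent.spawn(allowed=["read_invoice"])  # read-only child agent

print(parent.call("approve_invoice", {"invoice_id": "INV-1"}))
print(parent.call("approve_invoice", {"invoice_id": "INV-2"}))
print(reader.call("approve_invoice", {"invoice_id": "INV-1"}))
```

In this sketch the model can still pick the wrong invoice, but a wrong pick that violates a rule never reaches the tool. The guard and the scoped tool set fail closed, which is the property the form's validator used to provide.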
Look at this list. None of these are AI infrastructure in the narrow sense. Every single one of them is a user-experience pattern, made invisible by the fact that the surface is a chat box.
Which brings me to what I think is the most important observation in this entire essay:
The simpler the UI, the more complex the product design must be.
A chat box looks trivial. One input field, one send button. The complexity hasn't gone away. It's gone underneath. What used to be 700 settings, 47 form validators, 12 permission roles, and a 3,000-line state machine has become a harness composed of skills, hooks, sub-agents, memory tiers, and quiet behavioral injections. The user sees the chat box. The PM has to design everything below it.
That's the gift the leak gave the PM community — not infrastructure, but a working reference for the UX of chat-driven products. Anthropic also documents most of these patterns publicly. The leak just made them impossible to ignore.
What this means for the PM role
Three concrete shifts, from where I sit:
Roadmap becomes skill registry curation. The unit of product planning shifts from "ship feature X by Q3" to "ship the named skill X with its description, allowed callers, and invocation triggers." Skills are the new features.
Settings become memory-tier policy. Half of what used to be a settings page becomes a policy decision about what belongs in HOT, WARM, or COLD memory. Surface less. Infer more. Override sparingly.
Permissions become sub-agent scoping. Role-based access control loses some of its primacy. The question is no longer only "who can do this." It's which sub-agent has the tools to do this, and how do we scope it.
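As a rough illustration of what those planning artifacts might look like on paper, the sketch below models a skill registry and a memory-tier policy as plain data. The field names (`allowed_callers`, `triggers`), the example skills, and the tier assignments are hypothetical. They follow the vocabulary of this essay rather than any published schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Tier(Enum):
    HOT = "always loaded"
    WARM = "loaded on demand"
    COLD = "archived, searchable"


@dataclass
class Skill:
    name: str
    description: str  # the only text the model reads at routing time
    tools: List[str]
    allowed_callers: List[str] = field(default_factory=list)
    triggers: List[str] = field(default_factory=list)


# Hypothetical registry: each entry is a roadmap item in this framing.
REGISTRY = [
    Skill(
        name="approve_invoice",
        description="Approve an open invoice and route it to finance.",
        tools=["read_invoice", "approve_invoice"],
        allowed_callers=["finance_agent"],
        triggers=["approve", "sign off"],
    ),
    Skill(
        name="summarize_account",
        description="Summarize recent activity for one customer account.",
        tools=["read_account"],
        allowed_callers=["finance_agent", "support_agent"],
        triggers=["summarize", "recap"],
    ),
]

# Hypothetical policy: what used to be settings becomes a tier decision.
MEMORY_POLICY: Dict[str, Tier] = {
    "user_timezone": Tier.HOT,
    "preferred_tone": Tier.HOT,
    "open_projects": Tier.WARM,
    "closed_tickets_2023": Tier.COLD,
}


def routing_menu(registry: List[Skill], caller: str) -> Dict[str, str]:
    """Names and descriptions of the skills this caller may invoke."""
    return {
        s.name: s.description for s in registry if caller in s.allowed_callers
    }


def context_at_start(policy: Dict[str, Tier]) -> List[str]:
    """Keys loaded into every conversation, i.e. the HOT tier."""
    return sorted(key for key, tier in policy.items() if tier is Tier.HOT)


support_menu = routing_menu(REGISTRY, "support_agent")
hot_keys = context_at_start(MEMORY_POLICY)
```

Written this way, a roadmap review can read the registry line by line: who may call a skill, what triggers it, and which memory is always in play are visible decisions rather than side effects of implementation.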
I'm watching PMs who design for the right surface — sometimes no surface at all — out-ship PMs still designing app-shaped products for problems that don't need apps. The work hasn't disappeared. It's moved.
What's coming
In five years, I think the dominant question PMs will ask in design reviews won't be "what does the flow look like." It will be some version of:
- Which surface does this work belong on?
- What verb is the user expressing, and what noun are we surfacing for them?
- What hooks guarantee the accuracy of the action?
- Which sub-agent has the scope to execute it?
The PMs that AI-era startups hire in 2027 and beyond won't be evaluated on funnel-optimization experience from a 2018 SaaS playbook. They'll be evaluated on whether they can design a skill registry, articulate a memory-tier policy, and explain how a sub-agent should be scoped for a given task.
The category of PM hasn't changed. The questions that define seniority have.
The four questions in this essay (which surface, which division of labor, what accuracy guarantee, what harness primitives) aren't exhaustive. They're four I'd offer as a starting point for the skill set the AI era will reward, and each names a kind of judgment AI-era PMs will need to develop. The sooner we bring these questions into our own design reviews and our own work, the more ready we'll be when they become the table-stakes vocabulary of the role.
The simpler the UI, the more complex the product design.
The chat box alone was never the product.