Operable Software
The last twenty years taught the machine to expose everything it can do. The one thing still un-named is the way a human gets in.
Last week I was inside Cloudflare, looking at my own traffic, and I knew exactly what I wanted.
One IP from the Netherlands had thrown 675 requests at my site in a single spike around 5am. The user agent called itself something like "audit scanner". It was walking down a list of filenames - .env, .env.local, .env.backup, .env.docker - the way burglars try every door on a street. Classic credential-scanning. I wanted it gone. I wanted to understand a second IP, an Amazon one, that was quieter but odd. And I wanted to stop the pattern, not just the address, because scanners rotate IPs and blocking one is theatre.
I knew all of that. What I didn’t know was where Cloudflare kept the controls, or what Cloudflare called any of it.
So I did the thing all of us now do without noticing how strange it is. I left the product to learn how to use the product. I screenshotted my own dashboard and pasted it into an AI so it could see what I was already looking at. I narrated the numbers out loud to a machine. It wrote me a WAF rule. I copied that rule, clicked back into Cloudflare, hunted down Security → WAF → Custom Rules → Create, and pasted it into the box. Then I went back to the chat to ask about the second IP, and did the whole ferrying dance again.
The most powerful software I touch every day can be navigated, but it cannot be talked to.
That small absurdity sent me down a three-day rabbit hole. I came out the other side with a thesis, a flag, and the slightly deflating discovery that I was both completely right and not remotely early. This is the whole map.
Two wishes wearing one coat
“I wish I could just talk to Cloudflare.” I said it to myself in frustration, and only later realised it was two wishes pretending to be one.
The first is the wish to talk to the company. Why is it built this way? What do you call this thing? Where does this concept live? That’s a knowledge wish - messy, conversational, and mostly low-stakes. If the answer is wrong, I’m mildly misinformed.
The second is the wish to operate the product. Change this. Stage that. Block this pattern but don’t deploy it to production yet. That’s an action wish - and it’s a different animal entirely. Different layer, different owner, different risk. If the action is wrong, I’ve just blocked legitimate traffic to a live site, or worse.
They sound adjacent. They are not. The knowledge half is where the chatbots already crowd. The action half is where both the prize and the danger live, and it’s the half almost nobody is building honestly. So this essay is about the action half.
The wish to operate software by speaking to it, inside the software itself.
To see why that wish has gone unanswered for so long, you have to look at how software has actually evolved. Because the answer turns out to be the last unfinished step in a pattern that’s been running for twenty years.
The arc nobody finished
Strip away the framework wars and the funding cycles, and the last two decades of web software did one thing, over and over, with monastic consistency: it externalized what software can do, and gave each thing a name.
Once, the server held everything and spat out dumb HTML. The browser was a dumb terminal. Then the layers began to tear loose, one at a time, each becoming addressable in its own right.
First the View broke free and moved into the browser. And React did something quietly profound with it: it made the interface a function of state. You stopped hand-building the screen and started declaring what the screen should be for any given state, letting the machine reconcile the difference. The View became derivable rather than constructed.
Then the back half tore loose. REST, then GraphQL, then headless everything - the backend stopped being welded to one frontend and became a set of named capabilities any client could call. A vocabulary of verbs, sitting on a wire.
Then, in late 2024, MCP - the Model Context Protocol. The same move, one more turn of the screw: those named capabilities became consumable by AI, not just by human-built frontends. The verbs detached from the consumer entirely.
And here is the beat almost everyone misses, because it happened inside the apps where nobody outside could see it. Years before MCP, Flux and Redux quietly named intent itself. In an app built this way you don’t mutate state directly - you dispatch a named, serializable action: { type: 'CREATE_WAF_RULE', payload: ... }, and a pure function maps the old state plus that action to the new one. Sit with what that means. In a dashboard built this way, every button is, underneath, a thing that fires a named intent. The application already speaks intent fluently. It has for a decade.
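To make the "named, serializable action plus pure function" idea concrete, here is a minimal sketch in plain TypeScript, with no Redux import. The state shape, the action names, and the rule expression string are all hypothetical, chosen for illustration; nothing here is taken from Cloudflare's actual code.

```typescript
// Hypothetical shapes, for illustration only.
interface WafRule {
  id: number;
  expression: string;
  action: "block" | "challenge";
}

interface DashboardState {
  rules: WafRule[];
}

// Each intent is a plain object with a name: it can be logged, stored, or sent over a wire.
type DashboardAction =
  | { type: "CREATE_WAF_RULE"; payload: { expression: string; action: "block" | "challenge" } }
  | { type: "DELETE_WAF_RULE"; payload: { id: number } };

// A pure function: old state plus action in, new state out, nothing mutated.
function reducer(state: DashboardState, action: DashboardAction): DashboardState {
  switch (action.type) {
    case "CREATE_WAF_RULE": {
      const nextId = state.rules.reduce((max, r) => Math.max(max, r.id), 0) + 1;
      return { rules: [...state.rules, { id: nextId, ...action.payload }] };
    }
    case "DELETE_WAF_RULE":
      return { rules: state.rules.filter((r) => r.id !== action.payload.id) };
    default:
      return state;
  }
}
```

Because the action is plain data, whatever produces it (a button, a script, a sentence typed into a bar) is interchangeable from the reducer's point of view.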
So count them. Data - named. Views - named and derivable. Capabilities - named and callable. Intents - named and serializable.
Everything in modern software is named except the human’s way in.
That’s the whole gap. The app named its data, its views, its capabilities, even its internal intents - and the human is still standing on the doorstep, hunting through menus for the gesture that fires the intent they already know they want. Every layer got a vocabulary except the one where a person expresses what they’re trying to do. That is the last decoupling. It’s the one slot the twenty-year pattern left empty, and it’s been sitting there, waiting, the whole time.
The carrier was already on the page
When I first imagined filling that slot, I reached for something new - a command palette, a clever keystroke, some fresh surface to summon. I was wrong, and the reason I was wrong is the most important design decision in this entire idea.
The trap is the chatbot in the corner. We’ve all met it. It is socially coded as help, support, apology - the place you go when you’re stuck, staffed by something that mostly wants to deflect you to documentation. Nobody operates serious infrastructure through the help bubble.
But there is already a surface on the page that means exactly the right thing. The search bar. It’s coded for command, navigation, operation - you type into it and the product takes you somewhere. It is, in fact, the web’s oldest instinct. The address bar was the original “name a state and jump straight to it” - addressable state before we had the vocabulary for it. Site search was a scoped address bar. The web has reached for the same gesture its entire life: a bar where you say where you want to be, and the system takes you there.
It just couldn’t understand sentences. And it could only go, never do.
So the move isn’t a new surface. It’s the bar everyone already has, finally able to understand language instead of keywords, finally able to act and not merely navigate - and crucially, one that doesn’t empty itself and close after a single command. It persists as a thread. You type “what is this IP doing,” the page answers, and the bar is still live for “block it” without you re-explaining what “it” is. Search stops being fire-and-forget and becomes a conversation that happens to live where search already lives.
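The persistence described above can be sketched as a thread that remembers what the last turn was about. This is a toy under stated assumptions: `BarThread` is a hypothetical name, and the regular expression that spots an IPv4 address stands in for whatever reference resolution a real language model would do.

```typescript
interface Turn {
  text: string;
  subject: string | null; // what this turn is about, if anything was resolved
}

class BarThread {
  private turns: Turn[] = [];

  // Record a turn and return the subject it refers to.
  // An explicit IP address wins; otherwise fall back to the most recent subject.
  say(text: string): string | null {
    const ip = text.match(/\b\d{1,3}(?:\.\d{1,3}){3}\b/);
    const subject = ip ? ip[0] : this.lastSubject();
    this.turns.push({ text, subject });
    return subject;
  }

  private lastSubject(): string | null {
    for (let i = this.turns.length - 1; i >= 0; i--) {
      const s = this.turns[i].subject;
      if (s !== null) return s;
    }
    return null;
  }
}
```

With this, a first turn that names an address and a second turn that only says "block it" resolve to the same subject, which is the difference between a thread and a fire-and-forget search box.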
Here’s the architectural placement, and it’s the sentence I’d ask a builder to remember: this is the first real reimagining of the MVC Controller in twenty years. The Model and View have been reinvented half a dozen times each. The Controller - the C, the piece that takes input and translates it into changes - has barely moved since “event handler.” What the language bar does is evolve it from a click-interpreter into an intent-interpreter. Same job it always had: input in, model-and-view changes out. But the input is now language, and the translation is computed at runtime instead of hand-wired at build time.
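One way to picture the shift from click-interpreter to intent-interpreter is the sketch below. Every name in it is hypothetical, and the pattern match inside `interpret` is only a placeholder for a language model; the point is that both paths end in the same named intent, one wired at build time and one resolved at runtime.

```typescript
type Intent =
  | { type: "SHOW_ANALYTICS"; zone: string }
  | { type: "UNKNOWN"; text: string };

// Build-time wiring: one hand-written handler per control.
const clickHandlers: Record<string, () => Intent> = {
  "analytics-button": () => ({ type: "SHOW_ANALYTICS", zone: "abc.com" }),
};

// Run-time translation: language in, the same kind of intent out.
// The regular expression is a stand-in for a model, not a proposal.
function interpret(utterance: string): Intent {
  const match = utterance.match(/analytics for (\S+)/i);
  if (match) {
    return { type: "SHOW_ANALYTICS", zone: match[1] };
  }
  return { type: "UNKNOWN", text: utterance };
}
```

Everything downstream of the intent (the model update, the view change) can stay exactly as it was; only the translation step moves.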
And one correction, because the loudest voices in AI get this backwards. The interface does not die. The dream isn’t a chat window that swallows the app. The UI becomes the response surface. You speak the intent; the product itself is the reply. Ask for “analytics for abc.com” and the answer isn’t a paragraph telling you where analytics lives - the product moves there, selects the zone, sets a sane range, and shows you. Language is the input. The product moving is the output. Read intents transform the page. Write intents stage a change and wait for your hand.
That last distinction - read versus write - is where the real product lives, and where the real danger lives too. Hold onto it.
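The read/write seam can be sketched in a few lines. This is an illustration with hypothetical names (`handle`, `confirm`, `AppState`), not any product's real flow: a read intent changes what the page shows immediately, while a write intent only moves the page to the right form and stages the change until a human confirms it.

```typescript
type ReadIntent = { kind: "read"; type: "SHOW_ANALYTICS"; zone: string };
type WriteIntent = { kind: "write"; type: "CREATE_WAF_RULE"; expression: string };
type BarIntent = ReadIntent | WriteIntent;

interface AppState {
  view: string;               // which page the product is showing
  rules: string[];            // live, committed rules
  staged: WriteIntent | null; // a change waiting for the human's hand
}

// Reads transform the page at once. Writes navigate and stage, but commit nothing.
function handle(state: AppState, intent: BarIntent): AppState {
  if (intent.kind === "read") {
    return { ...state, view: `analytics:${intent.zone}` };
  }
  return { ...state, view: "waf:create", staged: intent };
}

// Only an explicit confirmation turns a staged change into a live one.
function confirm(state: AppState): AppState {
  if (state.staged === null) return state;
  return {
    ...state,
    rules: [...state.rules, state.staged.expression],
    staged: null,
  };
}
```

The design choice worth noticing is that `handle` has no code path that touches `rules`; the only way into the live state is through `confirm`.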
Framework, or protocol?
If you believe a new layer is coming, the next question is what shape it arrives in. A framework you adopt and build inside, the way teams adopted React? Or a protocol you conform to and stay free, the way the web conformed to REST?
History is unusually clear here. Paradigms need something holdable before anyone believes the idea. REST was named in Roy Fielding’s 2000 dissertation as a description of the architecture the web was already running on - it described a reality, it didn’t predict one. React shipped as a working thing developers could touch long before “the component model” became doctrine. The sequence that works is demo → framework → protocol. You make people feel the behavior, then you package the repeatable parts, and only then - once the patterns have stopped moving - do you harden them into a standard.
Run it the other way and you get a graveyard. Author a protocol before the behavior is felt and you spend two years in a working group arguing about schema fields while someone less careful ships the ugly version and wins the category. Protocol-first is how elegant ideas die respectable deaths.
So my bet was: this starts as an experience, becomes a framework, and earns its protocol last.
I was confident about that bet. Then I went and checked who else was already standing on this hill. The answer reorganized everything.
What the survey revealed
I thought I was early. I was not. I want to tell you exactly what I found, because the map is more useful than my ego.
Someone already built the search bar. A company called CommandBar spent years building precisely this: an embeddable, in-app natural-language bar, with a feature that walked users through the interface by effectively taking control of the mouse. Their customers were infrastructure and developer tools - HashiCorp, LaunchDarkly - exactly the punishing, high-consequence products where I’d argue this matters most. They were acquired by Amplitude in late 2024 for a modest sum and folded into an analytics and onboarding suite. And here is the detail I keep turning over: they were absorbed the month before MCP existed. Everything they did, they hand-wired per customer, because there was no shared grammar of capability to consume. The substrate that would have made their work cheap arrived right after they sold.
Someone already built the framework. A company called CopilotKit raised a Series A this year on an architecture that is - almost line for line - the one I’d reasoned my way to from scratch without knowing they existed. A hook to make the app’s live state readable to the model. A hook to register the app’s existing actions as things the model can call. And a “render-and-wait” pattern that pauses for explicit human consent before any irreversible action - my read/write seam, already shipped. I had reconstructed a funded company’s stack from first principles in my head. That stung and thrilled me in equal measure.
The protocol seats are taken by giants. There is a genuine flurry of standards - MCP Apps, made official in early 2026 and backed by both Anthropic and OpenAI; Google’s A2UI; CopilotKit’s own AG-UI; Microsoft’s NLWeb, built by the creator of RSS and Schema.org and explicitly pitched as “HTML for the agentic web.” A solo operator does not get a seat at that table.
And the academy has been here for a while, filing the whole thing under “GUI agents” - systems that perceive an interface, plan, and act. One survey even gave my central anxiety a name: the execution gap, the danger that arises when an agent performs irreversible operations - submitting forms, granting permissions, deleting data - in an interface it only partially understands.
So: the frontier is crowded. The framework is funded, the protocols are being set by Anthropic, OpenAI, Google and Microsoft, and the researchers have a head start. If you came here for “lone founder spots the future,” I’m sorry. That’s not the story.
The real story is better, and it took walking the whole frontier to see it.
Everyone chose the comfortable default
Here is what I noticed once I’d read all of it. The crowd converged - and every point of convergence is the easier choice, not the truer one.
They chose the sidebar over the search bar. A separate panel, a chat surface bolted to the edge of the app - coded as help, not command - instead of upgrading the operational surface the user already trusts.
They chose generative UI over driving the real interface. The dominant pattern is an agent that renders new widgets - “ask to book a flight, a flight card appears” - a parallel, synthetic interface conjured beside the product. It demos beautifully. But it’s a second app growing next to the first. It is not the product’s own real DNS page navigating, the real offending row highlighting, the real rule form prefilling and sitting staged in the real deploy flow. Rendering a fresh widget is easier than learning to operate the interface that already exists. So that’s what almost everyone did.
And they chose read over write. Querying your content, summarizing your data, answering questions - safe, reversible, delightful, and largely solved. Operating a live control plane, where a wrong move costs real money or breaks production - terrifying, and mostly avoided. The hardest and most valuable half got walked past, because the safe half makes for a cleaner launch video.
None of these were stupid choices. They were comfortable ones. And the gap I’d been circling turned out not to be unexplored wilderness. It was the patch of ground everyone crossed on their way to somewhere easier.
The flag: Operable Software
Once I saw that, the framing snapped into place, and it’s the one thing on this entire frontier that nobody is holding cleanly.
Look at the vocabulary of the whole field - GUI agents, AI copilots, agentic UI, autonomous browsers. Every single term puts the AI in the subject position. The perceiver, the planner, the actor. The intelligence is the hero of every sentence. And so the entire conversation is a turf war over who does the operating - the brittle browser agent crawling the DOM from outside, versus the reluctant in-app copilot built from within. Agent versus app. A race.
Step sideways out of that race entirely. Take the AI out of the subject position. Put the software there.
The durable category isn’t “AI that operates apps.” It’s software that declares how it can be operated - operator-agnostic, write-first, with reversibility and blast radius as first-class, declared facts about each action. Not “here’s an agent that figured out my UI.” Rather: “here is my product, stating plainly which of its actions are safe, which are irreversible, where each one lives, how each one previews, and how each one stages before it commits” - true regardless of who is doing the operating. You, through the search bar. A browser agent. A teammate’s automation. Something not yet invented.
MCP made a product’s verbs addressable.
Operable Software makes a product’s operability declarable.
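What might such a declaration look like? The sketch below is one guess, not a spec: the field names, the `BlastRadius` levels, the three example actions, and the confirmation rule are all assumptions made for illustration. It simply turns the list above (safe or irreversible, where it lives, whether it previews, whether it stages) into data a bar or an agent could read.

```typescript
type BlastRadius = "single-resource" | "zone" | "account";

interface OperableAction {
  name: string;             // the verb being declared
  reversible: boolean;      // can the change be undone after commit?
  blastRadius: BlastRadius; // how much is affected if it is wrong
  location: string;         // where the action lives in the real interface
  previews: boolean;        // can the product show the effect before commit?
  stages: boolean;          // can the change sit staged before it goes live?
}

// A hypothetical manifest a product could publish about itself.
const manifest: OperableAction[] = [
  { name: "analytics.view", reversible: true, blastRadius: "single-resource",
    location: "Analytics", previews: false, stages: false },
  { name: "waf.rule.create", reversible: true, blastRadius: "zone",
    location: "Security → WAF → Custom Rules", previews: true, stages: true },
  { name: "dns.record.delete", reversible: false, blastRadius: "single-resource",
    location: "DNS", previews: true, stages: false },
];

// One possible policy any operator could derive from the declaration alone:
// pause for a human when the action is irreversible or reaches beyond one resource.
function requiresConfirmation(action: OperableAction): boolean {
  return !action.reversible || action.blastRadius !== "single-resource";
}
```

Nothing in the manifest mentions who is operating. The same facts serve a person typing into the bar, a browser agent, or a teammate's automation.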
And here’s the move that makes this both safe and large, rather than just another entrant in the war: a product that declares its operability stops competing with the browser agent and becomes the thing the browser agent needs. You’re no longer picking a side. You’re describing the contract both sides have to speak. The generic agent and the native bar both want the same thing - a product that can tell them, honestly, how it can be operated and what it costs to be wrong.
Which is why I think the framing the rest of the field is using is subtly, consequentially off. The protocol of this era should not describe intelligence.
It should describe operability.
Don’t describe the agent. Describe the software.
Back to the bar
I started in Cloudflare, ferrying a WAF rule between two windows like a courier carrying a message between two people who could simply have spoken.
The future I want isn’t a chat window that eats the interface. It’s the interface finally able to answer. You say what you’re trying to do, in your own words, in the bar that’s already there - and the product responds by moving: navigating, highlighting, filling the form, staging the change, waiting for your hand on the last irreversible step. The software learning, at long last, to explain itself and be operated in the language you already think in.
I went looking for a clever new product and found a crowded field that had quietly agreed to skip the hard part. The hard part - drive the real UI, do the writes, declare the danger, let the human stay in the loop - is still sitting there, exactly where the twenty-year pattern always pointed.
I just wanted to talk to Cloudflare.
It turned out that wish had an architecture.
About SG
I run Dobby Ads, an AI Creative Agency. I tend to overthink. This is where that overthinking goes.


