The Bottleneck Just Moved
On March 5, OpenAI released GPT-5.4.
The benchmark that got attention: OSWorld-Verified — a test that measures an AI’s ability to operate a real desktop environment. GPT-5.4 scored 75%. Human baseline: 72.4%.
On this benchmark, a machine now outscores the human baseline at operating a computer.
The coverage focused on jobs. Which roles are at risk. What this means for white-collar work. Whether executives should be worried.
Those are the wrong questions.
Here’s the question nobody asked: if AI just got better at doing the work, why did a BCG study of 1,488 workers, published the same month, find more cognitive fatigue among the knowledge workers who oversee AI most heavily?
The data that doesn’t fit the story
The BCG study measured something specific: what happens to people who actively use and oversee AI tools at work.
The findings are precise enough to be uncomfortable.
Workers managing high levels of AI oversight reported 14% more mental effort, 12% more mental fatigue, and 19% more information overload than those who weren’t. The ones experiencing what researchers named “AI brain fry” — mental exhaustion from excessive AI use and oversight — showed 33% more decision fatigue and made 39% more major errors.
14% of AI-using workers reported experiencing this.
In marketing departments: 26%. Among high performers: disproportionately higher.
HBR published it. CNN covered it. Fortune ran the headline. And still, somehow, the framing was: people are using too many AI tools, they should use fewer.
That’s not the problem. That’s the symptom.
What’s actually happening
Every time AI capability increases, the execution ceiling rises.
More work gets done. Faster. At higher quality. But every unit of execution generates coordination overhead: something to review, a decision triggered, a loop opened that someone has to close.
The bottleneck was never execution. The bottleneck was always coordination.
AI that executes 10x faster at human-level quality doesn’t reduce coordination overhead. It multiplies it. More outputs in less time means more review cycles, more branching decisions, more context to hold, more threads that are now open and waiting.
You are not the one doing the work. You are the one managing everything that comes out of it.
And that cost, the cognitive cost of managing, scales with AI capability. Not against it.
The specific failure mode
The BCG data found something that clarifies this precisely.
Productivity increases when workers use one to three AI tools. At four or more, it drops.
The obvious reading: use fewer tools. But that’s not what the data is showing.
The data is showing where the coordination overhead exceeds the execution gain. At some threshold, the cognitive cost of being the middleware — the human who stitches together what five AI tools produce, re-establishes context for each one, manages the outputs, routes decisions — exceeds what the tools saved.
You are not using too many tools. You are doing the coordination work that no tool is doing.
Every tool you add is another system that doesn’t know what the others know. Another context to maintain. Another loop that opens somewhere and closes nowhere.
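One way to picture that threshold is a toy model. The sketch below is not fitted to the BCG data. It assumes each tool adds a fixed execution gain, and that every pair of tools is a context gap the human has to bridge. The function name, the pairwise assumption, and both constants are illustrative; the constants were picked so the peak lands near three tools.

```python
def net_gain(n_tools: int, gain_per_tool: float = 1.0, cost_per_link: float = 0.4) -> float:
    """Toy net productivity: linear execution gain minus pairwise coordination cost.

    Both constants are made-up assumptions, not measurements.
    """
    # Assumption: every pair of tools is one context gap the human must bridge.
    links = n_tools * (n_tools - 1) // 2
    return n_tools * gain_per_tool - links * cost_per_link


def best_tool_count(max_tools: int = 10) -> int:
    """Return the tool count with the highest toy net gain."""
    return max(range(1, max_tools + 1), key=net_gain)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 100):
        print(n, round(net_gain(n), 2))
```

With these constants the curve peaks at three tools, dips at four, and is deeply negative at a hundred. Different constants move the peak. The shape is the point: the gain grows linearly, the overhead grows with the number of pairs.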
The Jensen Huang question
At GTC 2026, Jensen Huang said something that didn’t get enough scrutiny:
In a decade, every engineer will have 100 AI agents working alongside them.
The press covered the number. 100 to 1. Massive leverage. The agentic future is here.
Nobody asked the follow-on question: who coordinates the 100?
If one AI tool creates coordination overhead — and it does, measurably — what happens when every knowledge worker has a hundred of them? Each one generating outputs. Each one opening loops. Each one requiring review, context, decision, closure.
The execution problem gets solved. The coordination problem becomes catastrophic.
This is not speculation. The BCG data is early evidence of what happens even at four tools. Extrapolate to a hundred.
The ceiling moves. The constraint was always coordination. It just becomes more visible when AI is doing the work.
What the coverage keeps missing
The wrong prescription — “use fewer AI tools” — is wrong for a structural reason.
The best tool for writing is not the same as the best tool for research, which is not the same as the best tool for scheduling, which is not the same as the best tool for synthesizing meeting notes. Consolidating to one mediocre general tool doesn’t reduce coordination overhead. It just replaces the fragmentation cost with a capability cost.
The problem isn’t the number of tools. The problem is that no layer above the tools is doing the coordination work.
What’s missing is something that holds context across all of it. That knows what you’re working on, what’s open, what’s resolved, what changed since Tuesday, what the thread from your 9am call is still waiting for. Something that acts on that model (routing, surfacing, closing) so you don’t have to be the middleware between systems that were never designed to talk to each other.
That layer doesn’t exist yet. Not as infrastructure. Not in any tool you can download today.
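To make "holds context across all of it" a little more concrete, here is a minimal sketch of the kind of state such a layer might track: which loops are open, which are closed, and what changed since a given moment. Every name here (`Loop`, `LoopRegistry`, the method names) is hypothetical. It describes no existing product, and the hard part, which is getting tools to report into it, is left out.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Loop:
    """One open item: something a tool or meeting produced that still needs closure."""
    source: str                      # hypothetical label for the tool or meeting
    summary: str
    opened_at: datetime
    closed_at: Optional[datetime] = None


@dataclass
class LoopRegistry:
    """Hypothetical shared record of open and closed loops across tools."""
    loops: List[Loop] = field(default_factory=list)

    def open_loop(self, source: str, summary: str, when: datetime) -> Loop:
        loop = Loop(source, summary, when)
        self.loops.append(loop)
        return loop

    def close_loop(self, loop: Loop, when: datetime) -> None:
        loop.closed_at = when

    def still_open(self) -> List[Loop]:
        return [l for l in self.loops if l.closed_at is None]

    def changed_since(self, cutoff: datetime) -> List[Loop]:
        # A loop "changed" if it was opened or closed after the cutoff.
        return [
            l for l in self.loops
            if l.opened_at > cutoff or (l.closed_at is not None and l.closed_at > cutoff)
        ]
```

A registry like this answers "what's open" and "what changed since Tuesday" with one query each, instead of leaving a person to reconstruct the answer from five separate tools.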
What the benchmark is actually telling you
GPT-5.4 scored 75% on desktop productivity tasks. Humans: 72.4%.
This is not a threat. It’s a preview.
The gap between what AI can execute and what humans can coordinate is about to become the defining constraint of knowledge work. Not talent. Not headcount. Not tools. Coordination infrastructure.
When AI can do the work at human level — and we are past that point — the question is no longer how good the execution is.
The question is: who closes the loops?
That’s the bottleneck. It just moved.
Eliran Keren — Founder of Deeplica, building the coordination layer for humans who’d rather direct than operate.
Sources: OpenAI GPT-5.4 Launch · BCG/HBR — When Using AI Leads to “Brain Fry” · Fortune — AI Brain Fry Study · The Decoder — Cognitive Limits Overseeing AI Agents · Jensen Huang — 100 Agents per Engineer