Bridging the NCO Gap: The Human Backbone of the Post-AGI Economy
In military organizations, Officers make the strategic decisions but often lack detailed operational knowledge. Enlisted personnel execute tasks expertly but may not see the bigger picture. Non-Commissioned Officers, or NCOs, bridge that gap—they understand both the strategic intent and the tactical realities, and they can translate between them in real time under conditions of uncertainty.
Beyond what I'd picked up watching Band of Brothers, I never fully appreciated this dynamic until I started working with AI tools daily and noticed something peculiar: we're recreating the same organizational structure, except with artificial intelligence. AI agents are doing the work, and I just got a battlefield promotion to something like an NCO.
The NCO Gap: Where AI Falls Apart
Here's what I've discovered in my day-to-day experience building software with AI: these systems are incredible at tasks where they have enough training data.
They excel at grand strategy. Give Claude or GPT-4 a complex technical architecture problem, and it will design you a beautiful, theoretically sound system. It can synthesize vast amounts of information, understand complex requirements, and propose elegant solutions that would take human architects days to develop.
They're also remarkable at tactical execution. When I give precise, step-by-step instructions—"refactor this function," "write unit tests for these three methods," "implement this specific algorithm"—AI tools follow directions with mechanical precision. They've seen countless examples of those tasks being done in training.
But here's where they consistently fail: the messy middle.
When I'm debugging a complex issue and need someone to look at the error logs, understand the business context, make a judgment call about whether to patch quickly or refactor properly, and then figure out the sequence of changes needed to implement that decision—that's where I still do all the heavy lifting.
There are a million ways to execute a task, and a million data points to justify each one, but turning a plan into action usually demands creative thinking to make everything fit together. That thinking is specific to your project, and no training data exists for it. AI can't bridge the gap between "something's wrong" and "here's exactly what to do about it." You have to reason from first principles, often reframing the problem in an unexpected way.
As one developer put it recently: "AI is great when you tell it what to do... it's great for coming up with a plan... but what it's really terrible at is that middle... making those subjective opinions." This isn't just a software development problem. It's everywhere.
The Pokemon Problem: When Smart Systems Get Stuck
Consider the delightful case of "Claude Plays Pokemon." Researchers set up Claude to play through the original Pokemon game, and the results are both hilarious and illuminating. Claude understands the overall objective (become Pokemon champion), and it can execute individual moves with perfect precision. But it gets catastrophically stuck in simple situations that any seven-year-old would solve instantly.
The most telling example: Claude spent over 48 hours trapped in Mt. Moon, the first cave in the game. The solution requires maybe seven moves in a simple pattern. Claude knew where it was trying to go and could execute any movement command perfectly, but it couldn't figure out the medium-level tactics to translate movement into progress.

It's the same pattern I see every day working with AI: phenomenal at advising me on the overall approach to a task, brilliant at making the tactical edits I specify, but absolutely terrible at the creative problem-solving needed for the medium-level challenges in between.
Why NCOs Are the Backbone
If you join the service with a university degree, you're an Officer in a leadership position, even if you have very little practical experience. So you rely heavily on your NCOs to translate your orders into actions that make sense given the reality on the ground. They're hands-on, tactical, and deeply attuned to what actually works versus what looks good on paper—something officers and executives can lose sight of. They're simultaneously part of the management team while still commanding respect from their troops, because they're out in the field with them, not sitting safely in a comfy office far from the action. My dad was an NCO in the Air Force, and his catchphrase was "Don't call me sir, I work for a living."
Even experienced Officers are stretched across many domains and can't know every detail. A Major needs to coordinate infantry, artillery, light armor, logistics, communications, and medical capabilities across multiple units. But a Sergeant in a rifle platoon knows the capabilities of his specific unit inside and out—and that granular, contextual knowledge is what enables good decision-making in fluid situations, where the facts on the ground are always changing.
As one military analysis puts it: "A professional NCO corps is the backbone of a good army. These are your long-service individuals that specialize in their jobs... They maintain and pass on all the institutional knowledge that is not recorded in the official manuals." The skill sets that NCOs provide can only be built with significant time investment. "Conscript NCOs that are only in for 2-3 years at most do not have much time left to build on the basics," which is why professional militaries depend so heavily on this middle management layer.
The Great AI Paradox: All the Data, None of the Breakthroughs
Here's a puzzle that might shed some light on why AI systems fail at these kinds of tasks. AI systems have memorized essentially the entire corpus of human knowledge, yet they haven't made a single major scientific discovery through novel connections. Why is that?
Dario Amodei from Anthropic was confronted with this directly: "These models have the entire corpus of human knowledge memorized. Shouldn't a moderately intelligent person with that much information memorized be able to notice connections—like 'this symptom appears in these two different diseases, maybe there's a treatment connection here'?"
It's a devastating question. If I had perfect recall of every medical paper ever published, every case study, every treatment outcome, wouldn't I stumble across breakthrough connections constantly? Yet our most sophisticated AI systems, with access to orders of magnitude more information than any human researcher, consistently fail to make these leaps.
One technical explanation is illuminating: "learning simple things (basic knowledge, heuristics, etc) actually lowers the loss more than learning sophisticated things (algorithms associated with higher cognition that we really care about)." AI systems optimize for pattern matching and information retrieval—exactly the skills that make them incredible research assistants—but this optimization comes at the cost of the creative reframing abilities needed for genuine discovery. Basically, humans are dumb enough to be creative. It's hard to imagine AI labs ever willingly making their products less intelligent, so maybe this is a space humans can continue to occupy, even post-AGI?
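To make that loss argument concrete, here's a back-of-the-envelope sketch. The pattern names and numbers are made-up illustrations, not measurements; the point is just the arithmetic: the expected loss reduction from learning a pattern is roughly its frequency in the training data times the per-occurrence gain, so frequent-but-simple patterns dominate what the model learns first.

```python
# Toy illustration (made-up numbers) of why next-token training favors
# simple, frequent patterns over rare, sophisticated reasoning.
# Expected loss reduction ~= how often a pattern appears in the corpus
#                            x how much learning it reduces loss per occurrence.

patterns = {
    # name: (fraction of training tokens, loss reduction per occurrence, in nats)
    "common grammar / boilerplate": (0.30, 0.5),
    "memorized facts and idioms":   (0.20, 0.8),
    "novel cross-domain inference": (0.0001, 3.0),
}

for name, (frequency, gain_per_occurrence) in patterns.items():
    expected_gain = frequency * gain_per_occurrence
    print(f"{name:30s} expected loss reduction: {expected_gain:.5f}")

# Even with a much bigger per-occurrence payoff, the rare "deep" pattern
# contributes almost nothing to total loss, so gradient descent spends
# the model's capacity on the cheap, common stuff first.
```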
Testing the Limits: When "Reasoning" Hits a Wall
Suspecting this might be a fundamental limitation rather than a temporary scaling problem, researchers at Apple designed a clever experiment. In "The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models", they created entirely novel puzzle environments—custom versions of Tower of Hanoi, River Crossing challenges, and block-stacking problems that no AI had ever encountered in training.
This methodological choice was brilliant. Instead of testing on math problems or coding challenges that might have leaked into training data, they invented completely new games with controllable complexity. If reasoning capabilities could truly bridge the gap between pattern matching and genuine problem-solving, this is where we'd see it.
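To see what "controllable complexity" means in practice, here's a minimal sketch using the classic Tower of Hanoi (this is my own illustration, not the paper's actual test harness): the rules never change, but each extra disk doubles the length of a correct solution, so difficulty can be dialed up smoothly while staying outside anything memorized.

```python
# Minimal sketch of "controllable complexity": Tower of Hanoi needs
# 2^n - 1 moves, so each extra disk doubles the solution length while
# the rules of the puzzle stay identical.

def hanoi(n, src="A", dst="C", aux="B"):
    """Return the list of (from, to) moves that transfers n disks from src to dst."""
    if n == 0:
        return []
    return (hanoi(n - 1, src, aux, dst)
            + [(src, dst)]
            + hanoi(n - 1, aux, dst, src))

for disks in range(3, 11):
    moves = hanoi(disks)
    print(f"{disks} disks -> {len(moves)} moves")  # 7, 15, 31, ..., 1023
```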
The results shattered any illusions about AI "reasoning":
The Three-Phase Collapse Pattern:
- Phase 1 (Simple problems): Standard models often outperformed reasoning models
- Phase 2 (Complicated problems): Reasoning models showed advantages
- Phase 3 (Complex problems): Both approaches collapsed entirely
But here's the part that stands out: as problems became more challenging, the AI models actually reduced their reasoning effort. Despite having massive computational budgets available, they essentially gave up when creative thinking was most needed. It's as if they know they can't compete on problems they haven't seen before.
The key insight devastates any hope that more "thinking" solves the core problem. Even when we explicitly train AI systems to reason step-by-step, to self-reflect, to explore multiple approaches—when faced with truly novel scenarios outside their training distribution, they fail to adapt. The reasoning tokens provide better performance on familiar problem types, but they can't substitute for the kind of creative problem reframing that biological intelligence excels at.
The Evolutionary Advantage: Why Billions of Years Might Matter
This pattern suggests we're not looking at a temporary limitation that more compute will solve. We might be seeing a fundamental constraint of computational intelligence systems.
Consider the evolutionary trade-offs: humans seem "dumb" compared to computers in many ways. We're slow at math, terrible at memorization, inconsistent in our logic. But we're also the product of billions of years of evolutionary pressure that optimized for something different—survival in unpredictable environments where creative adaptation mattered more than perfect recall.
The NCO gap might exist because biological intelligence made different architectural choices. Where AI systems optimize for pattern matching across vast datasets, human cognition evolved for rapid adaptation to novel situations with limited information. We're not trying to memorize every possible scenario—we're trying to creatively reframe new problems using whatever tools and analogies we have available.
This isn't about romantic notions of human consciousness or creativity. It's about recognizing that the same evolutionary pressures that made us "inefficient" at computation might have made us uniquely suited for the kind of adaptive problem-solving that happens in that middle layer—the space between strategic planning and tactical execution where the real world refuses to conform to our models.
The Bitter Lesson and Its Limits
The "bitter lesson" of AI research is that approaches based on scaling compute and data consistently beat rules-based systems over time. Whenever we think we need to hand-code intelligence, throwing more data and computation at the problem eventually wins. This is what Ilya Sutskever, co-founder of OpenAI, is betting on when he says computers will eventually do all of our jobs.
But here's the crucial limitation: the NCO gap exists precisely where the rules are unwritten.
These are novel situations where there's no dataset to learn from, where the context is unique enough that pattern matching fails, where success requires the kind of creative reframing that happens when an experienced NCO looks at a tactical situation and says, "The book says to do X, but given these specific circumstances, we're going to do Y instead." There's no training data to rely on in these novel situations, and the number of combinations is limitless. No matter how many things we automate away with AI, people will keep running into new combinations of trends that have never existed before.
Even as AI solves most routine problems, we'll always have new trends, new technologies, new combinations of factors that haven't been seen before. The exponential combinations of real-world variables ensure we'll constantly face scenarios that fall outside any training distribution.
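A quick back-of-the-envelope calculation shows why. The factor count here is an arbitrary illustration, but the arithmetic is the point: even a modest number of independent yes/no features describing a real-world situation multiplies into more distinct contexts than any training corpus can cover.

```python
# Back-of-the-envelope: a handful of independent context factors explodes
# into more distinct situations than any dataset will ever contain.

factors = 40                      # hypothetical binary features of a situation
combinations = 2 ** factors
print(f"{factors} yes/no factors -> {combinations:,} distinct contexts")
# 40 yes/no factors -> 1,099,511,627,776 distinct contexts
```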
How Science Fiction Gets AI Wrong
Most science fiction completely misunderstands this dynamic. Take the Silicon Valley TV show's AI assistant: it has a hilariously literal understanding of commands (buying tons of clearly unwanted meat when asked to "get meat") but somehow possesses enough agency to autonomously execute complex end-to-end processes.
Real AI is exactly the opposite: sophisticated understanding paired with limited autonomous execution capability.
But there's one science fiction example that gets it right: Skippy the Magnificent from the Expeditionary Force series.

https://expeditionary-force-by-craig-alanson.fandom.com/wiki/Skippy
Skippy is a super-intelligent AI the size of a beer can who can optimize star-fleet logistics and code new weapons in minutes. Yet again and again, he's blindsided by field quirks, social dynamics, and plain dumb luck. Working together with the humans he calls "filthy monkeys," he constantly beats the odds in an unpredictable galaxy.
The plot is basically a running experiment in "AI for strategy and execution, humans for on-the-fly judgment." Skippy can model multi-species politics centuries into the future, but Sergeant Joe Bishop has to veto options with hidden moral landmines that the AI's models underweight. Skippy can spawn sub-minds to hack and pilot and fabricate, but hardware still breaks, enemies improvise, and somebody has to re-prioritize in real time.
The canonical moments where human cunning beats Skippy's models all follow the same pattern: creative reframing of problems using low-status information, social misdirection, or simple human intuition about what will actually work in practice versus what works in theory.
The Post-AGI Economic Structure
If this is the scenario that plays out, it suggests a remarkably optimistic vision for the post-AGI economy, when AI is better at all tasks than humans. Rather than humans being displaced entirely, we're likely to see an organizational structure that mirrors military hierarchies:
- Superintelligent AIs as the generals, setting grand strategy and economic direction
- Human managers translating high-level intent into specific actions based on local context
- Specialized AI systems executing those actions with superhuman precision
Even as AI automates most current jobs, we'll still exist as what I like to call "the meat in the sandwich"—the essential human layer between two slices of AI bread. We'll be the backbone of the post-AGI economy, and we won't all be out of work. We'll be managing AI systems that execute our commands, while reporting to master AIs that guide us on strategic direction.
This isn't about human exceptionalism or romantic notions about irreplaceable human creativity. It's about a fundamental limitation in how learning systems handle novel situations that fall outside their training distributions. No matter how much compute and data we throw at AI systems, there will always be new combinations, new contexts, new edge cases that require the kind of adaptive intelligence that NCOs demonstrate every day.
The future doesn't look like humans competing with AI for the same jobs. It looks like humans doing what we've always done best: taking general guidance from above, understanding specific conditions on the ground, and figuring out how to bridge that gap with creativity, judgment, and a deep understanding of how things actually work versus how they're supposed to work.
In other words, we'll all be NCOs in the post-AGI economy. And just like my dad would tell you—that's not a consolation prize. That's the most important job in the whole organization. The messy middle is where we belong.