In every Tech Due Diligence we run, we now formally score AI maturity alongside the usual suspects: architecture, team, technical debt, security posture. It started as a qualitative note. Today it's one of the most predictive signals we track.
Not because AI tools are universally good. But because the gap between companies that have internalized them and those that haven't is compounding faster than almost any other dimension we measure.
We've identified four distinct levels. Most companies sit between Level 1 and Level 2. Very few have reached Level 4. The ones that have are operating in a different league.
Why this matters more than it did six months ago
The economics of software execution have fundamentally shifted. A solo founder with AI agents and a clear idea can ship a working product in a month. What used to require a team of five (frontend, backend, design, QA, devops) can now be orchestrated by one person who knows how to direct AI systems.
This changes the competitive landscape entirely. Your moat is no longer headcount or even capital. It's clarity: the ability to define precisely what you want to build and for whom. A well-funded incumbent moving slowly can be outflanked by a two-person AI-native startup in a quarter.
Execution is cheap now. The bottleneck has moved from “can we build this?” to “do we know exactly what to build and why?” Business clarity is the new engineering headcount.
This is the context in which the following maturity levels should be read. The stakes for getting this wrong are higher than they've ever been.
Level 1: The Sceptics
“We tried it. The outputs weren't good enough. Our use case is too specialised. Our data is proprietary. Our customers expect human quality.”
These are the rationalisations we hear most often. Every single one has a grain of truth, and every single one is being used to avoid confronting an uncomfortable reality.
The sceptics are not stupid. Many of them are experienced operators who built successful companies before AI was credible, and who have earned the right to be sceptical of hype. The problem is that this time, the hype is load-bearing.
What we observe in sceptic-level companies:
- Engineering teams using no AI tools, or using them ad hoc and informally, without any measurement
- A CTO or engineering lead who hasn't updated their mental model of what a productive developer workflow looks like
- Hiring plans built around the assumption that more engineers equals more output, without questioning whether that relationship still holds
- A cultural posture of “wait and see” that is silently accumulating competitive debt
Our assessment is blunt: if you're running a technology company in 2026 and your team has not meaningfully adopted AI tools, you need to change. Not next quarter. Not after the next hire. Now. A competitor can be started tomorrow and be live in production within four weeks. The asymmetry is real and it is accelerating.
Level 2: The Enthusiasts
“We use it. Some of the team are really into it. We use ChatGPT a lot, some people use Cursor, we're experimenting with Copilot.”
This is the most common profile we encounter. Probably 60% of the companies we assess in 2026. It's better than Level 1, but it carries a false sense of security.
The enthusiast company uses AI tools, but without structure. Adoption is uneven: two engineers use Cursor every day and are measurably faster; three others use it occasionally for autocomplete; two more have tried it once and reverted to their existing workflow. There's no shared playbook, no measurement of impact, no shared toolchain, and no cultural expectation that AI usage is a professional responsibility.
The result: the company gets maybe 10 to 20% of the productivity benefit that's available to them, distributed unevenly across the team. Meanwhile, they believe they're “doing AI.”
There is a subtler cost that rarely gets measured: interruption. At Level 2, AI is used reactively. A developer hits a problem, opens a chat window, types a question, waits, reads the response, copies a snippet, switches back. Each interaction is a context switch. Each context switch breaks the state of deep work that senior engineers need to solve hard problems. Research on developer productivity is consistent on this point: the cost of an interruption is not the time the interruption itself takes; it is the 15 to 20 minutes it takes to rebuild concentration afterwards.
Level 2 companies are using AI in the most expensive way possible: as an async chat tool that fragments attention rather than as an integrated system that runs alongside the developer, invisibly, in their flow. The engineers who get the most out of AI at this stage are the ones who have figured out, on their own, how to keep AI inside their flow state rather than constantly stepping outside of it. That knowledge stays with them. It never becomes a team capability.
What distinguishes enthusiasts from the next level is not the tools they have. It's that they've never designed their workflows around AI. They're using AI to do individual tasks faster. They haven't yet asked: what entire categories of work should AI own end-to-end, without requiring a developer to context-switch at all?
Level 3: The AI Agent Operators
“We've identified the recurring, high-volume tasks in our workflow and automated them with agents. We have a playbook. We measure the impact.”
This is where things get genuinely interesting, and where the productivity gains stop being incremental and start becoming structural. Level 3 companies have made a deliberate decision to identify classes of work that AI agents can own, and have built systems around that ownership.
In engineering teams, this looks like: agents that automatically generate first drafts of tests for every PR, agents that write and maintain documentation, agents that triage incoming bug reports and assign severity, agents that review code against internal style guidelines before a human ever sees the PR.
In product teams, it might look like: agents that synthesize user feedback from multiple channels into weekly product insights, agents that generate and A/B test copy variants, agents that monitor competitors and surface relevant changes.
In operations: agents that handle first-line customer support resolution, agents that generate weekly reporting packages, agents that monitor infrastructure and create incident tickets with context already populated.
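To make the first of these concrete, here is a minimal sketch of a test-drafting agent. It is illustrative, not a production system: the model name, the prompt, and the CI wiring are all assumptions, and a human still reviews whatever it produces.

```python
# pr_test_drafter.py: illustrative sketch only. Drafts pytest tests for a
# PR diff so that human review starts from something rather than from zero.
import subprocess

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def draft_tests_for_diff(diff: str, style_guide: str) -> str:
    """Ask the model for a first-draft test file covering the changed behaviour."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: use whatever model your team standardises on
        messages=[
            {
                "role": "system",
                "content": "You are a test engineer. Given a unified diff, "
                           "write pytest tests for the changed behaviour. "
                           f"Follow this style guide:\n{style_guide}",
            },
            {"role": "user", "content": diff},
        ],
    )
    return response.choices[0].message.content


if __name__ == "__main__":
    # In CI the diff would come from the PR event; locally we take it from git.
    diff = subprocess.run(
        ["git", "diff", "origin/main...HEAD"], capture_output=True, text=True
    ).stdout
    print(draft_tests_for_diff(diff, "Arrange-Act-Assert, one behaviour per test."))
```

The interesting part is not the prompt. It is that this runs on every PR without anyone opening a chat window.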
The key characteristics of Level 3 companies:
- Clear ownership. Specific agents have defined responsibilities. The team knows what the agent handles and what requires human judgment.
- Measurement. They track the before and after. They know their AI-enabled test coverage, their support ticket resolution rate, their documentation freshness. AI impact is a number, not a feeling.
- A written playbook. New engineers can onboard to the AI workflow in a day. It's not in anyone's head.
- A culture of iteration. When an agent doesn't perform, the team improves the prompt, the context, or the toolchain, just as they would debug code.
In our experience, Level 3 companies see productivity gains well above 5x in the areas they've automated, and in some cases closer to 10x for tasks that were previously pure repetition. The multiplier depends on the task, but it is consistently higher than most teams expect before they measure it. Critically, they've also freed their senior engineers to focus on the work that actually requires senior judgment: architecture, product decisions, customer conversations.
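In its simplest form, "AI impact is a number, not a feeling" can look like the toy calculation below. The numbers are made up for illustration; a real team would pull cycle times from their issue tracker or CI system.

```python
# Toy example with illustrative numbers: turning agent impact into a metric.
from statistics import median

# PR cycle times in hours, before and after agents took over test drafting
# and first-pass review (numbers invented purely for illustration).
before_hours = [30.0, 41.5, 26.0, 55.0, 38.0]
after_hours = [9.0, 14.5, 11.0, 21.0, 12.5]

speedup = median(before_hours) / median(after_hours)
print(f"Median PR cycle time: {median(before_hours):.1f}h -> "
      f"{median(after_hours):.1f}h ({speedup:.1f}x)")
```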
Level 4: The AI Operating System
“We don't think in terms of workflows anymore. We think in terms of agents, roles, and orchestration. The company is an OS. Humans define strategy and review exceptions. Agents do the rest.”
Very few companies have reached this level. The ones that have are building something qualitatively different. Not just a faster version of a traditional company, but a new kind of organisation.
At Level 4, AI agents are first-class members of the organisation. The org chart includes agent roles with defined responsibilities, escalation paths, and performance criteria, just like human roles. There are agents responsible for marketing research, agents responsible for competitive intelligence, agents responsible for first-pass sales outreach qualification, agents responsible for financial reporting preparation.
The infrastructure layer becomes critical here, and it is where Level 4 companies invest seriously:
- Trigger.dev handles background jobs and agent orchestration at scale.
- OpenAI Symphony, LangGraph, and CrewAI coordinate multi-agent pipelines with stateful, branching logic.
- Langfuse provides full observability over every agent run: traces, latency, cost, and quality scoring, so the team can debug and improve agent behaviour the same way they would any production system.
- Linear becomes the connective tissue for product work, with agents creating, triaging, and closing issues autonomously as part of the development loop.
- Emerging platforms like Hermes and PaperClip, purpose-built for managing fleets of AI agents in a business context, form the organisational layer on top.

Humans are no longer the default executor of recurring work. They are the strategists and the system designers.
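For a flavour of what the orchestration layer looks like in code, here is a minimal LangGraph sketch of one agent pipeline: an incident-triage graph. The node logic is stubbed and the routing rule is an assumption; a real deployment would call models and tools inside the nodes and wrap every run in Langfuse tracing.

```python
# Minimal multi-agent pipeline sketch using LangGraph's Python API.
# Node bodies are stubs standing in for real model and tool calls.
from typing import TypedDict

from langgraph.graph import END, StateGraph


class TicketState(TypedDict):
    report: str
    severity: str
    resolution: str


def triage(state: TicketState) -> dict:
    # Stub: a real node would classify severity with a model call.
    sev = "high" if "outage" in state["report"].lower() else "low"
    return {"severity": sev}


def auto_resolve(state: TicketState) -> dict:
    return {"resolution": "agent-resolved: known issue, fix linked"}


def escalate(state: TicketState) -> dict:
    return {"resolution": "escalated to on-call human"}


graph = StateGraph(TicketState)
graph.add_node("triage", triage)
graph.add_node("auto_resolve", auto_resolve)
graph.add_node("escalate", escalate)
graph.set_entry_point("triage")
graph.add_conditional_edges(
    "triage",
    lambda s: s["severity"],  # route on the severity the triage node assigned
    {"low": "auto_resolve", "high": "escalate"},
)
graph.add_edge("auto_resolve", END)
graph.add_edge("escalate", END)

app = graph.compile()
print(app.invoke({"report": "Login outage for EU users", "severity": "", "resolution": ""}))
```

The point of the graph structure is the escalation path: the agent owns the routine branch, and the human only sees the exceptions.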
The productivity ceiling at Level 4 is not 5x or 10x. It is genuinely open-ended. A Level 4 company with 10 people can operate with the surface area of a 100-person company, because the agents handle the volume while the humans handle the judgment.
Level 4 is not science fiction. We're building it ourselves, right now. Take TechSignal, our AI-powered engineering signal tool for investors and founders: conceived, designed, and shipped by a tiny team using exactly this model. The agents do the data gathering, the analysis, the formatting. Humans set the direction and review the output.
There will be a before and an after 2026
We're at an inflection point that will be talked about the way people talk about the shift to cloud computing or the mobile revolution, except it's moving faster.
In 2024, AI in enterprise was mostly about productivity experiments and pilot projects. In 2025, it became table stakes for engineering teams. In 2026, the companies that have built AI-native workflows are beginning to pull ahead in ways that are visible in the numbers: faster release cycles, lower cost per feature shipped, higher employee leverage ratios.
The companies still at Level 1 are not just missing productivity gains. They are accumulating a structural disadvantage that gets harder to close every month. Their competitors are shipping faster. Their burn-to-output ratio looks worse to investors. Their engineers are watching peers at AI-native companies do more interesting work with less friction.
The window to close this gap is not infinite. The teams that are currently at Level 3 are already designing the Level 4 systems that will define the next two years. The gap between Level 1 and Level 4 will be very difficult to close once it's structural.
What this looks like in a Tech Due Diligence
In every engagement, we now assess AI maturity as part of the engineering evaluation. The signals we look for:
- What percentage of engineers use AI coding tools daily, not occasionally?
- Does the company have a documented AI workflow? Or is it left to individual discretion?
- Are there AI-resistant engineers whose posture could slow adoption as the company scales?
- Has the CTO or technical lead articulated a roadmap for moving from their current level to the next?
- Are there specific agent automations already in production, not pilots or experiments, but live systems handling real work?
- How does the team measure AI productivity impact? If they can't, they're probably at Level 1 or 2.
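Rolled together, these signals map to a rough maturity level. The sketch below is a toy illustration of that mapping, emphatically not our actual scoring rubric; the thresholds and the choice of signals are assumptions.

```python
# Toy maturity scoring: thresholds and signal choices are assumptions,
# not the actual DD rubric.
def maturity_level(daily_ai_users_pct: float,
                   documented_workflow: bool,
                   agents_in_production: int,
                   impact_is_measured: bool,
                   agents_on_org_chart: bool) -> int:
    if agents_on_org_chart and agents_in_production and impact_is_measured:
        return 4  # agents are organisational roles, not just tools
    if agents_in_production and impact_is_measured and documented_workflow:
        return 3  # live automations, measured, written down
    if daily_ai_users_pct >= 0.3:
        return 2  # real usage, but unstructured
    return 1


print(maturity_level(0.8, True, 6, True, False))    # -> 3
print(maturity_level(0.4, False, 0, False, False))  # -> 2
```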
A company at Level 3 with a clear roadmap to Level 4 is a fundamentally different asset than a Level 1 company with double the headcount. Investors who aren't measuring this are leaving signal on the table.
From Sceptic to Operating System: a real example
In January 2026, we ran a Tech Due Diligence on a scale-up where virtually the entire engineering team was AI-sceptic. Including the CTO. They had built a real product with real customers, but they had done it entirely the old way, and they were proud of it.
The DD surfaced significant technical debt. Our estimate, without AI tooling, was six months of careful refactoring work to bring the codebase to the standard the incoming investor expected. A long timeline, real execution risk, and a condition that could have killed the deal.
Our report was positive, but with a clear condition: the technical debt had to be addressed using AI agents, not the traditional approach. Specifically, we recommended a combined strategy of AI-assisted refactoring and automated test generation, so that the team could move fast without destabilising a product that customers were actively depending on. The tests would catch regressions in real time; the agents would do the heavy lifting on the refactoring itself.
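The shape of that strategy is simple enough to sketch. Assuming hypothetical agent callables for test generation, refactoring, and rollback (the real agents are considerably more involved), the loop looks like this:

```python
# Sketch of the safety net behind "agents refactor, tests catch regressions".
# The generate_tests, refactor, and revert callables are hypothetical
# stand-ins for AI agents; pytest is the real regression gate.
import subprocess


def tests_pass() -> bool:
    """The generated test suite is the regression gate."""
    return subprocess.run(["pytest", "-q"]).returncode == 0


def refactor_with_safety_net(modules, generate_tests, refactor, revert):
    """Pin current behaviour with tests, let the agent refactor,
    and roll back any module where the suite goes red."""
    for module in modules:
        generate_tests(module)  # agent drafts characterisation tests first
        if not tests_pass():
            raise RuntimeError(f"baseline red for {module}: fix before refactoring")
        refactor(module)        # agent does the heavy lifting
        if not tests_pass():
            revert(module)      # regression caught in real time, change rolled back
```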
The team followed the recommendations. They completed the refactoring in three weeks.
Six months of estimated work. Three weeks of actual work. The only variable was the decision to use AI agents properly, with a clear system behind them.
The transformation that followed was just as striking as the technical result. A team that had been dismissive of AI tools became genuinely enthusiastic, because they had experienced the difference firsthand rather than being told about it. We had the pleasure of working with them on their AI strategy in the months that followed. Today, they are implementing a clear Operating System with autonomous agents. They moved from Level 1 to Level 4 in under six months.
This is what the journey looks like in practice. It is not a multi-year transformation programme. It is a decision, followed by the right system, followed by execution.
How we help
At Above The Clouds, we work with companies across the entire maturity spectrum. Moving from Level 1 to Level 2 requires a change in posture and a hands-on introduction to the tools. Moving from Level 2 to Level 3 requires workflow design, measurement frameworks, and team enablement. Moving from Level 3 to Level 4 requires thinking about your organisation as an orchestration system and designing the agent architecture to match.
We're not theorists here. We're bootstrapping multiple AI-native products ourselves, including TechSignal, built on an AI agent OS, and we bring that operational experience into every engagement. We know what works in production, not just on a whiteboard.
If you're a founder who knows you need to move faster on this, or an investor who wants AI maturity assessed as part of a DD, we start with an AI Readiness Audit. Two days. A clear picture of where you are, where you need to be, and what it will take to get there.
Above The Clouds runs Tech Due Diligences and AI Transformation programmes for scale-ups and investors across Europe. We assess AI maturity as a core element of every engagement. Get in touch to discuss your company or portfolio.