
“We stopped asking if candidates use AI. We started asking what they catch.”

A VP of Engineering on hiring in an AI-native world, the productivity metrics that mislead, and the organisational question nobody is asking loudly enough.

Question

Tell us about your vantage point — what gives you the perspective you have on this?

I’ve spent the last twelve years in engineering leadership across two scale-ups and one enterprise. Currently I’m VP of Engineering at a mid-sized product company — around 60 engineers, three product squads, distributed across Europe. We’ve been intentional about AI adoption for the last two years, which means I’ve watched the full cycle: the enthusiasm, the disappointment, the recalibration, and the workflows that have actually stuck. I’m also close enough to hiring to have changed our process significantly as a result of what we’ve seen.

Question

How has your definition of a strong engineering hire changed?

The honest answer is that it has shifted from what people can produce to what they can verify.

Eighteen months ago we updated our technical assessment process. We stopped giving candidates blank-slate coding problems and started giving them something different: a working but flawed codebase — the kind of thing an LLM might produce — and asking them to audit it. Find the architectural risks, identify the anti-patterns, explain what you’d change and why.

The signal we were looking for changed completely. We stopped measuring generative capacity and started measuring review capability. Because that is the actual bottleneck now. The organisations that win in 2026 will be those that successfully transition their teams from code generators to system verifiers. You do not need fewer engineers — you need engineers with a fundamentally different operating model.

The candidates who performed best on that assessment weren’t always the ones with the longest AI tool lists on their CV. They were the ones who could tell you, specifically, why a piece of AI-generated code was going to fail in production — and what they’d do about it.

Question

What does a genuinely AI-native engineering team look like — versus where most teams actually are?

There’s a wide gap, and I think most leaders are underestimating it.

The teams I’d call genuinely AI-native have done something structural, not just cultural. They’ve redesigned how work gets planned and reviewed — not just what tools people use at their individual workstation. The planning layer has changed: clearer specs, more explicit interface definitions, tighter scope before a single line is written. The review layer has changed: more rigorous, not less, because the volume of output has increased and the subtle errors are harder to spot when code arrives quickly and looks confident.

Most teams haven’t done that. They’ve given people access to tools, seen some individual productivity gains, and called it adoption. Forrester’s 2025 report found that 68% of AI projects stall at the integration layer, not the model layer. I’d say something similar applies at the team level — most of the value is getting lost in the coordination and review layer, not the code generation layer.

The gap between “we use AI tools” and “we are an AI-native team” is probably two to three years of deliberate habit change. Most organisations are at month six and thinking they’re done.

Question

Where are engineering leaders most likely to get this wrong?

Three places, consistently.

The first is measurement. Leaders default to measuring what AI tooling does to individual output — lines of code, tickets closed, time-to-merge. Those numbers go up. What also goes up, often invisibly, is review time and the rate of subtle defects that pass initial testing. You’re competing for scarce people and measuring the wrong things about the ones you have.

The second is the hiring filter. Most job descriptions now say “experience with AI tools” as a checkbox. That screens for awareness, not capability. A developer who has tried Copilot and a developer who has rebuilt their entire planning and verification workflow around AI tooling are not the same hire. The interview process rarely distinguishes between them.

The third is what I’d call the confidence problem. AI tools produce output that looks authoritative. Developers — especially less experienced ones — are susceptible to trusting that confidence. The teams that have navigated this well have built explicit norms around scepticism: treat AI output the way you’d treat code from someone you don’t know yet. Review it. Don’t skip that step because it arrived quickly.

Question

What’s your honest read on the AI productivity evidence?

Complicated, and I’m sceptical of the headline numbers.

The individual-level productivity gains from controlled studies are real: a figure of roughly 20–30% on specific coding tasks appears consistently across the literature I’ve seen. What I haven’t seen demonstrated convincingly is that those gains translate cleanly to team-level output improvements. Writing code faster is one constraint in a system with many constraints. Removing one bottleneck tends to surface the next one.
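To make the bottleneck point concrete, here is a minimal sketch that treats delivery as a serial pipeline whose throughput is capped by its slowest stage. The stage names and weekly capacities are hypothetical numbers chosen for illustration, not figures from the interview; only the 25% coding speed-up is taken from the 20–30% range quoted above.

```python
def team_throughput(stage_capacity):
    """Throughput of a serial pipeline is bounded by its slowest stage."""
    return min(stage_capacity.values())


def bottleneck(stage_capacity):
    """Name of the stage that currently limits throughput."""
    return min(stage_capacity, key=stage_capacity.get)


# Hypothetical capacities, in changes per week that each stage can handle.
before = {"coding": 20.0, "review": 22.0, "testing": 30.0}

# Assume AI tooling makes the coding stage 25% faster and nothing else changes.
after = dict(before, coding=before["coding"] * 1.25)

print(bottleneck(before), team_throughput(before))  # coding 20.0
print(bottleneck(after), team_throughput(after))    # review 22.0
```

Under these assumed numbers, a 25% gain in coding speed yields only a 10% gain in team throughput, because review becomes the limiting stage.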

A 2025 Databricks survey found that organisations with a documented build-vs-buy decision framework deployed AI to production 45% faster than those deciding ad hoc. I’d extend that logic to team adoption: organisations that have been deliberate about how they integrate AI — what habits they build, what they measure, what they don’t delegate — are outperforming those who let it happen organically. The discipline matters more than the tool.

The metric I’ve found most useful isn’t speed. It’s the ratio of time spent on decisions that require genuine judgement versus time spent on tasks that don’t. AI-native teams have shifted that ratio. That is the thing worth measuring — and almost nobody is.
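One way such a ratio could be computed from a time log is sketched below. The `TimeEntry` structure, the activity names, and the hours are all hypothetical, and deciding which activities count as requiring judgement is an assumption each team would have to make for itself.

```python
from dataclasses import dataclass


@dataclass
class TimeEntry:
    activity: str
    hours: float
    needs_judgement: bool  # assumption: labelled by the team, not inferred


def judgement_ratio(entries):
    """Hours on judgement work divided by hours on everything else."""
    judgement = sum(e.hours for e in entries if e.needs_judgement)
    routine = sum(e.hours for e in entries if not e.needs_judgement)
    if routine == 0:
        # No routine work logged: the ratio is unbounded (or zero if nothing was logged).
        return float("inf") if judgement > 0 else 0.0
    return judgement / routine


# Hypothetical week for one engineer.
week = [
    TimeEntry("architecture review", 6.0, True),
    TimeEntry("auditing AI-generated changes", 6.0, True),
    TimeEntry("boilerplate and config edits", 16.0, False),
    TimeEntry("manual status reporting", 8.0, False),
]

print(judgement_ratio(week))  # 0.5
```

Tracked over time, a rising ratio would suggest that routine work is being delegated and judgement work is taking a larger share of the week.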

Question

What’s the organisational question that doesn’t get asked enough?

What happens to the early-career pipeline.

There is a significant structural risk emerging in engineering leadership, and it isn’t technical. It is demographic. As AI coding assistants automate foundational tasks — boilerplate generation, unit testing, documentation — the immediate economic justification for hiring junior developers is being challenged. Many organisations are shifting toward a senior-only model, freezing entry-level headcount, or outsourcing the work that used to develop early-career engineers.

That logic makes sense in the short term. It is a serious long-term mistake.

Senior engineers who can manage AI output, verify its quality, and make sound architectural decisions became senior by doing the foundational work first. If we remove the foundational rung from the career ladder, we are cutting off the supply of the exact people we will need in five years. The organisations accelerating past that question today are creating a talent problem they won’t see until it’s expensive.

The answer isn’t to stop using AI. It’s to redefine what early-career development looks like in an AI-native environment. That is a hard organisational design question and most leaders are avoiding it.

Question

How should hiring managers think about AI-native contract developers specifically?

Differently than they used to, and I don’t think most of them have updated yet.

On the positive side: a developer who is genuinely AI-native — who has rebuilt their workflow around planning, generation, and verification — compresses the time-to-contribution window further. They document better because AI makes documentation less painful. They scope work more precisely because they’ve learned that vague inputs produce useless outputs. Those habits benefit your team beyond the individual contribution.

On the risk side: an AI-native developer who hasn’t developed the verification discipline can produce plausible, confident, subtly wrong output at speed. That’s a new failure mode that permanent team members need to be equipped to catch.

What I’d tell hiring managers: don’t ask if a contract candidate uses AI tools. Ask them what AI gets wrong in their specific context. Ask them what they check before they push. Ask them about a time AI output caused them a problem and what they did about it. Those questions surface the capability that actually matters.

Question

What are you watching that others probably aren’t — yet?

The agentic layer, and specifically what it does to the scope of what a single developer can take on.

We are early in the transition from AI-assisted coding to AI-agentic development — where a developer is directing multiple agents working in parallel rather than prompting a single tool. The productivity ceiling in that model is significantly higher than what we’ve seen so far. But the verification challenge scales up with it. You are no longer reviewing one stream of AI output. You are reviewing several, potentially interacting.

The developers who are going to be most valuable in that environment are the ones building orchestration and verification skills now — before the tooling is mature, before there are established patterns to follow. That is a small cohort at the moment. It won’t be small for long.

VP of Engineering

Product company · 60-person distributed engineering team

Interview by Navigaite

This interview was conducted as part of Navigaite’s Real Stories series — honest perspectives from engineering leaders on what AI-native development actually means for teams, hiring, and the way work gets done.
