The Best AI Engineers Are Engineers First

We've been hiring at Arkanis and helping clients build their founding engineering teams. Here's the pattern.

A quick scope note. This is about AI engineer roles at growth-stage product companies that build features on top of frontier models and ship them to real users. It's not about research scientists working on autonomous driving, robotics, or foundation models. Different work, different hire, different post.

Engineering hiring rigor evaporates the moment "AI" appears in the title. Companies that would never hire a backend engineer on pedigree alone do exactly that for AI roles, then wonder six months in why nothing ships past the demo.

Four traits, in order

The order below matters. Trait one is the foundation: the rest only emerge in someone who already has it.

1. Strong engineer first.

Years operating production systems that handle real traffic. Owned bugfixes. Lived through incidents. Knows what production actually demands: observability, cost monitoring, graceful degradation, idempotency. The boring stuff that keeps software working when nobody is watching.

Folks with engineering maturity use AI tools effectively without outsourcing their thinking entirely to the LLM.

Interview tell: pair program on a problem where the candidate can use AI assistance, and watch how they spec it out. If they don't know where to look for failure cases, it's a red flag.

2. Knows where frontier models fail.

The hire you want has read the same hype cycle posts you have and is unmoved by them. They know the gap between what the vendor marketing says and what the model does in production.

Interview tell: ask them to name a frontier model limitation that contradicts the marketing.

3. Keeps it simple, ships, monitors, acts.

The pattern: reach for the smallest design that handles the known requirements and the failure modes you can reasonably predict. A single LLM call with a structured output schema. Retries. An eval on yesterday's traffic. A feature flag. A dashboard. Five things, none skippable, built day one because the senior engineer doesn't think of them as optional.

The trap is that "simple" gets used to justify the opposite. No retries because the happy path works. No offline eval because we'll add it later. No A/B testing on live traffic because shipping big changes faster feels like progress. Same words, very different systems.
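To make the five pieces concrete, here is a minimal sketch of what that day-one design might look like. It is an illustration under stated assumptions, not a reference implementation: the triage use case, the schema in REQUIRED_FIELDS, the Metrics counters, and the call_model callable (a stand-in for whatever provider client you actually use) are all hypothetical names invented for this example.

```python
import json
import time
from dataclasses import dataclass
from typing import Callable

# Hypothetical schema for illustration: fields a ticket-triage feature might need.
REQUIRED_FIELDS = {"category": str, "confidence": float}
FALLBACK = {"category": "unknown", "confidence": 0.0}


@dataclass
class Metrics:
    """Stand-in for the counters that would feed the dashboard."""
    calls: int = 0
    failures: int = 0
    fallbacks: int = 0


def parse_structured(raw: str) -> dict:
    """Check the model output against the schema; raise ValueError if it does not fit."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected_type):
            raise ValueError(f"field {name!r} missing or wrong type")
    return data


def classify(
    text: str,
    call_model: Callable[[str], str],  # assumption: wraps your real LLM client
    metrics: Metrics,
    flag_enabled: bool = True,
    max_attempts: int = 3,
    backoff_s: float = 0.0,
) -> dict:
    """One LLM call, validated, retried, behind a flag, with counters."""
    if not flag_enabled:  # feature flag: degrade to the fallback, never crash
        metrics.fallbacks += 1
        return dict(FALLBACK)
    metrics.calls += 1
    for attempt in range(max_attempts):
        try:
            return parse_structured(call_model(text))
        except ValueError:
            metrics.failures += 1
            time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
    metrics.fallbacks += 1
    return dict(FALLBACK)


def offline_eval(samples: list[tuple[str, str]], call_model: Callable[[str], str]) -> float:
    """Fraction of logged (text, expected_category) pairs the feature gets right."""
    if not samples:
        return 0.0
    metrics = Metrics()
    hits = sum(
        classify(text, call_model, metrics)["category"] == expected
        for text, expected in samples
    )
    return hits / len(samples)
```

Running offline_eval over a labeled sample of yesterday's logged requests gives the eval; the Metrics counters are what the dashboard would chart. The point of the sketch is how little code the non-skippable parts take, not the specific schema or retry policy.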

4. Treats measurement as the deliverable.

The hire you want ships an experiment, and it becomes a feature only when it earns its place. They pick a business metric, design the smallest variant that could move it, run it, and report back in business terms. "Ticket deflection moved from 22% to 27%, here's what it costs in inference, here's the regression we caught in eval that still needs work."

That conversation is what makes AI investment manageable from the exec seat. Demo builders rarely have a number to show, which is why the conversation about their work always ends up being about vibes.
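As a rough sketch of the arithmetic behind a readout like the ticket deflection one, the snippet below turns raw experiment counts into the numbers an exec would ask for. The function name, the sample sizes, and the inference cost figure are hypothetical, and the two-proportion z-test is one common way to check that a lift is not noise, not necessarily the method any given team uses.

```python
import math
from typing import Optional


def summarize_experiment(
    control_deflected: int,
    control_total: int,
    variant_deflected: int,
    variant_total: int,
    variant_inference_cost: float,  # assumption: total spend on the variant arm
) -> dict:
    """Report lift, significance, and cost per extra deflected ticket."""
    p_control = control_deflected / control_total
    p_variant = variant_deflected / variant_total
    lift = p_variant - p_control

    # Two-proportion z-test with a pooled standard error.
    pooled = (control_deflected + variant_deflected) / (control_total + variant_total)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_total + 1 / variant_total))
    z = lift / se if se > 0 else 0.0
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided

    # Tickets deflected in the variant arm beyond what the control rate predicts.
    extra = lift * variant_total
    cost_per_extra: Optional[float] = variant_inference_cost / extra if extra > 0 else None

    return {
        "control_rate": p_control,
        "variant_rate": p_variant,
        "absolute_lift": lift,
        "relative_lift": lift / p_control if p_control > 0 else None,
        "p_value": p_value,
        "cost_per_extra_deflection": cost_per_extra,
    }


# Hypothetical counts chosen to match the 22% and 27% rates quoted above.
report = summarize_experiment(440, 2000, 540, 2000, variant_inference_cost=300.0)
```

With these made-up counts the absolute lift is five percentage points, and the cost figure is simply inference spend divided by the tickets that would not have been deflected at the control rate. Real traffic volumes and real costs would change every number here.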

The thread

Read as a checklist this is obvious. Strong engineer, model literacy, ships simply, measures everything.

Almost nobody hires this way, and part of the reason is structural. Legacy job boards and matching tools run on keyword-heavy search. Hype beasts bubble to the top because their resumes are stuffed with the right vocabulary. The matching tools are configured to surface that profile, so the top K candidates that land on your desk have been filtered for it before you see them.

Plenty of diamonds in the rough sit a few pages deeper: people who aren't asking $250–300K for a senior role where they'll mostly talk your ear off about AI hype. The talent is there.