Ask HN: A proposal for interviewing "AI-Augmented" Engineers
3 points
13 hours ago
| 1 comment
Hi HN,

I’m currently rethinking our hiring process. Like many of you, I feel that traditional algorithmic tests (LeetCode style) are losing relevance now that LLMs can solve them instantly. Prohibiting AI during interviews also feels counterproductive; I want to hire engineers who know how to use these tools effectively to multiply their output.

I am designing a new evaluation framework based on real-world open-source work, and I would love the community’s feedback on whether this sounds fair, effective, or if I’m missing something critical.

The Core Philosophy: We shouldn't test whether a candidate can write syntax better than an AI. We should test whether they can guide, debug, and improve an AI's output to handle the "last mile" of complex engineering.

The Proposed Process:

1. Task Selection (Real-World Context). Instead of synthetic puzzles, we select open issues or discussions from public GitHub repositories that share a tech stack with our product.

    Scope: 2–4 hours.

    Types: Implementing a feature based on a discussion, fixing a bug, or reviewing a PR (specifically one that was eventually rejected, to test "taste").

    Ambiguity: Adjusted for seniority. Junior roles get clear specs; senior roles get vague problem statements requiring architectural decisions.
2. Establishing the "AI Baseline". Before giving the task to a candidate, we run it through current SOTA models with minimal human intervention (a rough tooling sketch follows this list).

    The Filter: If the AI solves it perfectly on the first try, we discard the task.

    The Sweet Spot: We are looking for tasks where the AI gets roughly 80% of the way there but fails on edge cases, context integration, or complex logic; the setup should be neither trivial nor intractable.
3. The Candidate Test. Candidates are required to use their preferred AI coding tools. We ask them to submit not just the code, but their chat/prompt history.
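
For anyone who wants to mechanize steps 1 and 2, here is a minimal sketch under loud assumptions: the model name, the prompt shape, the pytest command, and the expectation that the model emits a clean unified diff are all placeholders to adapt, not our production tooling.

    # Sketch: pull candidate tasks from GitHub, run one low-effort model
    # pass, and keep only the tasks the baseline partially fails.
    import subprocess
    import requests
    from openai import OpenAI

    def find_candidate_issues(repo: str, label: str = "bug") -> list[dict]:
        """Open issues from a public repo via the GitHub search API."""
        resp = requests.get(
            "https://api.github.com/search/issues",
            params={"q": f"repo:{repo} is:issue is:open label:{label}"},
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["items"]

    def baseline_attempt(issue_body: str, repo_context: str) -> str:
        """One unassisted model pass: no iteration, no human steering."""
        client = OpenAI()  # assumes OPENAI_API_KEY is set
        completion = client.chat.completions.create(
            model="gpt-4o",  # placeholder; swap in whatever SOTA model you benchmark
            messages=[
                {"role": "system", "content": "Reply with a unified diff that resolves the issue."},
                {"role": "user", "content": f"{repo_context}\n\nIssue:\n{issue_body}"},
            ],
        )
        return completion.choices[0].message.content

    def classify_task(patch: str, repo_dir: str) -> str:
        """Discard tasks the baseline nails; keep the 'sweet spot'."""
        applied = subprocess.run(
            ["git", "apply", "-"], input=patch.encode(), cwd=repo_dir,
        )
        if applied.returncode != 0:
            return "keep? baseline can't even apply cleanly (may be too hard)"
        tests = subprocess.run(["pytest", "-q"], cwd=repo_dir)  # placeholder test command
        subprocess.run(["git", "checkout", "--", "."], cwd=repo_dir)  # reset the tree
        if tests.returncode == 0:
            return "discard: baseline solves it on the first try"
        return "keep: partial failure, i.e. the sweet spot"

In practice the model output usually needs cleanup before `git apply` accepts it; the point is just to make the "does the AI one-shot this?" filter repeatable rather than vibes-based.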

How We Evaluate (The "AI Delta"):

We aren't just looking at the final code. We analyze the "diff" between the Candidate’s process and our "AI Baseline":

    1. Exploration Strategy: How does the candidate "load context"? Do they blindly paste errors, or do they guide the AI to understand the repository structure first? We look for a clear understanding of the existing codebase.

    2. Engineering Rigor (TDD): Does the candidate push the AI to generate a test plan or reproduction script before generating the fix? We value candidates who treat the AI as a junior partner that needs verification.

    3. The "Last 10%" (Edge Cases): Since we picked tasks where AI fails slightly, we look at how the candidate handles those failure modes. Can they spot the boundary conditions and logic errors that the LLM glossed over?

    4. Documentation Hygiene: We specifically check if the candidate instructs the AI to search existing documentation and—crucially—if they prompt the AI to update the docs to reflect the new changes.

    5. Engineering Taste (The Rejected PR): For the code review task, we ask them to analyze a PR that was rejected in the real world (without telling them so). We want to see whether their reasoning for rejecting it aligns with our team’s engineering culture (maintainability, complexity, clarity, etc.).
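
To make the comparison concrete, here is a small sketch of surfacing part of that delta mechanically, assuming you keep the baseline attempt and the candidate's submission on branches (the `ai-baseline` and `candidate` branch names and the filename heuristics are invented for illustration). The real signal is still in the chat history; this only flags where to look first.

    # Sketch: compare what the candidate touched vs. what the baseline touched.
    import subprocess

    def diff_stats(repo_dir: str, base: str, branch: str) -> dict[str, int]:
        """File-level summary of a branch relative to a base branch."""
        files = subprocess.run(
            ["git", "diff", "--name-only", f"{base}...{branch}"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        ).stdout.splitlines()
        return {
            "files_touched": len(files),
            "test_files": sum(1 for f in files if "test" in f.lower()),
            "doc_files": sum(1 for f in files if f.endswith((".md", ".rst"))),
        }

    def print_ai_delta(repo_dir: str) -> None:
        baseline = diff_stats(repo_dir, "main", "ai-baseline")  # assumed branch name
        candidate = diff_stats(repo_dir, "main", "candidate")   # assumed branch name
        for key in baseline:
            print(f"{key:14} baseline={baseline[key]:3} candidate={candidate[key]:3}")

If the candidate's branch adds test and doc files the baseline never produced, that lines up with criteria 2 and 4 above; identical counts mean the chat history deserves a much closer read.
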
My Questions for HN:

    Is analyzing the "Chat History" too invasive, or is it the best way to see their thought process in 2026?

    For those of you hiring now, how do you distinguish between a "prompt kiddie" and a senior engineer who is just very good at prompting?

    Does the 2–4 hour time commitment feel reasonable for a "take-home" if the tooling makes the actual coding faster?
Thanks for your insights!

(Full disclosure: In the spirit of this topic, this post was composed by AI based on my draft notes.)

raw_anon_1111
2 hours ago
I interview like I always interview - behaviorally.

I filter for “smart and gets things done” (Joel Spolsky circa 2001).

“tell me about the project that you are most proud of” and then we talk about the architecture, tradeoffs, technical and business complexities, etc.

“I see you’ve been working for $x years. I’m sure there is a project you look back on, knowing what you know now, and cringe. Tell me about the project and what you would do differently?”

There are a few other questions. But I am usually also trying to measure soft skills and what level of scope and ambiguity they are comfortable with. The last thing I’ve ever needed when I am looking to hire is another “ticket taker”.

Even before AI, why would I ever hire a junior dev? They are practically useless, do negative work, and it’s easy enough to poach someone with experience from another company for only slightly more money if you’re paying standard enterprise dev wages.
