The job interview is a prediction mechanism. It exists to answer one question: is this person likely to perform well in this role? That question is rational, and the tools developed to answer it were, for most of the twentieth century, reasonably well-calibrated. Test for the skills that have historically produced value, and you can make defensible predictions about who will produce value next. The problem is that technological intelligence (TI) is crossing domain thresholds faster than hiring practices update, so the interview increasingly measures skills that are no longer the bottleneck while failing to measure the ones that are.
What the Interview Was Designed to Do
The logic was sound. Legal document review rewarded speed and accuracy in reading thousands of pages for relevant material. Law firms tested for reading comprehension, attention to detail, stamina under volume. Medical diagnosis rewarded pattern recognition across symptom clusters and imaging data. Board exams tested for that pattern recognition at scale. Code generation rewarded facility with syntax, algorithms, and language-specific idiom. Coding tests tested for exactly that. The interview was downstream of a theory of value: here is what this work requires, here is how we measure it. The credential came first, then the interview, and together they sorted candidates by demonstrated proximity to the required skill.
This was a reasonable system. It worked tolerably well when the required skill was stable. It begins to fail when the required skill changes faster than either the credential or the interview can track.
The AlphaFold Pattern
When TI crosses a threshold in a domain, the bottleneck shifts. AlphaFold didn’t just accelerate protein structure prediction — it moved the constraint. Before, the binding bottleneck was computational prediction itself. After, the binding bottleneck became understanding what to do with the predictions: which ones to trust, which edge cases the model handles poorly, which experimental validations are still necessary. The work changed. The people best positioned to do post-AlphaFold biology were not necessarily the ones who were best at pre-AlphaFold structure prediction.
The same dynamic is active in law, medicine, and software. E-discovery TI systems now complete in 72 hours document review volumes that previously took months of associate time. Radiology TI is approaching radiologist-level accuracy on standard reads. Code generation tools are producing working first drafts at a pace that changes what a software engineer’s hour is worth. In each case, the bottleneck has shifted. The interview asks for mastery of the old skill. The actual work increasingly requires something different: the capacity to supervise TI’s output, catch its characteristic errors, and ask the questions the system cannot formulate for itself. Those are not the same skill. Most hiring processes have not updated.
The Cognitive Offloading Dynamic
The skills being interviewed for are precisely the ones being offloaded to TI. This is not coincidence; it is a structural feature of how cognitive offloading works. We interview for skills that were recently proven to matter. TI absorbs the skills that matter most at volume and speed. The lag between those two events is where the mismatch lives.
A legal associate is still being evaluated on document review fluency that TI handles more reliably than any individual associate could. A radiology resident is still board-certified on image reads that TI systems are rapidly matching. A software engineer candidate is still whiteboarding data structure implementations that code generation tools produce on demand. The interview tests for the cognitive function that is being externalized. The gap between what is tested and what is needed is widening each year — not because the candidates are less capable, but because the calibration mechanism hasn’t moved.
The Credentialing Problem
The deeper problem is institutional. Credentialing systems — law school, medical residency, software bootcamps — are training people for the pre-threshold version of the work. The institutions doing the training have weak incentives to update, because their business model is tied to the credential rather than the capability. A credential is a claim about historical value. It says: this person demonstrated the skills that were worth demonstrating at the time this program was designed. It says nothing about whether those skills transfer to post-crossing work.
Law schools are not primarily in the business of producing lawyers who can supervise TI document review and catch hallucinations in generated contracts. They are in the business of issuing credentials that are legible to firms who have organized their hiring around those credentials. The credential and the interview form a closed system that is coherent internally and increasingly disconnected from the work it nominally predicts.
This is not a failure of individual actors. No single law school dean, no single hiring partner, decided to let this drift happen. It is a structural feature of institutions that are optimized for credential production in an era when the value of those credentials is shifting faster than the institutions can track.
The Survival Priority Sorting Event
The interview is the mechanism through which survival priority operates at the individual level. It is the sorting event. The entire apparatus (the years of credential acquisition, the preparation, the cost, the interview itself) is organized around competing successfully for a scarce supply of stable, well-compensated roles. The stakes are genuine. The person who fails the interview doesn’t just miss a job; they fall behind in a system where falling behind has compounding consequences.
That is survival priority made concrete: a system in which people must prove their value by the metrics of an institution calibrated to a version of the work that TI is actively absorbing. The system is not malicious. It is inertial. It is doing what it was designed to do, with the design lagging the reality by a gap that grows slightly wider each year.
What the Right Test Would Measure
The honest version of the interview — the one calibrated to where the work is actually going — would not ask “can you perform the task?” It would ask: can this person work alongside the TI system that performs the task, supervise its output, and do the things it cannot? That means knowing what the system gets wrong, how errors present, when to trust the output, and what questions only a sapient biological intelligence can formulate. It means having enough domain knowledge to catch a confident-sounding wrong answer, and enough epistemic humility to know when to verify.
That skill is harder to test for. It takes longer to develop. It requires domain depth and something harder to name: a kind of critical distance from the tool, an ability to hold the system’s output at arm’s length even when it sounds authoritative. It does not fit neatly into a 45-minute structured interview. Maximization as a framework would say that the development goal is not producing people who can pass the old test but producing people who can do the new work, and that the two are not the same.
The Gap
When the interview and the actual work have quietly diverged, and neither the candidate who prepared for years, nor the institution that trained them, nor the hiring system that evaluates them has fully noticed — someone pays for that gap. The credential was expensive. The preparation was real. The skills are genuine. And yet the sorting mechanism is increasingly testing for the cognitive function that TI is absorbing while the function that actually matters goes unmeasured.
The question is not whether that gap will close. Competitive pressure will eventually force calibration. The question is who absorbs the cost of the lag — and for how long.