Jobs are bundles of tasks. AI acts on tasks, not job titles — which is why occupation-level predictions of mass displacement keep missing, and why the operative question is which tasks to hand over, not which jobs to cut.
A compiled, source-verified research digest — every claim cites a downloaded source, every figure is drawn from the data behind it. Not a personal essay.
The dominant frame for AI and work, “which jobs go away,” is the wrong unit of analysis. Labor economics has converged on a task-based view: an occupation is a bundle of distinct tasks, technology substitutes for some and complements others, and the net effect on a job depends on that mix.1 When the unit shifts from job to task, the numbers move dramatically. A whole-job model put 47% of US employment at “high risk” of computerization;9 a task-based model of the same economies put the share of jobs that are highly automatable at roughly 9%.10 Scoring occupations against the 2017 Suitability-for-Machine-Learning rubric showed that variation in automatability lies mostly within occupations, not across them: few if any jobs are fully automatable, so realizing value requires re-bundling tasks, not replacing people.4,5 This paper traces the task-based view from Autor through Acemoglu & Restrepo, the empirical exposure measures, and the demand side (Jevons’ paradox), then offers a decision frame, cost of error by knowledge type, for where to deploy AI versus keep a human in the loop. The canonical cautionary tale, Geoffrey Hinton’s 2016 call to “stop training radiologists,” is examined against what has actually happened to the profession since.15
Public debate about AI and employment almost always runs at the level of the job title: will AI replace radiologists, paralegals, copywriters? The framing is intuitive and almost always misleading, because firms do not buy or automate “jobs” — they automate tasks. The task-based view, formalized by Autor, Levy and Murnane and now standard in labor economics, treats an occupation as a bundle of distinct but interrelated tasks; a technology substitutes for the tasks it can do cheaply and complements the tasks it cannot.1 The net effect on the job is the weighted result of both forces, and it can be positive or negative.4
The single most consequential fact this reframing produces: automatability varies far more within an occupation than across occupations. When Brynjolfsson, Mitchell and Rock scored 18,156 O*NET tasks across 964 occupations, the within-occupation standard deviation of task suitability was about 17% of the mean — the high- and low-suitability tasks sit side by side inside the same job.5 That is why aggregate “job loss” forecasts have a poor track record: they treat a job as automatable if its modal task is, ignoring the residual tasks that keep the human employed and often grow in value as the routine ones are stripped away.1
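The within-versus-across comparison can be made concrete with a small variance decomposition. The sketch below is illustrative only: the occupations and scores are invented (they are not O*NET or SML values), and the function name `variance_decomposition` is hypothetical.

```python
from statistics import mean

def variance_decomposition(scores_by_occupation):
    """Split total task-score variance into within- and between-occupation parts.

    scores_by_occupation: dict mapping occupation name -> list of task scores.
    Returns (within, between, total) as population variances, so that
    within + between equals total (law of total variance).
    """
    all_scores = [s for tasks in scores_by_occupation.values() for s in tasks]
    n = len(all_scores)
    grand_mean = mean(all_scores)

    within = 0.0
    between = 0.0
    for tasks in scores_by_occupation.values():
        occ_mean = mean(tasks)
        # Spread of tasks around their own occupation's mean.
        within += sum((s - occ_mean) ** 2 for s in tasks)
        # Spread of the occupation mean around the grand mean, task-weighted.
        between += len(tasks) * (occ_mean - grand_mean) ** 2

    total = sum((s - grand_mean) ** 2 for s in all_scores)
    return within / n, between / n, total / n

# Toy data: hypothetical suitability scores on a 1-5 scale, NOT real SML values.
toy = {
    "radiologist": [4.5, 2.0, 1.5, 3.5],
    "paralegal": [4.0, 2.5, 1.5, 3.5],
    "copywriter": [4.5, 2.5, 2.0, 3.0],
}
within, between, total = variance_decomposition(toy)
print(f"within={within:.3f} between={between:.3f} total={total:.3f}")
```

In this toy data the occupation averages are nearly identical while the tasks inside each occupation differ widely, which is the pattern the text describes: a job-level average hides most of the variation.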
David Autor’s 2015 essay is the clearest statement of why automation has not “wiped out a majority of jobs.” His central observation is that commentators “tend to overstate the extent of machine substitution for human labor and ignore the strong complementarities between automation and labor that increase productivity, raise earnings, and augment demand for labor.”1 The mechanism is the O-ring production function: when a process is a chain of tasks and any weak link can sink the whole, making one link cheaper and more reliable raises the value of the remaining human links rather than eliminating them.1
The canonical case is the ATM. US bank-teller numbers were widely expected to collapse; instead they rose modestly from roughly 500,000 to 550,000 between 1980 and 2010, even as the number of ATMs quadrupled from about 100,000 to 400,000 between 1995 and 2010. ATMs cut the cost of running a branch, so banks opened more branches, and the teller’s role shifted from cash handling toward relationship banking.1 The task was automated; the job was redefined and persisted. Critically, complementarity does not guarantee a worker keeps the gains: if the complementary tasks are ones anyone can supply, wages need not rise.1
“Tasks that cannot be substituted by automation are generally complemented by it… productivity improvements in one set of tasks almost necessarily increase the economic value of the remaining tasks.”
David H. Autor, “Why Are There Still So Many Jobs?”, Journal of Economic Perspectives, 2015
Daron Acemoglu and Pascual Restrepo gave the task view a formal engine. In their model, technology shifts the “task content of production” through three forces. Automation lets capital take over tasks labor used to do — a displacement effect that, on its own, lowers labor demand.3 Automation also raises productivity (the productivity effect), which can offset displacement by making everything cheaper and expanding output. And the creation of new tasks in which labor has a comparative advantage produces a reinstatement effect that pulls labor back into a widening range of work.3 Long-run employment depends on the balance.2
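One way to see how the three forces net out is a stylized bookkeeping exercise. The sketch below is not the authors’ model: it assumes the wage bill equals output times the share of tasks performed by labor (a unit-elasticity simplification), and every number and function name in it is hypothetical.

```python
import math

def wage_bill(output, labor_task_share):
    """Stylized wage bill: total output times the share of tasks done by labor."""
    return output * labor_task_share

def log_change_decomposition(output_0, share_0, output_1, share_1):
    """Split the log change in the wage bill into two additive parts:
    an output (productivity) part and a task-content part, where the
    task-content part is displacement net of reinstatement."""
    productivity_part = math.log(output_1 / output_0)
    task_content_part = math.log(share_1 / share_0)
    return productivity_part, task_content_part

# Hypothetical scenario: automation takes 6 points of task share from labor,
# new tasks return 2 points, and output grows 5%.
share_0 = 0.60
displaced = 0.06      # displacement effect (assumed)
reinstated = 0.02     # reinstatement effect (assumed)
share_1 = share_0 - displaced + reinstated
output_0, output_1 = 100.0, 105.0   # productivity effect (assumed)

prod, task = log_change_decomposition(output_0, share_0, output_1, share_1)
net = math.log(wage_bill(output_1, share_1) / wage_bill(output_0, share_0))
print(f"productivity={prod:+.4f} task_content={task:+.4f} net={net:+.4f}")
```

With these made-up inputs the productivity gain is too small to offset net displacement, so labor demand falls. Raising either output growth or the reinstated share flips the sign, which is the sense in which long-run employment depends on the balance.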
The model’s sharpest warning concerns how good the automation is. Acemoglu’s later macro work argues that AI driven purely by task-level cost savings — “so-so” automation that displaces labor without generating large productivity gains — is the worst of both worlds. Applying a version of Hulten’s theorem and taking about 27% average task-level labor cost savings, he estimates the total-factor-productivity boost from AI at no more than about 0.66% over ten years.11
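The arithmetic behind a Hulten-style bound is a product: the share of the economy’s tasks that AI affects, times the average cost saving on those tasks. In the sketch below, only the 27% labor cost saving and the roughly 0.66% headline come from the text. The affected task share (4.6%) and labor’s share of task cost (0.53) are assumptions chosen for illustration so that the product lands near the headline; they should not be read as inputs reported in this digest. The function name is hypothetical.

```python
def hulten_tfp_gain(task_share_affected, labor_cost_saving, labor_share_of_task_cost):
    """First-order TFP gain: affected share of tasks times average total cost saving.

    Total cost saving is the labor cost saving scaled by labor's share of the
    cost of performing those tasks.
    """
    total_cost_saving = labor_cost_saving * labor_share_of_task_cost
    return task_share_affected * total_cost_saving

LABOR_COST_SAVING = 0.27         # from the text
TASK_SHARE_AFFECTED = 0.046      # ASSUMPTION, for illustration only
LABOR_SHARE_OF_TASK_COST = 0.53  # ASSUMPTION, for illustration only

gain = hulten_tfp_gain(TASK_SHARE_AFFECTED, LABOR_COST_SAVING, LABOR_SHARE_OF_TASK_COST)
print(f"Implied ten-year TFP gain: {gain:.2%}")
```

The design point is that the bound is small because it multiplies two modest numbers. A large saving on a narrow slice of tasks cannot produce a large aggregate effect.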
If tasks are the unit, the operational question becomes: which tasks is the current technology good at? Brynjolfsson and Mitchell answered this for machine learning in Science (2017) with a Suitability-for-Machine-Learning (SML) rubric.4 ML is well-suited to a task when it maps well-defined inputs to well-defined outputs; large labeled datasets exist; there is clear feedback; no long chains of reasoning are required; no detailed explanation is needed; some error is tolerable; the function is stable over time; and no specialized physical dexterity is required.4 The framing rests on Polanyi’s paradox — “we know more than we can tell.” Tasks built on tacit knowledge historically resisted automation because they could not be coded into rules; ML partially circumvents this by inferring the mapping from data — but only where those data and clear feedback exist.4
The empirical payoff came in the 2018 companion paper. Scoring O*NET tasks against the rubric, the authors found that few if any occupations have all tasks that are SML, and that capturing ML’s value “will require significant redesign of the task content of jobs.”5 They also found SML correlates only weakly with wages (about −0.14 with the wage percentile), so the ML wave would touch a different slice of the workforce than earlier routine-biased automation.5
Good ML task: clear inputs to clear outputs, abundant labeled data, tolerant of error, stable over time, no long reasoning chains, no explanation required, no fine motor skills. Bad ML task: tacit judgment, sparse feedback, high stakes, shifting conditions, accountability requirements. Most jobs are a mix of both — which is why the answer is task redesign, not wholesale replacement.4
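The eight criteria listed above can be read as a checklist. The sketch below is a deliberately crude boolean simplification for illustration, not the published rubric’s scoring scheme; the class name, field names, and example profiles are all hypothetical.

```python
from dataclasses import dataclass, fields

@dataclass
class TaskProfile:
    """One flag per criterion named in the text; True means the criterion is met."""
    well_defined_inputs_outputs: bool
    large_labeled_datasets: bool
    clear_feedback: bool
    no_long_reasoning_chains: bool
    no_detailed_explanation_needed: bool
    error_tolerant: bool
    stable_over_time: bool
    no_specialized_dexterity: bool

def sml_fraction(task):
    """Fraction of the eight criteria that the task satisfies (0.0 to 1.0)."""
    flags = [getattr(task, f.name) for f in fields(task)]
    return sum(flags) / len(flags)

# Hypothetical profiles, judged by hand for illustration.
transcription = TaskProfile(True, True, True, True, True, True, True, True)
negotiation = TaskProfile(False, False, False, False, False, False, False, True)

print(f"transcription: {sml_fraction(transcription):.2f}")
print(f"negotiation:   {sml_fraction(negotiation):.2f}")
```

Scoring each task in a job this way, and not the job as a whole, is what surfaces the mix of high- and low-suitability tasks inside a single occupation.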
Two later efforts quantified exposure at scale and overturned a key intuition. Felten, Raj and Seamans built the AI Occupational Exposure (AIOE) measure bottom-up, linking AI application areas to 52 human abilities in O*NET; the measure is deliberately neutral on whether AI substitutes for or complements labor.6 The widely cited result is that the most-exposed occupations tend to be higher-paid and to require more education, the reverse of routine-biased automation, which hit the middle.6
Eloundou, Manning, Mishkin and Rock (“GPTs are GPTs,” 2023) applied the same logic to large language models. Their headline: around 80% of the US workforce could have at least 10% of their tasks affected by LLMs, and roughly 19% could see at least 50% of their tasks affected.7 The leverage comes from complementary software: LLMs alone could speed up about 15% of tasks, but tools built on top of them raise that to between 47% and 56%.7 Like AIOE, exposure rose with income, and the authors conclude LLMs exhibit the traits of a general-purpose technology — though “exposed” measures technical potential and is explicitly silent on augmentation versus displacement.7
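Headline figures of the form “X% of workers have at least Y% of tasks affected” are threshold counts over an exposure distribution. The sketch below shows the calculation on invented data; the employment counts and exposure fractions are hypothetical and do not reproduce the paper’s results.

```python
def share_of_workers_at_or_above(occupations, threshold):
    """Employment-weighted share of workers whose occupation has at least
    `threshold` of its tasks exposed.

    occupations: list of (employment, exposed_task_fraction) pairs.
    """
    total = sum(employment for employment, _ in occupations)
    exposed = sum(employment for employment, fraction in occupations
                  if fraction >= threshold)
    return exposed / total

# Toy economy: (workers, fraction of tasks exposed). All values are made up.
toy_economy = [
    (100, 0.05),
    (300, 0.20),
    (400, 0.35),
    (200, 0.60),
]

at_least_10 = share_of_workers_at_or_above(toy_economy, 0.10)
at_least_50 = share_of_workers_at_or_above(toy_economy, 0.50)
print(f">=10% of tasks exposed: {at_least_10:.0%} of workers")
print(f">=50% of tasks exposed: {at_least_50:.0%} of workers")
```

The gap between the two thresholds is the point: broad exposure at a low threshold is compatible with only a small share of workers seeing most of their tasks affected.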
Even where AI fully automates a task, total human work in the surrounding system need not fall — because cheaper tasks induce more demand. In The Coal Question (1865), William Stanley Jevons observed that “it is wholly a confusion of ideas to suppose that the economical use of fuel is equivalent to a diminished consumption. The very contrary is the truth.”18 Jevons drew the explicit labor analogy: “The economy of labour effected by the introduction of new machinery throws labourers out of employment for the moment. But such is the increased demand for the cheapened products, that eventually the sphere of employment is greatly widened.”18
The mechanism is induced demand: when a task gets cheaper, whether total spending and employment rise or fall depends on the price elasticity of demand for the output.4 The paradox is now invoked for AI: Microsoft’s Satya Nadella argued in January 2025 that “as AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can’t get enough of.”20 Stanford’s AI Index reports a live instance — software-engineer headcount “expected to increase, consistent with the Jevons Paradox,” even as 31% of surveyed executives expected AI to reduce overall workforce size versus 19% expecting an increase.21
Jevons is a tendency, not a law. It holds when demand for the cheapened output is elastic and unsatisfied; it fails when demand is saturated or externalities are not priced in. The induced-demand effect on radiology is observed (imaging volumes rose about 25% from 2018 to early 2025),15 but whether it generalizes across white-collar tasks at AI’s current capability is not settled by the captured evidence.
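The elasticity condition can be sketched with a constant-elasticity demand curve. The model below is a textbook simplification, not something taken from the cited sources: it assumes price tracks unit labor cost one for one, and the function name and parameter values are hypothetical.

```python
def labor_hours(hours_per_unit, elasticity, wage=1.0, demand_scale=100.0):
    """Total labor hours demanded when each unit of output needs `hours_per_unit`.

    Assumptions (illustrative): price equals unit labor cost, and demand is
    quantity = demand_scale * price ** (-elasticity).
    """
    price = wage * hours_per_unit
    quantity = demand_scale * price ** (-elasticity)
    return hours_per_unit * quantity

# Automation halves the labor needed per unit of output.
before, after = 1.0, 0.5

for elasticity in (0.5, 1.0, 2.0):
    h0 = labor_hours(before, elasticity)
    h1 = labor_hours(after, elasticity)
    print(f"elasticity={elasticity}: hours {h0:.1f} -> {h1:.1f}")
```

Under these assumptions total hours rise only when elasticity exceeds 1, stay flat at exactly 1, and fall below 1. That is the formal content of “a tendency, not a law.”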
The most-cited prediction of AI-driven job loss is Geoffrey Hinton’s. In 2016, onstage at a machine-learning conference in Toronto, Hinton said it was “completely obvious” that within five years AI would outperform humans at radiology tasks, and that people “should stop training radiologists now.”15 The prediction is a textbook job-level error: it conflated one task — reading scans — with the whole job.
The reality a decade on runs the other way. The number of active US radiologists has grown roughly 10% over ten years, into a documented shortage.15 Mayo Clinic — far from cutting radiologists — expanded its radiology workforce by about 55% since 2016 while building a 40-person AI team that has licensed or developed more than 250 AI models.14 Average US radiologist compensation reached about $571,000 in 2025, up 9% year over year.15 The task explanation is exact: reading images is one of many radiologist tasks, so automating it lets doctors reallocate time to the rest — and reimbursement rules plus accountability for a missed diagnosis keep a licensed human in the loop.15
Hinton 2016. The captured corpus traces Hinton’s prediction to a 2016 Toronto machine-learning event, with the wording reproduced in Fortune (May 2026) and corroborated by TechCrunch (May 2025).15,14 The original talk recording was not independently retrieved; the wording is cited to these reporting sources, not a primary transcript. The best-attested verbatim line is “completely obvious that within five years deep learning is going to do better than radiologists… we’ve got plenty of radiologists already”; “stop training radiologists now” is the widely reproduced reporting paraphrase.
“AI won’t replace you, but someone using AI will.” Attribute as “popularized by Lakhani” (HBR, Aug 2023),17 not “coined by”: the maxim is a folk saying with no single verifiable first author. The corpus does not assert a definitive originator.
If the question is task-level, leaders need a task-level decision rule. Two dimensions from the captured evidence do most of the work. The first is knowledge type: is the task explicit (codifiable, with feedback) or tacit (Polanyi’s “we know more than we can tell”)? This maps onto the SML rubric.4 The second is cost of error: where errors are catastrophic, accountability and explanation requirements keep a human in the loop.4 Agrawal, Gans and Goldfarb sharpen the split: AI is a drop in the cost of prediction, which raises the value of the complementary human input they call judgment — so the human’s residual role concentrates in the high-stakes, hard-to-specify tasks.13
Explicit knowledge · High cost of error
Augment with verification. AI can draft, but errors are expensive, so keep human review and accountability. E.g. radiology scan-read with physician final sign-off.15
Tacit knowledge · High cost of error
Keep human-led. Judgment, no clear feedback signal: ML’s weakest zone. AI assists at the margins only. E.g. comforting a patient, high-stakes negotiation, novel strategy.4
Explicit knowledge · Low cost of error
Automate. Clear inputs and outputs, abundant data, error-tolerant: the sweet spot for ML. Redesign the job around the freed-up time. E.g. classification, transcription, routine drafting.5
Tacit knowledge · Low cost of error
Augment and experiment. Hard for AI unaided, but cheap mistakes make it a low-risk place to pilot human+AI workflows. E.g. brainstorming, first-draft creative work.25
Axes: knowledge type runs from EXPLICIT (codifiable) to TACIT (Polanyi); cost of error runs from LOW to HIGH.
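The two-by-two frame above can be written as a lookup table and applied task by task. The sketch below is a minimal illustration: the enum and function names are hypothetical, and the placement of each example task is a hand judgment based on the examples given in the text.

```python
from enum import Enum

class Knowledge(Enum):
    EXPLICIT = "explicit"   # codifiable, with feedback
    TACIT = "tacit"         # "we know more than we can tell"

class ErrorCost(Enum):
    LOW = "low"
    HIGH = "high"

# One recommendation per quadrant, using the labels from the frame.
QUADRANTS = {
    (Knowledge.EXPLICIT, ErrorCost.HIGH): "Augment with verification",
    (Knowledge.TACIT, ErrorCost.HIGH): "Keep human-led",
    (Knowledge.EXPLICIT, ErrorCost.LOW): "Automate",
    (Knowledge.TACIT, ErrorCost.LOW): "Augment and experiment",
}

def recommend(knowledge, error_cost):
    """Return the quadrant recommendation for a single task."""
    return QUADRANTS[(knowledge, error_cost)]

def plan_job(tasks):
    """Apply the frame to every task in a job.

    tasks: dict mapping task name -> (Knowledge, ErrorCost).
    """
    return {name: recommend(k, c) for name, (k, c) in tasks.items()}

# Hypothetical task breakdown for one job, scored by hand.
radiologist_tasks = {
    "scan read": (Knowledge.EXPLICIT, ErrorCost.HIGH),
    "comforting a patient": (Knowledge.TACIT, ErrorCost.HIGH),
    "report transcription": (Knowledge.EXPLICIT, ErrorCost.LOW),
}

plan = plan_job(radiologist_tasks)
for task, action in plan.items():
    print(f"{task}: {action}")
```

A single job lands in several quadrants at once, which is why the output is a per-task plan and not a verdict on the occupation.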
The frame is descriptive of where the evidence already points. Real-usage data from the Anthropic Economic Index, about a million Claude conversations mapped to O*NET tasks, found a 57%/43% lean toward augmentation over automation, with only about 4% of jobs using AI for at least 75% of their tasks.25 That is the task-based view confirmed in live behavior. The honest counter-signal, recorded in full: the Stanford “Canaries in the Coal Mine?” working paper (Nov 2025) finds a 16% relative employment decline for workers aged 22–25 in the most AI-exposed occupations, concentrated where AI automates rather than augments.24 This is consistent with the task frame: displacement shows up first in the automate-not-augment quadrant. The strategic implication is to run the analysis at the level AI actually operates: map the job to its tasks, score each on knowledge type and cost of error, and redesign the job around what remains, because the value comes from re-bundling tasks, not cutting headcount.5,17