Paper 04

Task-Level, Not Job-Level

Jobs are bundles of tasks. AI acts on tasks, not job titles — which is why occupation-level predictions of mass displacement keep missing, and why the operative question is which tasks to hand over, not which jobs to cut.

25 verified sources A — The nature of the shift

A compiled, source-verified research digest — every claim cites a downloaded source, every figure is drawn from the data behind it. Not a personal essay.

Abstract

The dominant frame for AI and work, “which jobs go away,” is the wrong unit of analysis. Labor economics has converged on a task-based view: an occupation is a bundle of distinct tasks, technology substitutes for some and complements others, and the net effect on a job depends on that mix.1 When the unit shifts from job to task, the numbers move dramatically. A whole-job model put 47% of US employment at “high risk” of computerization;9 a task-based model put the share of highly automatable jobs at roughly 9%, both in the US and on average across 21 OECD countries.10 Scoring occupations against the 2017 Suitability-for-Machine-Learning rubric showed that automatability varies mostly within occupations, not across them: few if any jobs are fully automatable, so realizing value requires re-bundling tasks, not replacing people.4,5 This paper traces the task-based view from Autor through Acemoglu & Restrepo, the empirical exposure measures, and the demand side (Jevons’ paradox), then offers a decision frame, cost-of-error by knowledge type, for where to deploy AI versus keep a human in the loop. The canonical cautionary tale, Geoffrey Hinton’s 2016 call to “stop training radiologists,” is examined against what has actually happened to the profession since.15

The unit-of-analysis problem

Public debate about AI and employment almost always runs at the level of the job title: will AI replace radiologists, paralegals, copywriters? The framing is intuitive and almost always misleading, because firms do not buy or automate “jobs” — they automate tasks. The task-based view, formalized by Autor, Levy and Murnane and now standard in labor economics, treats an occupation as a bundle of distinct but interrelated tasks; a technology substitutes for the tasks it can do cheaply and complements the tasks it cannot.1 The net effect on the job is the weighted result of both forces, and it can be positive or negative.4

The single most consequential fact this reframing produces: automatability varies far more within an occupation than across occupations. When Brynjolfsson, Mitchell and Rock scored 18,156 O*NET tasks across 964 occupations, the within-occupation standard deviation of task suitability was about 17% of the mean — the high- and low-suitability tasks sit side by side inside the same job.5 That is why aggregate “job loss” forecasts have a poor track record: they treat a job as automatable if its modal task is, ignoring the residual tasks that keep the human employed and often grow in value as the routine ones are stripped away.1
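
The within-versus-across comparison can be illustrated with a small calculation. The following is a minimal sketch, assuming task-level suitability scores are available per occupation; the occupation names and scores are hypothetical and are not taken from the Brynjolfsson, Mitchell and Rock dataset.

```python
from statistics import mean, pstdev

# Hypothetical task-suitability scores, grouped by occupation.
# Occupation names and numbers are invented for illustration only.
task_scores = {
    "radiologist": [4.2, 3.1, 2.4, 3.8, 2.0],
    "paralegal": [4.0, 3.6, 2.2, 3.0],
    "copywriter": [3.9, 2.8, 3.3, 2.1],
}


def coefficient_of_variation(values):
    """Population standard deviation expressed as a fraction of the mean."""
    values = list(values)
    return pstdev(values) / mean(values)


# Spread of task scores inside each occupation.
within_cv = {job: coefficient_of_variation(s) for job, s in task_scores.items()}

# Spread of occupation-level averages across occupations.
occupation_means = [mean(s) for s in task_scores.values()]
across_cv = coefficient_of_variation(occupation_means)

for job, cv in within_cv.items():
    print(f"{job}: within-occupation spread = {cv:.0%} of mean")
print(f"across occupations: spread = {across_cv:.0%} of mean")
```

In this toy data the occupation averages are nearly identical while the tasks inside each occupation differ widely, which is the pattern the paragraph describes: a job-level average hides most of the variation.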

[Figure 1: Change the unit, change the answer. Share of jobs at high risk of automation, same economies, two methods. Whole-job method (Frey & Osborne, 2013): 47%. Task-based method (Arntz, Gregory & Zierahn, 2016): 9%.]
Figure 1. Estimating the same thing, the share of employment at high risk of automation, with a job-level versus a task-level method yields a roughly five-fold difference (21-country average; the US task-based figure is also about 9%).
Source: Arntz, Gregory & Zierahn (2016), OECD WP 189.

Autor: complementarity, polarization, and the O-ring

David Autor’s 2015 essay is the clearest statement of why automation has not “wiped out a majority of jobs.” His central observation is that commentators “tend to overstate the extent of machine substitution for human labor and ignore the strong complementarities between automation and labor that increase productivity, raise earnings, and augment demand for labor.”1 The mechanism is the O-ring production function: when a process is a chain of tasks and any weak link can sink the whole, making one link cheaper and more reliable raises the value of the remaining human links rather than eliminating them.1
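
The O-ring logic can be sketched numerically. This is a minimal illustration, assuming the simple multiplicative form in which output is proportional to the product of each step's reliability; the three-step process and its reliabilities are hypothetical, and the function names are introduced here for illustration only.

```python
from math import prod


def chain_output(reliabilities, scale=1.0):
    """Output of a process in which every step must succeed (multiplicative form)."""
    return scale * prod(reliabilities)


def marginal_value(reliabilities, index, bump=0.01):
    """Gain in output from raising one step's reliability by `bump`."""
    improved = list(reliabilities)
    improved[index] += bump
    return chain_output(improved) - chain_output(reliabilities)


# Hypothetical three-step process: step 0 is machine-suited, steps 1 and 2 are human.
before = [0.80, 0.90, 0.90]  # step 0 done unreliably by hand
after = [0.99, 0.90, 0.90]   # step 0 automated and made more reliable

human_step = 1
value_before = marginal_value(before, human_step)
value_after = marginal_value(after, human_step)

print(f"value of a small human improvement before automation: {value_before:.5f}")
print(f"value of the same improvement after automation:       {value_after:.5f}")
```

Because the steps multiply, making the automated link more reliable raises the payoff to improving the human links, which is the complementarity the paragraph describes.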

The canonical case is the ATM. US bank-teller numbers were widely expected to collapse; instead they rose modestly from roughly 500,000 to 550,000 between 1980 and 2010 even as ATMs quadrupled from about 100,000 to 400,000. ATMs cut the cost of running a branch, so banks opened more branches, and the teller’s role shifted from cash handling toward relationship banking.1 The task was automated; the job was redefined and persisted. Critically, complementarity does not guarantee a worker keeps the gains: if the complementary tasks are ones anyone can supply, wages need not rise.1

“Tasks that cannot be substituted by automation are generally complemented by it… productivity improvements in one set of tasks almost necessarily increase the economic value of the remaining tasks.”
David H. Autor, “Why Are There Still So Many Jobs?”, Journal of Economic Perspectives, 2015

Acemoglu & Restrepo: the displacement–reinstatement model

Daron Acemoglu and Pascual Restrepo gave the task view a formal engine. In their model, technology shifts the “task content of production” through three forces. Automation lets capital take over tasks labor used to do — a displacement effect that, on its own, lowers labor demand.3 Automation also raises productivity (the productivity effect), which can offset displacement by making everything cheaper and expanding output. And the creation of new tasks in which labor has a comparative advantage produces a reinstatement effect that pulls labor back into a widening range of work.3 Long-run employment depends on the balance.2
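
The three forces can be separated in a toy accounting exercise. The sketch below is a deliberate simplification and not the authors' model: it assumes tasks lie on a unit range ending at a frontier, that capital performs every task below an automation threshold, and that labor demand is simply output times labor's share of tasks. All numbers and names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskEconomy:
    automation_threshold: float  # tasks below this point are done by capital
    task_frontier: float         # tasks span the unit range [frontier - 1, frontier]
    output: float                # total output, arbitrary units

    @property
    def labor_task_share(self) -> float:
        """Fraction of the unit task range still performed by labor."""
        low = self.task_frontier - 1.0
        automated = min(max(self.automation_threshold - low, 0.0), 1.0)
        return 1.0 - automated

    @property
    def labor_demand(self) -> float:
        # Toy assumption: labor demand is output times labor's share of tasks.
        return self.output * self.labor_task_share


baseline = TaskEconomy(automation_threshold=0.4, task_frontier=1.0, output=100.0)
# Displacement: capital takes over more tasks, output held fixed.
displaced = TaskEconomy(automation_threshold=0.5, task_frontier=1.0, output=100.0)
# Productivity: the same automation also expands output.
productive = TaskEconomy(automation_threshold=0.5, task_frontier=1.0, output=110.0)
# Reinstatement: new labor-intensive tasks push the frontier outward.
reinstated = TaskEconomy(automation_threshold=0.5, task_frontier=1.1, output=110.0)

for label, economy in [
    ("baseline", baseline),
    ("after displacement", displaced),
    ("plus productivity effect", productive),
    ("plus reinstatement", reinstated),
]:
    print(f"{label}: labor demand = {economy.labor_demand:.1f}")
```

In this toy run displacement alone lowers labor demand, the productivity effect recovers part of the loss, and only new-task creation lifts it above the baseline, mirroring the balance described above.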

The model’s sharpest warning concerns how good the automation is. Acemoglu’s later macro work argues that AI driven purely by task-level cost savings — “so-so” automation that displaces labor without generating large productivity gains — is the worst of both worlds. Applying a version of Hulten’s theorem and taking about 27% average task-level labor cost savings, he estimates the total-factor-productivity boost from AI at no more than about 0.66% over ten years.11
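
The arithmetic behind a Hulten-style estimate is a single multiplication: the aggregate productivity gain is roughly the share of the economy's tasks affected times the average cost saving on those tasks. The sketch below shows the shape of that calculation. Only the 27% labor cost saving and the 0.66% result appear in this digest; the other three inputs are illustrative assumptions chosen to land near the reported figure and should be checked against Acemoglu (2024) before reuse.

```python
def hulten_tfp_gain(gdp_share_of_affected_tasks: float, average_cost_saving: float) -> float:
    """Hulten-style approximation: aggregate gain = affected share x cost saving."""
    return gdp_share_of_affected_tasks * average_cost_saving


labor_cost_saving = 0.27            # from the text: about 27% task-level labor cost savings

# ASSUMPTIONS (not stated in this digest, illustrative only):
exposed_task_share = 0.20           # share of tasks exposed to AI
profitably_automatable = 0.23       # share of exposed tasks worth automating within ten years
labor_share_of_task_cost = 0.53     # labor's share of total cost in those tasks

affected_share = exposed_task_share * profitably_automatable
total_cost_saving = labor_cost_saving * labor_share_of_task_cost
tfp_gain = hulten_tfp_gain(affected_share, total_cost_saving)

print(f"affected share of tasks: {affected_share:.1%}")
print(f"average total cost saving: {total_cost_saving:.1%}")
print(f"implied ten-year TFP gain: {tfp_gain:.2%}")
```

The point of the exercise is the structure, not the specific inputs: even a sizeable per-task saving yields a small aggregate gain when the affected share of tasks is small.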

[Figure 2: Three forces on the task content of production. Displacement: capital takes over tasks labor used to perform (↓ labor demand). Productivity: cheaper output expands demand across the economy (offsets displacement). Reinstatement: new tasks where labor has comparative advantage (↑ labor demand). Net effect on employment is the balance of the three. “So-so” automation triggers displacement with weak productivity and no reinstatement. Acemoglu’s macro estimate: AI TFP gain no more than 0.66% over 10 years.]
Figure 2. The Acemoglu–Restrepo task-content framework. Automation and new-task creation push labor demand in opposite directions; the productivity effect can offset displacement. The one quantitative figure shown is sourced.
Source: Acemoglu & Restrepo (2019), JEP 33(2); Acemoglu (2024), NBER WP 32487.

What machine learning can actually do

If tasks are the unit, the operational question becomes: which tasks is the current technology good at? Brynjolfsson and Mitchell answered this for machine learning in Science (2017) with a Suitability-for-Machine-Learning (SML) rubric.4 ML is well-suited to a task when it maps well-defined inputs to well-defined outputs; large labeled datasets exist; there is clear feedback; no long chains of reasoning are required; no detailed explanation is needed; some error is tolerable; the function is stable over time; and no specialized physical dexterity is required.4 The framing rests on Polanyi’s paradox — “we know more than we can tell.” Tasks built on tacit knowledge historically resisted automation because they could not be coded into rules; ML partially circumvents this by inferring the mapping from data — but only where those data and clear feedback exist.4

The empirical payoff came in the 2018 companion paper. Scoring O*NET tasks against the rubric, the authors found that few if any occupations consist entirely of SML-suitable tasks, and that capturing ML’s value “will require significant redesign of the task content of jobs.”5 They also found that SML correlates only weakly with wages (about −0.14 with the wage percentile), so the ML wave would touch a different slice of the workforce than earlier routine-biased automation.5

The SML rubric, in one line

Good ML task: clear inputs to clear outputs, abundant labeled data, tolerant of error, stable over time, no long reasoning chains, no explanation required, no fine motor skills. Bad ML task: tacit judgment, sparse feedback, high stakes, shifting conditions, accountability requirements. Most jobs are a mix of both — which is why the answer is task redesign, not wholesale replacement.4
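
The rubric can be applied as a checklist. The following is a minimal sketch, assuming each of the eight criteria listed above is answered yes or no and weighted equally; that unweighted average is a simplification introduced here, not the authors' scoring procedure, and the two example tasks and their answers are hypothetical.

```python
SML_CRITERIA = (
    "well_defined_input_output",
    "large_labeled_dataset",
    "clear_feedback",
    "no_long_reasoning_chains",
    "no_detailed_explanation_needed",
    "error_tolerant",
    "stable_over_time",
    "no_specialized_dexterity",
)


def sml_score(task_traits):
    """Fraction of the eight criteria a task satisfies (unweighted)."""
    missing = [c for c in SML_CRITERIA if c not in task_traits]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    return sum(bool(task_traits[c]) for c in SML_CRITERIA) / len(SML_CRITERIA)


# Hypothetical assessments of two tasks.
transcription = {criterion: True for criterion in SML_CRITERIA}
comforting_a_patient = {criterion: False for criterion in SML_CRITERIA}
comforting_a_patient["stable_over_time"] = True
comforting_a_patient["no_specialized_dexterity"] = True

print(f"transcription: {sml_score(transcription):.2f}")
print(f"comforting a patient: {sml_score(comforting_a_patient):.2f}")
```

Scoring every task in a job this way, rather than the job as a whole, is what surfaces the mix of high- and low-suitability tasks that motivates redesign.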

Measuring exposure: AIOE and “GPTs are GPTs”

Two later efforts quantified exposure at scale and overturned a key intuition. Felten, Raj and Seamans built the AI Occupational Exposure (AIOE) measure bottom-up, linking AI application areas to 52 human abilities in O*NET, and kept it deliberately neutral on whether AI substitutes for or complements labor.6 The widely cited result is that the most-exposed occupations tend to be higher-paid and to require more education, the reverse of routine-biased automation, which hit the middle.6

[Figure 3: AI exposure inverts the old pattern. Standardized AIOE scores (economy mean = 0; higher means more exposed).
Most exposed: Genetic counselors 1.53; Financial examiners 1.53; Actuaries 1.52; Accountants & auditors 1.48; Management analysts 1.43.
Least exposed: Landscaping & groundskeeping −1.82; Fence erectors −1.90; Reinforcing iron & rebar workers −1.97; Fitness trainers −2.11; Dancers −2.67.
Top-exposed roles are analytic and white-collar; least-exposed are physical and interpersonal.]
Figure 3. AI occupational exposure (AIOE), top-5 and bottom-5 of 774 occupations. Higher means more exposed. The pattern reverses routine-biased automation: cognitive analytic work is most exposed, manual and physical work least.
Source: Felten, Raj & Seamans (2021), AIOE dataset (774 occupations).

Eloundou, Manning, Mishkin and Rock (“GPTs are GPTs,” 2023) applied the same logic to large language models. Their headline: around 80% of the US workforce could have at least 10% of their tasks affected by LLMs, and roughly 19% could see at least 50% of their tasks affected.7 The leverage comes from complementary software: LLMs alone could speed up about 15% of tasks, but tools built on top of them raise that to between 47% and 56%.7 Like AIOE, exposure rose with income, and the authors conclude LLMs exhibit the traits of a general-purpose technology — though “exposed” measures technical potential and is explicitly silent on augmentation versus displacement.7
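
Headline figures of the form “X% of workers have at least Y% of tasks affected” come from a threshold count over task-level exposure flags. The sketch below shows that calculation under stated assumptions: the five workers and their per-task flags are hypothetical, and the function name is introduced here for illustration.

```python
def share_of_workers_exposed(task_flags_by_worker, threshold):
    """Share of workers whose fraction of exposed tasks is at least `threshold`."""
    fractions = [sum(flags) / len(flags) for flags in task_flags_by_worker]
    return sum(f >= threshold for f in fractions) / len(fractions)


# Hypothetical workers; each list flags whether a task is exposed (True) or not.
workers = [
    [False, False, False, False],        # 0% of tasks exposed
    [True, False, False, False, False],  # 20%
    [True, False, False, False],         # 25%
    [True, True, False, False],          # 50%
    [True, True, True, False],           # 75%
]

at_least_10 = share_of_workers_exposed(workers, 0.10)
at_least_50 = share_of_workers_exposed(workers, 0.50)

print(f"workers with at least 10% of tasks exposed: {at_least_10:.0%}")
print(f"workers with at least 50% of tasks exposed: {at_least_50:.0%}")
```

The same task flags produce very different headlines depending on the threshold, which is why the 80% and 19% figures describe the same workforce without contradiction.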

~80% of US workers could have at least 10% of tasks affected by LLMs. Eloundou et al. (2023)

9% of jobs highly automatable under a task-based method (vs 47% job-level). Arntz et al. (2016); Frey & Osborne (2013)

57 / 43: augmentation vs automation split across ~1M real AI conversations. Anthropic Economic Index (2025)

≤0.66% estimated AI total-factor-productivity gain over ten years. Acemoglu (2024)

The demand side: Jevons’ paradox

Even where AI fully automates a task, total human work in the surrounding system need not fall — because cheaper tasks induce more demand. In The Coal Question (1865), William Stanley Jevons observed that “it is wholly a confusion of ideas to suppose that the economical use of fuel is equivalent to a diminished consumption. The very contrary is the truth.”18 Jevons drew the explicit labor analogy: “The economy of labour effected by the introduction of new machinery throws labourers out of employment for the moment. But such is the increased demand for the cheapened products, that eventually the sphere of employment is greatly widened.”18

The mechanism is induced demand: when a task gets cheaper, whether total spending and employment rise or fall depends on the price elasticity of demand for the output.4 The paradox is now invoked for AI: Microsoft’s Satya Nadella argued in January 2025 that “as AI gets more efficient and accessible, we will see its use skyrocket, turning it into a commodity we just can’t get enough of.”20 Stanford’s AI Index reports a live instance — software-engineer headcount “expected to increase, consistent with the Jevons Paradox,” even as 31% of surveyed executives expected AI to reduce overall workforce size versus 19% expecting an increase.21

Open question

Jevons is a tendency, not a law. It holds when demand for the cheapened output is elastic and unsatisfied; it fails when demand is saturated or externalities are not priced in. The induced-demand effect on radiology is observed (imaging volumes rose about 25% from 2018 to early 2025),15 but whether it generalizes across white-collar tasks at AI’s current capability is not settled by the captured evidence.
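
The elasticity condition in this section can be made concrete. The sketch below assumes a constant-elasticity demand curve, a standard textbook simplification that the sources here do not commit to, and uses hypothetical prices and quantities.

```python
def spending_after_price_change(base_price, base_quantity, new_price, elasticity):
    """Total spending at `new_price` under constant-elasticity demand.

    `elasticity` is the absolute value of the price elasticity of demand.
    """
    new_quantity = base_quantity * (new_price / base_price) ** (-elasticity)
    return new_price * new_quantity


# Hypothetical task output: price 100, quantity 10, so baseline spending is 1000.
base_price, base_quantity = 100.0, 10.0
new_price = 50.0  # AI halves the cost of the task

elastic = spending_after_price_change(base_price, base_quantity, new_price, 2.0)
unit = spending_after_price_change(base_price, base_quantity, new_price, 1.0)
inelastic = spending_after_price_change(base_price, base_quantity, new_price, 0.5)

print(f"elastic demand (2.0):   spending = {elastic:.0f}")
print(f"unit elasticity (1.0):  spending = {unit:.0f}")
print(f"inelastic demand (0.5): spending = {inelastic:.0f}")
```

With elasticity above one, halving the price raises total spending on the task's output, the Jevons case; with elasticity below one, spending falls, which is the saturated-demand case where the paradox fails.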

The radiologist case: a prediction stress-tested

The most-cited prediction of AI-driven job loss is Geoffrey Hinton’s. In 2016, onstage at a machine-learning conference in Toronto, Hinton said it was “completely obvious” that within five years AI would outperform humans at radiology tasks, and that people “should stop training radiologists now.”15 The prediction is a textbook job-level error: it conflated one task — reading scans — with the whole job.

The reality a decade on runs the other way. The number of active US radiologists has grown roughly 10% over ten years, yet the field faces a documented shortage.15 Mayo Clinic, far from cutting radiologists, has expanded its radiology workforce by about 55% since 2016 while building a 40-person AI team that has licensed or developed more than 250 AI models.14 Average US radiologist compensation reached about $571,000 in 2025, up 9% year over year.15 The task-level explanation fits: reading images is one of many radiologist tasks, so automating it lets doctors reallocate time to the rest, while reimbursement rules and accountability for a missed diagnosis keep a licensed human in the loop.15

[Figure 4: “Stop training radiologists” (2016) vs the profession since. Mayo Clinic radiology workforce, since 2016: +55%. US imaging case load, 2018 to early 2025: +25%. Active US radiologists, 10-year change: +10%. Average radiologist pay, 2025: $571K, up 9% year over year. Projected specialist shortfall: 42,000, radiology and other specialties by 2033 (AAMC). Every captured indicator points to growth, not disappearance: the task (scan reading) was automatable; the job was not. Bars are not on a common scale; each label states its own metric and period.]
Figure 4. The radiology reality check, a decade after the prediction.
Source: Fortune (May 2026); TechCrunch (May 2025, corroborating the paywalled NYT); ACR/Neiman (Feb 2026).

Verification note: flagged claims

Hinton 2016. The captured corpus traces Hinton’s prediction to a 2016 Toronto machine-learning event, with the wording reproduced in Fortune (May 2026) and corroborated by TechCrunch (May 2025).15,14 The original talk recording was not independently retrieved; the wording is cited to these reporting sources, not a primary transcript. The best-attested verbatim line is “completely obvious that within five years deep learning is going to do better than radiologists… we’ve got plenty of radiologists already”; “stop training radiologists now” is the widely reproduced reporting paraphrase.

“AI won’t replace you, but someone using AI will.” Attribute as “popularized by Lakhani” (HBR, Aug 2023),17 not “coined by”: the maxim is a folk saying with no single verifiable first author. The corpus does not assert a definitive originator.

A decision frame: cost-of-error by knowledge type

If the question is task-level, leaders need a task-level decision rule. Two dimensions from the captured evidence do most of the work. The first is knowledge type: is the task explicit (codifiable, with feedback) or tacit (Polanyi’s “we know more than we can tell”)? This maps onto the SML rubric.4 The second is cost of error: where errors are catastrophic, accountability and explanation requirements keep a human in the loop.4 Agrawal, Gans and Goldfarb sharpen the split: AI is a drop in the cost of prediction, which raises the value of the complementary human input they call judgment — so the human’s residual role concentrates in the high-stakes, hard-to-specify tasks.13

↑ HIGH cost of error

Explicit knowledge · High cost of error

Augment with verification. AI can draft, but errors are expensive — keep human review and accountability. E.g. radiology scan-read with physician final sign-off.15

Tacit knowledge · High cost of error

Keep human-led. Judgment, no clear feedback signal — ML’s weakest zone. AI assists at the margins only. E.g. comforting a patient, high-stakes negotiation, novel strategy.4

Explicit knowledge · Low cost of error

Automate. Clear inputs and outputs, abundant data, error-tolerant — the sweet spot for ML. Redesign the job around the freed-up time. E.g. classification, transcription, routine drafting.5

Tacit knowledge · Low cost of error

Augment and experiment. Hard for AI unaided, but cheap mistakes make it a low-risk place to pilot human+AI workflows. E.g. brainstorming, first-draft creative work.25

EXPLICIT (codifiable) ←— knowledge type —→ TACIT (Polanyi)

The frame describes where the evidence already points. Real-usage data from the Anthropic Economic Index, about a million Claude conversations mapped to O*NET tasks, found a 57%/43% lean toward augmentation over automation, with only about 4% of jobs using AI for at least 75% of their tasks.25 That is the task-based view confirmed in live behavior. The honest counter-signal, recorded in full: the Stanford “Canaries in the Coal Mine?” working paper (Nov 2025) finds a 16% relative employment decline for workers aged 22–25 in the most AI-exposed occupations, concentrated where AI automates rather than augments.24 This is consistent with the task frame: displacement shows up first in the automate-not-augment quadrant. The strategic implication is to run the analysis at the level AI actually operates: map the job to its tasks, score each on knowledge type and cost of error, and redesign the job around what remains, because the value comes from re-bundling tasks, not cutting headcount.5,17
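
The two-by-two frame reduces to a lookup once each task has been classified. The sketch below encodes the four quadrants as described in this section; the task list for a radiologist is hypothetical, and classifying real tasks on either dimension remains a judgment call that the code does not make.

```python
from enum import Enum


class Knowledge(Enum):
    EXPLICIT = "explicit"
    TACIT = "tacit"


class ErrorCost(Enum):
    LOW = "low"
    HIGH = "high"


# The four quadrants of the decision frame.
DECISION_FRAME = {
    (Knowledge.EXPLICIT, ErrorCost.HIGH): "augment with verification",
    (Knowledge.TACIT, ErrorCost.HIGH): "keep human-led",
    (Knowledge.EXPLICIT, ErrorCost.LOW): "automate",
    (Knowledge.TACIT, ErrorCost.LOW): "augment and experiment",
}


def plan_job(tasks):
    """Map each (task name, knowledge type, error cost) to a recommendation."""
    return {name: DECISION_FRAME[(knowledge, cost)] for name, knowledge, cost in tasks}


# Hypothetical task breakdown for one job; classifications are illustrative.
radiologist_tasks = [
    ("read routine scan", Knowledge.EXPLICIT, ErrorCost.HIGH),
    ("transcribe dictated report", Knowledge.EXPLICIT, ErrorCost.LOW),
    ("discuss diagnosis with patient", Knowledge.TACIT, ErrorCost.HIGH),
    ("brainstorm protocol improvements", Knowledge.TACIT, ErrorCost.LOW),
]

plan = plan_job(radiologist_tasks)
for task, recommendation in plan.items():
    print(f"{task}: {recommendation}")
```

A single job lands in several quadrants at once, so the output is a redesign plan for the role, not a verdict on whether the role survives.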

References

  1. Autor, D. H. (2015). Why Are There Still So Many Jobs? The History and Future of Workplace Automation. Journal of Economic Perspectives 29(3), 3–30.
  2. Acemoglu, D. & Restrepo, P. (2016/2018). The Race Between Machine and Man. NBER WP 22252 / AER 108(6).
  3. Acemoglu, D. & Restrepo, P. (2019). Automation and New Tasks: How Technology Displaces and Reinstates Labor. Journal of Economic Perspectives 33(2), 3–30.
  4. Brynjolfsson, E. & Mitchell, T. (2017). What can machine learning do? Workforce implications. Science 358(6370), 1530–1534.
  5. Brynjolfsson, E., Mitchell, T. & Rock, D. (2018). What Can Machines Learn, and What Does It Mean for Occupations and the Economy? AEA Papers & Proceedings 108, 43–47.
  6. Felten, E., Raj, M. & Seamans, R. (2021). Occupational, Industry, and Geographic Exposure to Artificial Intelligence. Strategic Management Journal 42(12), 2195–2217.
  7. Eloundou, T., Manning, S., Mishkin, P. & Rock, D. (2023). GPTs are GPTs: An Early Look at the Labor Market Impact Potential of LLMs. arXiv:2303.10130.
  9. Frey, C. B. & Osborne, M. A. (2013/2017). The Future of Employment: How Susceptible Are Jobs to Computerisation? Oxford Martin School.
  10. Arntz, M., Gregory, T. & Zierahn, U. (2016). The Risk of Automation for Jobs in OECD Countries. OECD SEM Working Papers No. 189.
  11. Acemoglu, D. (2024). The Simple Macroeconomics of AI. NBER WP 32487.
  13. Agrawal, A., Gans, J. & Goldfarb, A. (2018). Prediction, Judgment, and Complexity. NBER WP 24243.
  14. Loizos, C. (2025). Radiologists aren’t going anywhere. TechCrunch, May 14 2025.
  15. Quiroz-Gutierrez, M. (2026). A decade after the ‘Godfather of AI’ said radiologists were obsolete, their salaries are up and demand is growing. Fortune, May 4 2026.
  16. Rula, E. Y. / ACR Neiman HPI (2026). The Radiologist Shortage: A Workforce Update. ACR Bulletin, Feb 5 2026.
  17. Lakhani, K. R. (2023). AI Won’t Replace Humans — But Humans With AI Will Replace Humans Without AI. Harvard Business Review, Aug 4 2023.
  18. Jevons, W. S. (1865). The Coal Question, Ch. VII. Macmillan (Yale-hosted excerpt).
  20. Northeastern Global News (2025). What is Jevons Paradox? (incl. S. Nadella, Jan 2025). Northeastern University, Feb 7 2025.
  21. Maslej, N. et al. / Stanford HAI (2025). Artificial Intelligence Index Report 2025 (Ch. 4, Economy). Stanford University.
  22. OECD (2023). OECD Employment Outlook 2023: AI and the Labour Market.
  23. McKinsey Global Institute (2023). The economic potential of generative AI. McKinsey & Company.
  24. Brynjolfsson, E., Chandar, B. & Chen, R. (2025). Canaries in the Coal Mine? Six Facts about the Recent Employment Effects of AI. Stanford Digital Economy Lab WP, Nov 2025.
  25. Anthropic (2025). The Anthropic Economic Index. Anthropic, Feb 2025.