General purpose technologies [7], such as the steam engine and the computer, have historically been strong
drivers of economic growth, impacting a broad range of sectors and accelerating this impact with each new
technical advancement. In the last several years, generative AI has come to the fore as the next candidate
general purpose technology [17], capable of improving or speeding up tasks as varied as medical diagnosis [27]
and software development [14]. These capabilities are reflected in the astounding rate of AI adoption: nearly
40% of Americans report using generative AI at home or work, outpacing the early diffusion of the personal
computer and the internet [6]. Given this widespread adoption and potential for economic impact, a crucial
question is which work activities are being most affected by AI and, by extension, which occupations.
We provide evidence towards answering this question by identifying the work activities performed in
real-world usage of a mainstream large language model (LLM)-powered generative AI system, Microsoft
Bing Copilot (now Microsoft Copilot). We analyze 200k anonymized user–AI conversations, which were
automatically scrubbed for any personally identifiable information, sampled representatively from 9 months
of Copilot usage in the U.S. during 2024. A key insight of our analysis is that there are two distinct ways in
which a single conversation with an AI assistant can affect the workforce, corresponding to the two parties
engaged in conversation. First, the user is seeking assistance with a task they are trying to accomplish; we
call this the user goal. Analyzing user goals allows us to measure how generative AI is assisting different
work activities. In addition, the AI itself performs a task in the conversation, which we call the AI action.
Classifying AI actions separately lets us measure which work activities generative AI is performing. To
illustrate the distinction, if the user is trying to figure out how to print a document, the user goal is to
operate office equipment, while the AI action is to train others to use equipment.
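To make the two labels concrete, the sketch below shows one way the dual classification could be represented in code. It is a minimal illustration only: the dataclass, its field names, and the example activity strings are placeholders chosen for exposition, not the paper's actual classification pipeline or O*NET identifiers.

```python
from dataclasses import dataclass

# Hypothetical representation of the two labels assigned to a single
# conversation: the work activity the user is trying to accomplish
# (user goal) and the work activity the AI performs in its reply
# (AI action). Field names and values are illustrative placeholders.

@dataclass
class ConversationLabels:
    user_goal_activity: str   # work activity the user seeks assistance with
    ai_action_activity: str   # work activity the AI itself performs

# The printing example from the text: the same conversation maps to
# different work activities depending on whose side is classified.
example = ConversationLabels(
    user_goal_activity="Operate office equipment",
    ai_action_activity="Train others to use equipment",
)
```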
To measure how AI usage indicates potential occupational impact, we classify conversations into work
activities as defined by the O*NET database [29], which decomposes occupations hierarchically into the work
activities performed in those occupations. We measure how successfully different work activities are assisted
or performed by AI, using both explicit thumbs up and down feedback from users and a task completion
classifier. To distinguish between broad and narrow AI contributions towards work activities, we also classify
the scope of AI impact demonstrated in each conversation toward each matching work activity. From these
classifications, we compute an AI applicability score for each occupation. This score captures whether there is non-trivial AI usage that successfully completes activities corresponding to significant portions of an occupation's tasks.
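As a rough illustration of how the per-activity measurements could roll up into an occupation-level score, the following Python sketch combines usage share, completion rate, and impact scope with placeholder thresholds. The function name, input structure, and cutoff values are assumptions made for exposition; they are not the study's actual definition of the AI applicability score.

```python
# Hypothetical sketch of an occupation-level AI applicability score.
# The usage-share, completion-rate, and scope thresholds below are
# illustrative placeholders, not the thresholds used in the study.

def ai_applicability_score(
    occupation_activities: list[str],
    usage_share: dict[str, float],     # share of conversations matching each activity
    completion_rate: dict[str, float], # task-completion / thumbs-up rate per activity
    scope: dict[str, float],           # average scope of AI impact per activity, in [0, 1]
    min_usage: float = 0.001,
    min_completion: float = 0.5,
    min_scope: float = 0.5,
) -> float:
    """Fraction of an occupation's work activities with non-trivial,
    successful, and sufficiently broad AI usage (illustrative only)."""
    def applicable(activity: str) -> bool:
        return (
            usage_share.get(activity, 0.0) >= min_usage
            and completion_rate.get(activity, 0.0) >= min_completion
            and scope.get(activity, 0.0) >= min_scope
        )

    if not occupation_activities:
        return 0.0
    return sum(applicable(a) for a in occupation_activities) / len(occupation_activities)
```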
Our user goal vs. AI action distinction, combined with their classification into work activities, relates
to a key question in the literature and public discourse around
∗This study was approved by Microsoft IRB #11028. We thank Jennifer Neville, Ashish Sharma, Hancheng Cao, the
Microsoft Research AI Interaction and Learning Group, and the Microsoft Research Computational Social Science Working
Group for helpful discussions and feedback, and David Tittsworth, Jonathan McLean, Patrick Bourke, Nick Caurvina, and Bryan
Tower for software and data engineering support. Correspondence to: [email protected], [email protected],