Without data on how artificial intelligence is affecting jobs, policymakers will fly blind into the next industrial revolution, warn Tom Mitchell and Erik Brynjolfsson.
A robot delivers takeaway food to customers in a trial in London. Photo: John Phillips/Getty
Advances in technology pose huge challenges for jobs. Productivity levels have never been higher in the United States, for example, but income for the bottom 50% of earners has stagnated since 1999 (see ‘Job shifts’). Most of the monetary gains have gone to a small group at the very top. Technology is not the only reason, but it is probably the most important one.
A report published on 13 April by the US National Academies of Sciences, Engineering, and Medicine details the impacts of information technology on the workforce [1]. We co-chaired the report committee and learnt a great deal in the process, including that technology will affect almost every occupation over the next 10–20 years. For example, self-driving vehicles could slash the need for drivers of taxis and long-haul trucks, and online education could enrich the retraining options for displaced workers.
Most important, we learnt that policymakers are flying blind into what has been called the fourth industrial revolution or the second machine age. There is a remarkable lack of data available on basic questions, such as: what is the scope and rate of change of the key technologies, especially artificial intelligence (AI)? Which technologies are already eliminating, augmenting or transforming which types of jobs? What new work opportunities are emerging, and which policy options might create jobs in this context?
At best, this paucity of information will lead to missed opportunities. At worst, it could be disastrous. If we want to understand, prepare for and guide the unpredictable impacts of advancing technology, we must radically reinvent our ability to observe and track these changes and their drivers.
Fortunately, many of the components of a fit-for-purpose data infrastructure are already in place. Digital knowledge about the economy is proliferating and has unprecedented precision, detail and timeliness. The private sector is increasingly adopting different approaches to generating data and using them in decision-making, such as A/B testing to compare alternatives. And technologies that protect privacy while allowing statistical summaries of large amounts of data to be shared are increasingly available.
We call for the creation of an integrated information strategy to combine public and privately held data. This would provide policymakers and the public with ways to negotiate the evolving and unpredictable impacts of technology on the workforce. Building on this, we call for policymakers to adopt an evidence-based ‘sense and respond’ approach, as pioneered by the private sector.
These are big changes, but the stakes for workers and the economy are high.
Many of the data needed to spot, understand and adapt to workforce challenges are not gathered in a systematic way or, worse, do not exist. The irony of our information age is that despite the flood of online data, decision-makers all too often lack timely, relevant information.
For instance, although digital technologies underpin many consumer services, standard US government data sources, such as the Current Population Survey conducted by the Bureau of Labor Statistics, don’t accurately capture the rise of the contingent or temporary workforce because they do not ask the right questions. Researchers and private-sector economists have tried to address this by commissioning their own surveys [2], but these lack the scale, scope and credibility of government surveys. Government administrative data, such as tax forms, are another potentially valuable source, but they need to be integrated with government survey data to provide context and validation [3].
Similarly lacking are metrics to track progress in the technologies and capabilities of AI. Moore’s law (that microprocessor performance doubles every two years or so) captures advances in the underlying semiconductors, but it does not cover rapid improvements in areas such as computer vision, speech and problem solving. A comprehensive index of AI would provide objective data on the pace and breadth of developments. Mapping such an index to a taxonomy of skills and tasks in various occupations would help educators to design programmes for the workforce of the future. Non-governmental groups, such as the One Hundred Year Study on Artificial Intelligence at Stanford University in California, are taking useful steps, but much more can and should be done at the federal level.
Happily, we are in the middle of a digital data explosion. As companies have come to understand the power of machine learning, they have begun to capture new kinds of data to optimize their internal processes and interactions with customers and suppliers. Most large companies have adopted software and data infrastructures to standardize and, in many cases, to automate tasks — from managing inventories and orders to handling staff holidays. Internet companies such as Amazon and Netflix routinely capture massive amounts of data to learn which products to show customers next, increasing sales and satisfaction. These lessons about real-time data collection — and the data themselves — can also be valuable to governments.
For example, websites for job-seekers contain data about millions of posts, the skills they require and where the jobs are. Universities have detailed information about how many students are taking which courses, when they will graduate and with which skills. Robotics companies have customer data showing demand for different types of automated assembly system. Technology-platform companies have data about how many freelance workers they employ, the hours they work and where. These sorts of information, if connected and made accessible in the right way, could give us a radically better picture of the current state of employment.
But hardly any such data are being shared now between organizations, and so we fail to capture their societal value. Reasons include the unwillingness of companies to divulge data that might be used by competitors. Privacy issues, cultural inertia and regulations against sharing are other obstacles.
Taking advantage of existing data needs a change in mindset [4]. Over the past decade, many corporations have moved from a ‘predict and plan’ approach to a ‘sense and respond’ one, which allows them to adapt quickly to a rapidly changing environment. By continuously collecting massive volumes of real-time data about customers, competitors, suppliers and their own operations, companies have learnt how to evolve their strategies and product offerings and to improve their profitability. The number of manufacturing firms adopting a data-driven approach to decision-making has more than tripled since 2005, reflecting the improvements it can bring to profitability and effectiveness [5].
The most nimble firms run real-time experiments to test different policies and products. For example, Internet companies routinely run A/B tests: presenting customers with different interfaces, measuring which is most effective, then adopting the most successful. We discussed this approach with Sebastian Thrun, founder of the online education provider Udacity. Through such experiments, the company learnt that it can dramatically improve retention of people on its courses by requiring students to apply for admission before beginning the course. Counter-intuitively, it also found that raising its prices in China tripled overall demand for its services.
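As a rough illustration of how the results of an A/B test might be evaluated, the sketch below applies a standard two-proportion z-test to two interface variants. The function name and the visitor counts are hypothetical; they are not drawn from Udacity or any other company mentioned here.

```python
import math


def ab_test(success_a, n_a, success_b, n_b):
    """Compare conversion rates of variants A and B with a two-proportion z-test.

    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    rate_a = success_a / n_a
    rate_b = success_b / n_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Every visitor (or no visitor) converted in both arms: no evidence of a difference.
        return rate_a, rate_b, 0.0, 1.0
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Hypothetical counts: 1,000 visitors per variant.
rate_a, rate_b, z, p_value = ab_test(100, 1000, 150, 1000)
print(f"A: {rate_a:.1%}  B: {rate_b:.1%}  z = {z:.2f}  p = {p_value:.4f}")
```

A small p-value suggests that the gap between the variants is unlikely to be chance alone. In practice, a firm would also fix the sample size in advance and weigh the size of the effect, not just its statistical significance.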
Governments can and must learn the lessons of data-driven decision-making and experimentation. In the face of rapid and unpredictable changes that have unknown consequences, they need to be able to observe those changes in real time, and to quickly test policy responses to determine what works. For example, the best policy for retraining displaced workers could be decided after trialling several different policies for workers within one region. The policies’ different impacts on employment could be observed for a year before moving forward with the one that produces the greatest re-employment. Authorities could continue to experiment to accommodate future changes.
One example of such an experiment was actually an accident. In 2008, the state of Oregon used a lottery process to randomize which of its citizens would be granted access to government health insurance (Medicaid), after an unexpected shortfall in state funding required funds to be rationed. The process provided invaluable information about the causal effects of the programme on health and well-being, and showed that Medicaid coverage led to an increase in preventive screening, such as for cholesterol [6]. There are many opportunities for more deliberate experimentation in government programmes. Because many programmes are implemented in phases, some randomization can be done at little or no cost.
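To make the idea of low-cost randomization concrete, here is a minimal sketch of a lottery-based phased roll-out. It uses a deliberately simplified difference in means, not the analysis used in the Oregon study. The function names, the seed and the outcome values are all hypothetical.

```python
import random


def run_lottery(applicant_ids, n_slots, seed=2008):
    """Randomly choose which applicants enter the first phase of a programme."""
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced and audited
    return set(rng.sample(list(applicant_ids), n_slots))


def difference_in_means(outcomes, winners):
    """Estimate the programme effect as mean(winners) minus mean(everyone else).

    `outcomes` maps applicant id -> measured outcome (for example, 1 if re-employed).
    """
    treated = [y for pid, y in outcomes.items() if pid in winners]
    control = [y for pid, y in outcomes.items() if pid not in winners]
    if not treated or not control:
        raise ValueError("need at least one treated and one control applicant")
    return sum(treated) / len(treated) - sum(control) / len(control)


# Hypothetical example: 10 applicants, 4 first-phase places.
applicants = list(range(10))
winners = run_lottery(applicants, n_slots=4)
# Outcomes would be observed later; these values are invented for illustration.
outcomes = {pid: pid % 2 for pid in applicants}
print(sorted(winners), difference_in_means(outcomes, winners))
```

Because chance alone decides who enters the first phase, the two groups are comparable on average, which is what lets a simple comparison of outcomes be read as a causal effect.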
Digital data should not be treated as a substitute for information that is collected in more conventional ways by the government. It often makes government data more valuable, not less. Typically, the ‘digital exhaust’ data trail that is generated as a by-product of digitizing an organization’s processes, goods and services does not fully capture or represent the underlying phenomena. For example, according to our analyses, Java programmers are well represented in databases of the employment-networking platform LinkedIn, but truck drivers are not. Not everyone has a smartphone, let alone a particular app. The use of digital payment tools, social networks or search engines varies across demographic categories and other variables of interest.
Although terabytes and exabytes of data are now available, they need to be calibrated and validated. The best way to do that is often through the kinds of systematic survey (such as a national census) and administrative data that the government collects. And, like industry, government should leverage more types of digital data that are collected as a by-product of its operations — for instance, automatic toll collections or taxes.
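One common way to calibrate a skewed digital sample against census-style figures is post-stratification: each group is reweighted so that its share of the sample matches its known share of the population. The sketch below assumes that approach. The occupations echo the example above, but every number in it is invented for illustration.

```python
def post_stratification_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    return {
        group: population_shares[group] / (count / total)
        for group, count in sample_counts.items()
    }


def weighted_mean(group_means, sample_counts, weights):
    """Average a per-group statistic after reweighting each group."""
    numerator = sum(sample_counts[g] * weights[g] * group_means[g] for g in group_means)
    denominator = sum(sample_counts[g] * weights[g] for g in group_means)
    return numerator / denominator


# Hypothetical platform sample: programmers over-represented, truck drivers under-represented.
sample_counts = {"programmer": 800, "truck_driver": 200}
# Hypothetical population shares, as might come from a census or government survey.
population_shares = {"programmer": 0.2, "truck_driver": 0.8}
# Hypothetical share of each group seeking retraining, as seen in the platform data.
group_means = {"programmer": 0.1, "truck_driver": 0.4}

weights = post_stratification_weights(sample_counts, population_shares)
raw = sum(sample_counts[g] * group_means[g] for g in group_means) / sum(sample_counts.values())
calibrated = weighted_mean(group_means, sample_counts, weights)
print(f"raw estimate: {raw:.2f}, calibrated estimate: {calibrated:.2f}")
```

Reweighting corrects only for skew along the dimensions that the benchmark data measure. It cannot fix groups that are missing from the digital sample altogether, which is one reason systematic surveys remain essential.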
Collecting truly representative data will at times require the force of law for compliance and anonymity. It might also require new modes of public–private partnerships, including ways to incentivize the collection of data that are of great value to society but of little direct value to the private organization that is best positioned to collect them. This reflects the fact that information, which can often be shared at close to zero marginal cost, is the ultimate public good [7]. For example, job-placement websites might have little reason to publish statistics about which laid-off workers from one economic sector are getting new jobs of a certain type owing to skills obtained from a particular retraining programme. This holds true even if such trends are visible in their data, cost no money to share and are valuable to newly displaced workers.
We have spoken to leaders at private organizations including human-resource consultants Manpower in Milwaukee, Wisconsin; LinkedIn of Mountain View, California; and job-market analytics firm Burning Glass Technologies in Boston, Massachusetts. All have expressed an openness to such data sharing.
A rational public strategy for managing the jobs revolution calls for a clear and comprehensive picture of the changes. Obtaining that picture will require three things. First, we must find ways to collect data and statistical summaries from diverse sources, including private organizations. Second, a trusted broker is needed to protect data privacy, access, security, anonymity and other rights of data providers, and to provide summaries for the public (much as the US Census and other statistical agencies currently do). Third, we need ways to integrate data from sources that reflect different statistical sampling skews and biases, normalizing the data where possible and flagging any remaining biases.
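The text does not name a specific privacy technique for the trusted broker. As one well-known option, the sketch below assumes differential privacy and uses the Laplace mechanism to add calibrated noise to a count before it is published. The function name, the count and the privacy budget `epsilon` are hypothetical.

```python
import random


def noisy_count(true_count, epsilon, rng=None):
    """Release a count with Laplace noise, assuming each person changes the count by at most 1.

    Smaller `epsilon` means more noise and stronger privacy protection.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = 1.0 / epsilon  # Laplace scale for a query with sensitivity 1
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on zero with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Hypothetical summary: number of displaced workers in one region who found new jobs.
released = noisy_count(true_count=1250, epsilon=0.5, rng=random.Random(42))
print(round(released))
```

The published figure stays close to the true count for large groups, while the noise makes it hard to infer whether any one individual is in the data. Choosing `epsilon` is a policy decision about the trade-off between accuracy and privacy.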
This new information infrastructure should be integrated with existing core indexes that track key measures such as employment, earnings, recruitment, lay-offs, resignations and productivity — and combined with powerful data sources from the private sector. This will enable statistics and analysis to shed light on standard key indicators of the economy in the context of ongoing change.
Perfection here is not a prerequisite for utility — anything is better than flying blind. Investing in an infrastructure that enables continuous collection, storage, sharing and analysis of data about work is one of the most important and urgent steps any government can take.
- Nature 544, 290–292 (20 April 2017) doi:10.1038/544290a
1. National Academies of Sciences, Engineering, and Medicine. Information Technology and the U.S. Workforce: Where Are We and Where Do We Go from Here? (National Academies Press, 2017).
2. The Rise and Nature of Alternative Work Arrangements in the United States, 1995–2015. NBER Working Paper No. 22667 (National Bureau of Economic Research, 2016); available at http://go.nature.com/2nnusoe
3. Measuring the Gig Economy: Current Knowledge and Open Issues (paper presented at NBER Conference on Research in Income and Wealth: Measuring and Accounting for Innovation in the 21st Century, Washington DC, 10–11 March 2017); available at http://go.nature.com/2ohelvc
4. Harvard Bus. Rev. 90, 61–67 (2012).
5. Am. Econ. Rev. 106, 133–139 (2016).
6. N. Engl. J. Med. 368, 1713–1722 (2013).
7. The Rate and Direction of Inventive Activity: Economic and Social Factors (ed. National Bureau Committee for Economic Research) 609–626 (Princeton Univ. Press, 1962); available at http://go.nature.com/2omibrh