Recent Trends in Machine Learning
Sears is a senior leader with expertise in the areas of data science and analytics as well as enterprise and internet technology. Over the past 15 years, Sears has spent time leading and innovating in numerous industries, including healthcare, telecommunications, and financial services. Sears currently leads MassMutual’s technology strategy, enterprise architecture, and data science & advanced analytics functions. Sears is a frequent speaker at industry conferences and has numerous patents and publications in the areas of machine learning, technology and financial services. Sears has been recognized as one of the life insurance industry’s top 25 innovators under 40 by LIMRA and a top 100 innovator by Corinium Intelligence.
Machine learning and artificial intelligence have experienced an explosion in both scientific advancement and commercial investment. As a result, any firm with a need and a desire to deploy machine learning for the purposes of business optimization or growth can easily do so. In this article, we briefly review trends in the fundamental building blocks of machine learning: technology, data collection, and algorithmic methods. We then touch on emerging social risks and emerging regulatory responses.
Machine learning technology has become increasingly specialized and has co-evolved with the data collection and methods building blocks. Originally, machine learning leveraged the basic components of computers: memory, storage, and processors. As data sets grew in size and algorithms grew in complexity, machine learning technology became more complex as well. Today, machine learning systems are optimized for large data sets through distributed computing architectures, and for the newest complex algorithms through banks of graphics processing units. Enterprise software firms, cloud providers, innovative teams within large firms, and startups have all developed frameworks and services that enable effective use of these powerful capabilities. These services now allow machine learning models to be trained and deployed easily and automatically.
Data of all types used for machine learning continues to become more readily available and accessible as consumers and businesses digitally transform their habits and processes. This transformation has led to the adoption and deployment of other ancillary technologies that generate data suitable for machine learning algorithms. Environmental sensors, financial services transactions and statements, medical devices and tests, video, image and natural language systems are being used in every corner of our economy and social systems.
Because of this proliferation of data, firms specializing in the curation of this information and its associated metadata have begun to grow. It is now possible not only to purchase data from online data brokers, but also to scale up data collection and labeling efforts using crowdsourcing platforms. As we will discuss, the ease of use and scale of operations associated with these data platforms are two major drivers of emerging regulatory scrutiny and of consumer apprehension about the pervasive use of machine learning in applications that power the economy and touch consumers.
Machine learning methods have a rich history. Over the past decade, statistical methods have evolved from traditional linear and tree-based methods that could be trained with relatively small data sets to complex, non-linear models containing millions of parameters and requiring massive amounts of data for training. This explosion in method complexity co-evolved with the explosion of data and computing power.
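To make the contrast concrete, here is a minimal sketch of a "traditional" statistical method: an ordinary least-squares fit of a line, computed in closed form from a handful of points. The data below are invented for illustration; the point is that this model has only two parameters and can be estimated from a dozen observations, whereas the deep models described next have millions of parameters and require massive training sets.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # covariance of x and y, and variance of x (both unnormalized)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Illustrative data only: a few noisy observations around y = 2x
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 4.0, 6.2, 7.9, 10.1, 12.0]
slope, intercept = fit_line(xs, ys)
print(round(slope, 2), round(intercept, 2))
```

Two parameters, six data points, and no specialized hardware: this is the regime in which statistical modeling operated for most of its history.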
Today, deep learning algorithms are capable of extracting patterns from increasingly complex data domains, including categorizing articles, classifying objects in images and video, and forecasting certain classes of time series. Machines can now perform some of these tasks more accurately than humans. However, higher-order tasks, as well as learning causal relationships directly from data, remain out of reach.
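As a toy illustration of the article-categorization task mentioned above, the sketch below turns text into bag-of-words vectors and assigns the label whose training examples share the most words with the input. This is far simpler than a deep learning system, and all of the data is invented, but the underlying idea of representing text numerically and comparing it to labeled examples is the same.

```python
from collections import Counter

def bow(text):
    """Bag-of-words representation: word -> count."""
    return Counter(text.lower().split())

def overlap(a, b):
    # number of word occurrences the two bags share
    return sum(min(a[w], b[w]) for w in a)

# Tiny invented training set of labeled "articles"
train = [
    ("rates bonds equities portfolio", "finance"),
    ("credit loans interest banking", "finance"),
    ("patient clinical diagnosis treatment", "health"),
    ("vaccine trial symptoms hospital", "health"),
]

def classify(text):
    scores = Counter()
    for doc, label in train:
        scores[label] += overlap(bow(text), bow(doc))
    return scores.most_common(1)[0][0]

print(classify("banking interest rates"))
```

A deep learning classifier replaces the hand-built word counts with learned representations, which is what lets it handle images, video, and nuanced language rather than simple keyword overlap.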
The rapid and comprehensive evolution of technology, data collection, and algorithmic methods has led to wide adoption of these capabilities. Today, machine learning powers large parts of the economy. In financial services, it is routinely used to trade stocks and to issue credit cards and insurance policies. In healthcare and life sciences, machine learning is used to forecast flu outbreaks, detect cancer, and develop new drugs. Major telecommunications firms use machine learning to detect fraud and to forecast customer churn and bandwidth demand. As businesses and governments have taken up these capabilities, the social and economic risks surrounding their use, such as error, bias, and lack of interpretability, have grown to the point where consumer advocates, government policymakers, and large firms have taken notice and have begun to think critically about proper governance of these powerful tools.
Ensuring that machine learning algorithms make consistent decisions and avoid harmful discrimination, while respecting consumers’ privacy, has become a priority. Large firms including Google and Facebook have published principles for responsible use of these tools and have created internal teams and systems to monitor and certify their algorithms. In Europe and the United States, regulatory bodies have begun to publish regulations and release their own principles for responsible use of data and algorithms. Examples include Europe’s General Data Protection Regulation, the California Consumer Privacy Act, and the National Association of Insurance Commissioners’ AI Principles.
Machine learning is used pervasively to power the economy and improve quality of life on a global scale. Machine learning makes it possible to instantly access credit, assist in the diagnosis of diseases, and detect criminal acts. At the same time, machine learning has also amplified the need to have transparent and robust approaches to the governance of these algorithms and the data that power them.