Data: The New (Final) Frontier
The bottom-line benefit of artificial intelligence has put data in the spotlight in recent years. As data finally takes its rightful place at the "center" of everything, that visibility and increased awareness bring greater responsibility.
While data has always been important in financial services, it has not necessarily been given the spotlight it deserved. In the old days, data was written into manual ledgers to keep track of inventory, account balances, expenses, etc. If anyone wanted to know the balance of an account, a person was responsible for checking a physical ledger to provide the answer, or in other words, the data. They were the original data managers.
Along came automation, where physical ledgers became files on a computer (and eventually databases), and more people had access to find the data they were looking for. It was a virtual replication of the original physical process and ledger. Then came the age of distributed computing. Now data needed to be shared across these distributed systems, and as more and more systems were developed, the web of interactions grew ever more complex.
These distributed systems needed quick access to the data, and they needed it in a format they could use and understand. As a result, copies of the data were stored in each system's proprietary format, which created several difficulties.
The first challenge was to ensure all of these systems held the same values for the data. Next, you had to make sure the systems were updated on a timely basis with any changes or new data. Lastly, with all of these copies of data (in different formats), it became difficult to know where to get the data you needed: you had many options but were not aware of the implications of your choices.
This dilemma forced the next generation of data managers to be technologists. These second-generation data managers had to solve the problem of moving data from system to system: analyze each system's data needs, identify where to obtain the data, decide how the data would be kept up to date, and determine how to transform data from one format to another system's format.
Most of the early use cases were for operational purposes, but some visionary companies recognized that data was an asset that could be leveraged and mined for valuable insights. These companies developed a new role that would focus exclusively on data. While the titles and responsibilities varied, this was the first time that a person was dedicated to data.
After the financial crisis, the creation of the Chief Data Officer (CDO) became commonplace across financial services. The CDO became a necessity and not having one was a liability.
The initial Chief Data Officers were exclusively focused on defining and implementing data management and data governance practices into the organization to improve the quality of data. This meant the CDO needed to partner with the business, operations and technology to reduce the complexity by converging the disparate data into actively managed data sources.
Data management introduced a new methodology and a primitive toolset for moving data through a process that ensured its integrity and quality and provided a clear source of data to those who needed it. However, these processes and toolsets could not keep up with the volume of data and the complexity of the environment. Data management therefore had to prioritize, identifying the most critical data elements and business processes so its efforts would have a positive impact on what mattered most to the firm. The problem: demand outpaced the ability to deliver, because the data processes were new and mostly manual.
Amplifying the demand was the fact that more companies were becoming aware of the value of data and how it could change their entire business model. Through analytical tools, businesses were able to improve their customers' experience, offer new and innovative products, and give their salesforce the information to enhance services. However, these analytical tools require vast amounts of data from a multitude of new (and existing) sources. As the uses of data continue to multiply, so will the sources of data, and keeping up with demand will become more and more difficult for data management.
Something needed to change. To keep pace with the vast amount of data coming into the organization, we need to automate the acquisition, identification, classification and quality-checking processes.
We need "the machine" (i.e., machine learning software) to do the heavy lifting and perform the functions of a traditional data analyst. Ultimately, the acquisition of a new data set should be fully automated. Once the data (and metadata) is obtained, the machine would perform data discovery based on training and continual learning from previous data (via "supervised learning"). It would tag and catalog the data against the business dictionary and perform data quality analysis to identify flaws in the data and report them to an analyst. Any data that the machine cannot figure out would be escalated to a data analyst, who would then encode the new input into the machine as learning.
Over time, the machine would perform more and more of the data management process, while the data analyst focuses only on the complex (or new) data that the machine cannot yet interpret.
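The loop described above can be sketched in code. This is a minimal, hypothetical illustration, not a real implementation: the business dictionary, the prefix-matching "classifier", and the confidence threshold are all stand-ins for a trained model and a production metadata catalog. The key shape is the same, though: tag confidently recognized fields automatically, escalate the rest to an analyst, and fold the analyst's answer back into the machine's knowledge so the same field never escalates twice.

```python
import os

# Illustrative business dictionary mapping raw field names to business terms.
BUSINESS_DICTIONARY = {
    "acct_bal": "Account Balance",
    "cust_id": "Customer Identifier",
}

CONFIDENCE_THRESHOLD = 0.8  # illustrative cutoff for automatic tagging


def classify(field_name):
    """Return (business_term, confidence) for a raw field name."""
    if field_name in BUSINESS_DICTIONARY:
        return BUSINESS_DICTIONARY[field_name], 1.0
    # Crude shared-prefix similarity as a stand-in for a real trained model.
    best_term, best_score = None, 0.0
    for known, term in BUSINESS_DICTIONARY.items():
        prefix = len(os.path.commonprefix([known, field_name]))
        score = prefix / max(len(known), len(field_name))
        if score > best_score:
            best_term, best_score = term, score
    return best_term, best_score


def ingest(field_name, ask_analyst):
    """Tag a field automatically, or escalate to the analyst and learn."""
    term, confidence = classify(field_name)
    if confidence >= CONFIDENCE_THRESHOLD:
        return term, False                    # machine handled it
    term = ask_analyst(field_name)            # escalate the unknown field
    BUSINESS_DICTIONARY[field_name] = term    # encode the answer as learning
    return term, True
```

In this sketch, the first time an unrecognized field such as "trade_dt" arrives it is escalated; after the analyst's answer is recorded, subsequent arrivals of that field are tagged automatically, which is exactly the shrinking-escalation pattern the paragraph above describes.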
This process is not revolutionary but evolutionary. We need to keep pushing data management professionals, tool vendors and technologists to move us along this continuum. It is through these efforts that we will keep pace with continually growing demand on our path to a truly data-driven world.