Big Data IoT Revolution has Embraced Even the Oldest of Industries
The last decade proved that Big Data is indeed the future. Every industry is pursuing it. From the auto industry to farming, we are seeing an acceleration of generated data that is tucked away into massive Data Lakes.
The automobile revolution, including continuous auto-health monitoring, one-push driver assistance, and fully connected cars, is pushing the frontier of data collection. New cars produced around 2020 will on average produce over four terabytes of data every day. Dealing with this massive amount of data will challenge and completely revolutionize the big data technology stack.
This growth in Big Data is not restricted to consumer technologies; agriculture is going through its own revolution in the Internet of Things (IoT). Sensors and actuators all across already industrialized farmland are producing thousands, if not millions, of data measurements every second. The fight is on to revolutionize the farmer’s experience as more companies pour billions of dollars into ‘Big Ag Big Data’ research. From auto-piloted drones that deliver pesticides to weatherproofed buried sensors that give second-by-second updates on the microclimate where plants grow, the Big Data IoT revolution has embraced even the oldest of industries.
With all this available data, expectations for reduced costs and bigger returns are high. However, many companies are realizing that the grass on the other “big data side” is not as green as they thought. While Big Data has provided boosts to the bottom line, the next generation of Big Data will require substantial investment into Big Insights, not just Big Data. This represents one of the biggest transformational changes yet to be seen in the IT world.
This article discusses five important concepts that are sweeping the Big Data world.
Cloud transformation started with the Infrastructure-as-a-Service (IaaS) revolution. IaaS gives IT Operations the ability to rent servers seamlessly, without physically maintaining infrastructure inside the organization. As industry catches up to this, Platform-as-a-Service (PaaS) is becoming the new reality. PaaS further eases IT pains by relieving teams of mundane tasks like OS maintenance, allowing them to focus on higher-value goals such as cyber defense and faster compliance, all at a better cost.
The true revolution in cloud transformation is the nascent set of Serverless Computing technologies. Serverless Computing can largely remove the need to maintain servers, operating systems, and even applications. Customers now just write their code and hand it to the cloud service provider, which takes care of the rest. Actions are packaged into cleanly defined web functions, which are hosted on the cloud and scaled on demand. Offering functionality through a web Application Programming Interface (API) not only allows internal applications to scale with the business, but also enables companies to sell selected services externally. Such APIs are opening up opportunities for organizations to monetize their data lakes by extracting insights from the data.
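To make the idea of a "cleanly defined web function" concrete, here is a minimal sketch in Python. It assumes a hypothetical event format (a dictionary with a JSON string under `body`) and a hypothetical function name, `handle_request`; real cloud providers each define their own handler signatures, so treat this as an illustration of the stateless style rather than any provider's API.

```python
import json
from statistics import mean


def handle_request(event: dict) -> dict:
    """Stateless function: average the sensor readings sent in a JSON body.

    The event shape ({"body": "<json string>"}) is an assumption made for
    this sketch, not a specific provider's contract.
    """
    try:
        payload = json.loads(event.get("body") or "{}")
        readings = [float(x) for x in payload["readings"]]
        if not readings:
            raise ValueError("readings must not be empty")
    except (KeyError, TypeError, ValueError) as exc:
        # Malformed input is reported to the caller instead of crashing.
        return {"statusCode": 400, "body": json.dumps({"error": str(exc)})}

    # The function keeps no state between calls, which is what lets a
    # platform run as many copies as demand requires.
    insight = {"count": len(readings), "mean": mean(readings)}
    return {"statusCode": 200, "body": json.dumps(insight)}
```

Because the function holds no state of its own, the same code could serve one request or many in parallel; scaling becomes the platform's job rather than the developer's.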
Today, the data lake philosophy promotes a culture of stashing data away without asking the important question of what the data will be used for. With data flowing into organizations at lightning speed, it’s becoming important to extract meaning from data early in the creation and acquisition process. More and more organizations have to deal with garbage or irrelevant raw data occupying mind-boggling volumes in the lake. The problem has become significant enough to give rise to real-time analytics. Rather than just storing data, real-time analytics focuses on storing ‘insights’. This is not easy, however: it requires sophisticated analytics and sophisticated resources.
Organizations are investing in Data Science teams to condense large volumes of data through algorithms and store only meaningful information. Technologies like Apache Storm, Spark, and Azure Stream Analytics are starting to lay the foundation for real-time use of data science in decision making. Traditional Financial Services fraud models ran on older platforms and proprietary solutions and took over six months to refresh. The newer real-time platforms, by contrast, apply almost everywhere, transact at millisecond speeds, and can refresh models instantaneously.
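As a rough illustration of storing ‘insights’ rather than raw data, the sketch below condenses a stream of sensor readings into one small summary per time window. It is plain Python, not Storm or Spark code; the names `WindowSummary` and `summarize_stream`, the 60-second window, and the assumption that readings arrive in time order are all choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional, Tuple


@dataclass
class WindowSummary:
    window_start: int  # first second covered by the window
    count: int         # number of raw readings condensed
    mean: float
    maximum: float


def summarize_stream(
    readings: Iterable[Tuple[int, float]], window_seconds: int = 60
) -> Iterator[WindowSummary]:
    """Yield one summary per fixed-size window of (timestamp, value) readings.

    Assumes timestamps are integer seconds and arrive in non-decreasing order.
    """
    current_start: Optional[int] = None
    count, total, maximum = 0, 0.0, float("-inf")

    for timestamp, value in readings:
        start = timestamp - timestamp % window_seconds
        if current_start is not None and start != current_start:
            # The previous window is complete: emit its summary and reset.
            yield WindowSummary(current_start, count, total / count, maximum)
            count, total, maximum = 0, 0.0, float("-inf")
        current_start = start
        count += 1
        total += value
        maximum = max(maximum, value)

    if count:
        # Emit the final, possibly partial, window.
        yield WindowSummary(current_start, count, total / count, maximum)
```

Only the summaries would need to be stored; the raw readings can be discarded once their window has closed, which is the trade-off the real-time approach makes.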
Machine Learning (ML) is very closely related to analytics. In fact, many of the techniques used in traditional statistics and analytics form the foundation for modern Machine Learning and Artificial Intelligence solutions. The key difference, though, is that many Machine Learning techniques rely on human-labeled data to enter a continuous learning loop with their algorithms.
While ML represents a great opportunity for almost all industries, enabling production-ready ML systems requires platforms capable of state-of-the-art data ingestion, data transformation, and model hosting. Most organizations stop short when it comes to implementing such production-ready ML IT solutions. The key is to embrace Change Management to develop a new DevOps model that works seamlessly for engineers and scientists alike, while driving the entire company towards an analytics-driven organizational model.
The goal of moving towards an analytics-driven organization is to innovate faster and to experiment with new ideas quickly. This is especially true for customer-facing departments, solutions, and products. Fail-fast experimentation, especially social-media-driven A/B testing, is rapidly changing the way companies reach out to and experiment with their customers.
From web technology companies to traditional retailers, organizations are moving to innovative fail-fast experimentation and pushing the frontiers of traditional marketing. Setting up and running these A/B experiments is hard, and it requires a lot of back and forth between various parts of the organization before a viable outcome can be obtained.
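One part of an A/B experiment that can be sketched compactly is the final comparison between the two groups. The example below uses a standard two-proportion z-test, which is one common way to judge whether variant B’s conversion rate differs from variant A’s; the article does not prescribe a particular test, and the function name and the sample figures in the usage note are hypothetical.

```python
import math
from typing import Tuple


def two_proportion_z_test(
    conversions_a: int, visitors_a: int, conversions_b: int, visitors_b: int
) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference in conversion rates.

    Positive z means variant B converted at a higher rate than variant A.
    """
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b

    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    if std_err == 0:
        return 0.0, 1.0  # no variation at all, so no evidence of a difference

    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

For instance, `two_proportion_z_test(100, 1000, 150, 1000)` compares a 10% conversion rate against a 15% one. A small p-value suggests the difference is unlikely to be chance alone, though the threshold for acting on it remains a business decision.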
Testing-in-Production (TiP), as the name implies, shortens the typical DevOps cycle so that developers can push solutions out to customers with little or no delay. DevOps cycles are usually long because of the traditional Dev, Test, and Release stages.
TiP approaches this problem in an innovative way. To start, it’s common to define a metric of customer dissatisfaction that can be measured quickly and effectively through one of the various customer touchpoint channels. With this metric defined, developers are given near-production release authority. The platform is designed so that a new code release serves a new experience to a randomly selected group of customers, while a control population is held back for comparison. Customer dissatisfaction is constantly measured on both groups, and even the slightest increase in dissatisfaction automatically triggers the platform to revert to the old experience, thereby rejecting the new code release. This enables the organization to innovate rapidly with little or no friction in the ideation process.
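The mechanism described above can be sketched in a few lines of Python. Everything named here is hypothetical: `assign_group` stands in for the platform’s random selection (done with a hash so that a customer always sees the same experience), dissatisfaction is assumed to be measured as complaints per session, and `should_rollback` applies the "slightest increase" rule literally. A production system would probably add a statistical tolerance so that noise alone does not reject a release.

```python
import hashlib
from typing import Tuple


def assign_group(customer_id: str, treatment_share: float = 0.1) -> str:
    """Deterministically place a customer in 'treatment' or 'control'.

    Hashing the ID spreads customers pseudo-randomly while keeping each
    customer's assignment stable across visits.
    """
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 10_000  # 0..9999
    return "treatment" if bucket < treatment_share * 10_000 else "control"


def dissatisfaction_rate(complaints: int, sessions: int) -> float:
    """Assumed metric for this sketch: complaints per session."""
    return complaints / sessions if sessions else 0.0


def should_rollback(
    control: Tuple[int, int], treatment: Tuple[int, int], tolerance: float = 0.0
) -> bool:
    """Reject the release if treatment dissatisfaction exceeds control.

    Each argument is (complaints, sessions). With tolerance=0.0, any
    increase at all triggers a rollback, as the text describes.
    """
    return dissatisfaction_rate(*treatment) > dissatisfaction_rate(*control) + tolerance
```

A platform built along these lines would call `should_rollback` continuously as measurements arrive and revert to the old experience the first time it returns `True`.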
In essence, a series of new concepts is emerging to support Big Insights from Big Data in traditional industries. It will be important for leaders to pay attention to these technologies and avoid being drawn only into ‘Data Lake’ discussions.