How Does Concept Drift Affect Machine Learning Models?

By CIOReview | Tuesday, July 16, 2019

Concept drift undermines the effectiveness of machine learning models, so the data they rely on needs to be updated continuously.

FREMONT, CA: The data generated by components is prone to change over time. Using this data without accounting for the change can lead to poor analytical results. The change usually stems from varying operating and environmental conditions around the components. This phenomenon, often referred to as concept drift, affects the relationships that machine learning algorithms learned from earlier data.

For instance, in weather data, the season is a concept that is not specified in the temperature readings but can still affect them. Concept drift poses little problem in static use cases. In most other cases, however, the relationship between input and output parameters is prone to change over time.
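The effect of a changing input-output relationship can be illustrated with a minimal sketch. Everything here is hypothetical and chosen only for illustration: the `static_model` function stands in for a deployed model, and the factors 2 and 3 are made-up relationships before and after drift.

```python
def mean_absolute_error(model, samples):
    """Average absolute gap between model output and observed output."""
    return sum(abs(model(x) - y) for x, y in samples) / len(samples)


def static_model(x):
    # Hypothetical model, learned when the output was 2 * input.
    return 2.0 * x


# Observations gathered while the learned relationship still held.
before_drift = [(x, 2.0 * x) for x in range(1, 6)]

# Observations after the relationship shifted to 3 * input (assumed drift).
after_drift = [(x, 3.0 * x) for x in range(1, 6)]

error_before = mean_absolute_error(static_model, before_drift)
error_after = mean_absolute_error(static_model, after_drift)

print("error before drift:", error_before)
print("error after drift:", error_after)
```

The model itself never changes, yet its error grows once the data no longer follows the relationship it was built on.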

One way to overcome this problem is to maintain these relationships and formulas in the cloud. As the parameters change, the new formulas and relationships can be updated there. In the case of edge computing, however, the updated relationships need to be transferred to the devices. The challenge lies in applying the changes to low-cost sensors with limited memory, which cannot accommodate over-the-air firmware updates.

When machine learning models are deployed, baseline performance measures such as accuracy and level of skill have to be recorded and then monitored continuously for changes. A significant variation can indicate concept drift and has to be addressed before it affects the analytics data. Organizations should expect concept drift to occur and update their models periodically. However, the major challenge lies in updating the edge sensors without causing downtime. Hence, it is advisable to develop sensors with the built-in capacity to accept model updates without the need to update entire frameworks.
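One possible way to monitor a deployed model against its recorded baseline is sketched below. This is only an illustration under stated assumptions: the `DriftMonitor` class, the window size, and the tolerance are hypothetical choices, not something the article prescribes.

```python
from collections import deque


class DriftMonitor:
    """Flags possible concept drift when rolling accuracy drops below a baseline."""

    def __init__(self, baseline_accuracy, window_size=100, tolerance=0.05):
        self.baseline_accuracy = baseline_accuracy  # accuracy recorded at deployment
        self.tolerance = tolerance                  # allowed drop before flagging (assumed value)
        self.window_size = window_size
        self.outcomes = deque(maxlen=window_size)   # 1 = correct, 0 = wrong

    def record(self, prediction, actual):
        """Store whether the latest prediction matched the observed outcome."""
        self.outcomes.append(1 if prediction == actual else 0)

    def rolling_accuracy(self):
        """Accuracy over the most recent window, or None if nothing is recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def drift_suspected(self):
        """True once a full window shows accuracy below baseline minus tolerance."""
        if len(self.outcomes) < self.window_size:
            return False
        return self.rolling_accuracy() < self.baseline_accuracy - self.tolerance


# Usage: baseline of 90% accuracy, judged over the last 10 predictions.
monitor = DriftMonitor(baseline_accuracy=0.9, window_size=10, tolerance=0.05)
for _ in range(10):
    monitor.record(prediction=1, actual=1)   # model is still correct
healthy = monitor.drift_suspected()

for _ in range(5):
    monitor.record(prediction=1, actual=0)   # model starts to miss
degraded = monitor.drift_suspected()
```

A flag from a monitor like this is only a signal that the model may need updating; the drop in accuracy could also have other causes, so it is worth investigating before retraining.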

Concept drift can occur in credit card spending tracking and fraud detection algorithms. Other commonly affected areas include security surveillance, retail marketing, advertising, and healthcare, where data keeps changing over time.

Ultimately, organizations cannot expect data to remain stable over a long period, because environmental conditions change steadily. Machine learning models cannot succeed if they are developed for perfect-world conditions. To overcome this challenge, organizations need to keep updating the underlying data as the related components and environmental conditions change.
