Genomic Research is now Powered by Cluster Algorithms

By CIOReview | Friday, April 27, 2018

The same algorithm that analyzes the relationship between two individuals on a social media site can also be used to study protein interactions in various bodily functions. The graph-clustering algorithms deployed on social media sites can prove extremely fruitful in mapping the relationships within the huge networks of proteins in our bodies. This understanding will go a long way toward establishing the effectiveness of drugs on the human body and finding potential treatments for a variety of diseases. With today's technology, researchers can identify millions of proteins, genes, and other cellular components at once, and clustering algorithms can be applied to analyze the patterns and relationships among them, as well as other structural similarities. Although such techniques have been in practice for decades, the older tools could not accommodate the burgeoning amount of data. This is where scaling up the computing infrastructure comes in, offering a way out of the data bottleneck.

To manage the torrent of data at their disposal, scientists in the healthcare domain currently use the Markov Clustering (MCL) algorithm, which groups similar objects into clusters and enables holistic comparison and analysis. The reason behind the wide adoption of MCL is that it is relatively parameter-free: scientists don't have to tune a large set of parameters for minor alterations in the data. However, even an algorithm as turnkey as this has its limitations. Because MCL runs on a single compute node, it is computationally expensive and has a large memory footprint, which in turn limits the amount of data it can accommodate and cluster.
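To make the idea concrete, here is a minimal sketch of the MCL loop in Python with NumPy, assuming a small undirected graph supplied as a dense adjacency matrix. The function name markov_cluster, the default values, and the toy graph are illustrative assumptions, not taken from the article or from any production MCL implementation.

```python
import numpy as np

def markov_cluster(adjacency, expansion=2, inflation=2.0, max_iter=100, tol=1e-6):
    """Illustrative Markov Clustering on a dense adjacency matrix."""
    # Add self-loops, then normalise each column so it sums to 1
    # (each column becomes the random-walk distribution out of one node).
    m = np.asarray(adjacency, dtype=float) + np.eye(len(adjacency))
    m = m / m.sum(axis=0, keepdims=True)

    for _ in range(max_iter):
        previous = m
        # Expansion: take longer random walks by raising the matrix to a power.
        m = np.linalg.matrix_power(m, expansion)
        # Inflation: raise entries to a power and renormalise,
        # which strengthens strong connections and weakens weak ones.
        m = np.power(m, inflation)
        m = m / m.sum(axis=0, keepdims=True)
        if np.allclose(m, previous, atol=tol):
            break

    # Every row that still holds non-negligible mass defines one cluster:
    # the columns with mass in that row are the cluster's members.
    clusters = set()
    for row in m:
        members = tuple(int(i) for i in np.nonzero(row > 1e-6)[0])
        if members:
            clusters.add(members)
    return sorted(clusters)

# Hypothetical toy network: two triangles (0-1-2 and 3-4-5) joined by edge 2-3.
bridged = np.array([
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
])
print(markov_cluster(bridged))
```

In this sketch, inflation is the one setting that noticeably changes the result, with higher values tending to produce smaller clusters. The dense matrix power in the expansion step also hints at why a single-node run becomes costly in time and memory as the network grows.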

The next step for computational biologists is to mitigate the shortcomings of clustering algorithms and come up with techniques that can support large-scale genomic research. This move is essential, as genomic data is growing at breakneck speed and there will soon be demand for systems that can perform a quintillion calculations per second.