Spark, Kafka, Cassandra Support Anchors BlueData EPIC's Goals

By CIOReview | Monday, February 15, 2016

FREMONT, CA: Big Data underlies many of today's major technology trends, and enterprises are counting on Big Data analytics to sharpen the competitiveness of their services. To help them move from analysis to action, BlueData recently launched version 2.0 of its EPIC (Elastic Private Instant Clusters) software platform, which is used to build real-time data pipelines with Spark Streaming, Kafka and Cassandra on Docker containers. The solution targets organizations that build and test applications for fast-growing data analysis workloads, which demand a combination of awareness, decisiveness and action.

A traditional data pipeline is concerned with determining, scheduling and assigning tasks until execution finishes. By contrast, a real-time pipeline collects data streams, processes them, acts on the results immediately and stores the data for later analysis. It also requires that the actions taken be evaluated quickly and that models be updated accordingly.

The combination of Spark Streaming, Kafka and Cassandra is well suited to developing real-time data pipelines. This trio of open source frameworks offers high throughput and low latency across rapid prototyping, development, testing and quality assurance. Spark Streaming, an extension of the core Spark API, ingests real-time data from the publish-subscribe messaging system Kafka, filters and augments it, and stores the results in Cassandra, a scalable and resilient operational database. BlueData’s Pipeline Accelerator addresses the challenges of setting up this process. Web-based Zeppelin notebooks ease the sharing of datasets and analyses.
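The ingest, filter, augment and store flow described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: an in-memory queue stands in for a Kafka topic, a dictionary stands in for a Cassandra table, and the function names, the sensor-reading messages and the threshold are all hypothetical. It does not use the Spark, Kafka or Cassandra APIs, nor anything from BlueData EPIC.

```python
import json
from collections import deque


def ingest(topic, batch_size):
    """Drain the queue in micro-batches of at most `batch_size` messages."""
    while topic:
        batch = []
        while topic and len(batch) < batch_size:
            batch.append(json.loads(topic.popleft()))
        yield batch


def process(batch, threshold):
    """Drop readings below `threshold`, then add a derived `alert` field."""
    kept = [event for event in batch if event["reading"] >= threshold]
    return [{**event, "alert": event["reading"] >= 2 * threshold} for event in kept]


def store(table, rows):
    """Upsert rows keyed by (sensor, ts), loosely mimicking a primary key."""
    for row in rows:
        table[(row["sensor"], row["ts"])] = row


def run_pipeline(messages, batch_size=2, threshold=10):
    topic = deque(messages)  # stand-in for a Kafka topic (assumption)
    table = {}               # stand-in for a Cassandra table (assumption)
    for batch in ingest(topic, batch_size):
        store(table, process(batch, threshold))
    return table


# Hypothetical sensor messages, serialized as JSON strings.
messages = [
    json.dumps({"sensor": "a", "ts": 1, "reading": 5}),
    json.dumps({"sensor": "a", "ts": 2, "reading": 12}),
    json.dumps({"sensor": "b", "ts": 1, "reading": 25}),
]
table = run_pipeline(messages)
```

In a real deployment each stand-in would be replaced by the corresponding system: Kafka for the queue, a Spark Streaming job for the micro-batch processing, and Cassandra for the keyed store.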

The highly configurable EPIC platform stitches open source components together in a loosely coupled fashion, which in turn helps organizations cut investment in server and storage infrastructure by 75 percent. The vendor also supports complementary applications and frameworks for extending the pipeline and integrating add-ons as data volumes grow.

BlueData offers the real-time pipeline in two editions. The community edition, EPIC Lite, is intended for evaluation and personal use, whereas the Enterprise edition draws on business intelligence, analytics and visualization to provide a multitenant infrastructure platform that can be extended to additional static and dynamic big data applications with Spark and Hadoop support.

The new release claims to accelerate deployment by 100 times and promises to minimize the complexity of Big Data clusters. “BlueData allows organizations to take advantage of the simplicity and elasticity of the cloud-like services model for Hadoop and Spark – while retaining their data and systems on-premises,” remarks Tony Baer, Principal Analyst, Ovum.