Google Cloud Dataflow for Large Scale Data Processing

By CIOReview | Tuesday, May 10, 2016
Andrew C. Oliver, President and Founder, Mammoth Data

FREMONT, CA: Mammoth Data, a big data consulting firm, recently presented the findings of its comprehensive cloud solution benchmark study, which compares Google Cloud Dataflow and Apache Spark. "Mammoth Data found that Cloud Dataflow outperformed Apache Spark, underscoring our commitment to balance performance, simplicity and scalability for our customers," said Eric Schmidt, product manager for Google Cloud Dataflow. "Google Cloud Platform data processing and analytics services are aimed at removing the implementation complexity and operational burden found in traditional Big Data technologies."

Specializing in combining modern big data technologies with high-level strategies, Mammoth Data creates systems that organize and transform unstructured data into highly valuable business intelligence. "We go beyond developing dashboards and reports and help businesses with improvised decision making," said Andrew C. Oliver, President and Founder, Mammoth Data.

Mammoth Data teamed up with Google to compare Google Cloud Dataflow with well-known alternatives such as Apache Spark, aiming to provide comprehensible metrics after noting a lack of available comparisons between such technologies. Google Cloud Dataflow delivers a unified model for batch and streaming analysis, including on-demand resource allocation, full lifecycle resource management, and auto-scaling of resources. The study further suggested that Google Cloud Dataflow delivers greater performance and integrates easily with other services.

"When Google asked us to compare Dataflow to other Big Data offerings, we knew this would be an exciting project," said Oliver. "We were impressed by Dataflow's performance, and think it is a great fit for large-scale ETL or data analysis workloads. With the Dataflow API now part of the Apache Software Foundation as Apache Beam, we expect the technology to become a key component of the Big Data ecosystem."