
MapR Integrates SQL and JSON with Apache Drill v1.6

By CIOReview | Tuesday, April 12, 2016

SAN JOSE, CA: MapR Technologies Inc. has been providing the software industry’s only Converged Data Platform for years now. The company’s MapR Platform, known for enabling global real-time data applications, is powered by the industry’s fastest, most reliable, secure, and open data infrastructure. The San Jose, CA-based software company has now announced the availability of Apache Drill 1.6 as the unified SQL layer for the MapR Converged Data Platform, made possible by tighter integration with MapR-DB. The Converged Data Platform unifies the MapR-DB document database with an ANSI SQL engine, so that insights can be unlocked with industry-standard Business Intelligence (BI) tools.

The flexibility to run reports and analytics on JSON (JavaScript Object Notation) data stored in MapR-DB tables shortens time-to-value for insights gleaned from operational data, which benefits customers and partners alike. Version 1.6 of Apache Drill offers a new MapR-DB document database plug-in, enhanced performance and scale, and an optimized experience with Tableau and other BI tools. “The Apache Drill project has one of the fastest release velocities in the Hadoop ecosystem with a new release nearly every month,” reports Hadoop Weekly.

“Apache Drill is a game changer for us,” said Edmon Begoli, CTO of PYA Analytics. “Most recently, we have been able to query, in less than 60 seconds, two years’ worth of flat PSV files of claims, billing, and clinical data from commercial and government entities, such as the Centers for Medicaid and Medicare Services. Drill has allowed us to bypass the traditional approach of ETL and data warehousing, convert flat files into efficient formats such as Parquet for improved performance, and use plain SQL against very large volumes of files.”

Highlights of Drill 1.6 range from enhanced query performance and better memory management to flexible analytics on NoSQL data and improved integration with visualization tools such as Tableau. With the new MapR-DB document database plug-in, analysts can run SQL queries directly on JSON data stored in MapR-DB tables. The plug-in offers several kinds of pushdown capabilities that provide an optimal interactive experience. Query planning improvements, including metadata caching, partition pruning, and other optimizations, deliver better query performance on data in Hadoop and NoSQL systems.
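To give a sense of what running SQL directly on JSON data can look like in practice, here is a minimal Python sketch that prepares a query for Drill's REST endpoint. The host and port, the file path `/data/orders.json`, and the field names `customer.name` and `amount` are all hypothetical and chosen for illustration only. The query targets a JSON file through Drill's `dfs` storage plug-in rather than a MapR-DB table, and the request is built but not sent.

```python
import json
import urllib.request

# Assumption: a Drillbit running locally with its web interface on port 8047.
DRILL_URL = "http://localhost:8047/query.json"

# Hypothetical JSON file and field names. The table alias "t" is used to
# reach into the nested "customer" object without any prior schema or ETL.
SQL = (
    "SELECT t.customer.name AS name, t.amount "
    "FROM dfs.`/data/orders.json` t "
    "WHERE t.amount > 100"
)


def build_query_request(sql, url=DRILL_URL):
    """Build (but do not send) a POST request carrying a SQL query for Drill."""
    body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request(SQL)
payload = json.loads(request.data)  # decode the body again for inspection
print(payload["query"])
```

Against a running Drill instance, passing `request` to `urllib.request.urlopen` would submit the query. The filter on `amount` is the kind of predicate a storage plug-in may be able to push down to the data source, although whether that happens depends on the plug-in and the query.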

Media companies, for instance, can analyze several terabytes of content delivery network (CDN) logs without first transforming the data. They can therefore query and analyze incoming CDN files instantly, which helps reduce customer attrition. “Operational analytics on document databases such as MapR-DB is a rapidly growing use case,” says Neeraja Rentachintala, Senior Director, Product Management, MapR Technologies. “For the first time, there is a stack that allows BI developers and business analysts to store and query data in native formats without cumbersome ETL or transformation, providing end-to-end flexibility and scale,” she concludes.