Eighth Release of Spark for Advanced Analytics

By CIOReview | Friday, May 20, 2016

SAN JOSE, CA: MapR Technologies Inc. has announced the availability of Apache Spark 1.6.1 on the MapR converged platform to help enterprises develop applications faster. With performance gains in the core Spark engine, machine learning pipelines, and the Dataset API, Apache Spark delivers in-memory processing for big data and advanced analytics.

Spark 1.6.1 includes a new memory manager that automatically tunes the size of the different memory regions based on workload characteristics.
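To make the idea concrete, here is a minimal toy model in Python of a single memory region shared between execution (shuffles, joins, sorts) and storage (cached data). It is only a sketch, not Spark's implementation: the class name `UnifiedMemoryPool`, its methods, and the sizes are hypothetical and chosen for illustration. It assumes the behavior described in Spark's tuning documentation, where the related settings are `spark.memory.fraction` and `spark.memory.storageFraction`: execution may evict cached data down to a protected threshold, while storage never evicts execution.

```python
class UnifiedMemoryPool:
    """Toy model (hypothetical, not Spark code) of one region shared by
    execution and storage."""

    def __init__(self, total, storage_fraction=0.5):
        self.total = total
        # Cached data up to this amount is protected from eviction.
        self.protected_storage = int(total * storage_fraction)
        self.execution_used = 0
        self.storage_used = 0

    @property
    def free(self):
        return self.total - self.execution_used - self.storage_used

    def acquire_execution(self, amount):
        """Grant up to `amount` to execution, evicting unprotected cache if needed."""
        shortfall = amount - self.free
        if shortfall > 0:
            evictable = max(0, self.storage_used - self.protected_storage)
            self.storage_used -= min(shortfall, evictable)
        granted = min(amount, self.free)
        self.execution_used += granted
        return granted

    def acquire_storage(self, amount):
        """Cache a block only if it fits in free memory; never evicts execution."""
        if amount > self.free:
            return False
        self.storage_used += amount
        return True


# A cache-heavy phase followed by an execution-heavy phase.
pool = UnifiedMemoryPool(total=1000)
cached = pool.acquire_storage(800)     # cache borrows idle memory
granted = pool.acquire_execution(600)  # execution reclaims the unprotected part
print(cached, granted, pool.storage_used, pool.free)
```

In this sketch the boundary between the two regions moves with demand instead of being fixed up front, which is the sense in which the sizes are tuned automatically.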

Because traditional DataFrames lack compile-time type safety, Spark extends the DataFrame API with Datasets, which support static typing and user functions that run directly on existing Scala or Java types. Datasets also manage memory more efficiently by using a more compact in-memory layout, which can speed up real-world big data analyses.

In earlier versions of Spark, applications had to include custom persistence code to store pipelines externally. Spark 1.6.1 addresses this by extending machine learning persistence beyond models to pipelines, eliminating the need for custom code and speeding up application development.

“We have seen a significant customer adoption of Spark for building data pipelines and advanced analytics,” says Anoop Dawar, Vice President of Product Management, Spark and Hadoop, MapR Technologies.