Hortonworks aims to advance the market adoption of Apache Hadoop and to give enterprises a new data management solution that lets them harness big data to transform their businesses through more effective and efficient management of their data assets. Its cost benefits, combined with the framework's ability to pool commodity servers with local storage and process very large multi-source data sets, make Apache Hadoop a compelling choice for enterprises grappling with huge volumes of structured and unstructured data. Today the Santa Clara, CA-based Hortonworks aspires to expand the reach of the open-source Hadoop platform by making it accessible to a much wider audience.
The company has evolved into a global provider of enterprise-ready modern data architecture that supports operations management, data security, and governance. Alongside its talented engineering team, the firm benefits from the entrepreneurial experience of Rob Bearden, who previously served as COO of two highly successful open-source companies, SpringSource and JBoss. Bearden foresaw the company going public within a short span and, as CEO, steered the firm to that milestone in 2014.
Envisioning Open and Connected Data Platforms
Guided by a credo centered on collaborative open-source development under the Apache Software Foundation governance model, the team at Hortonworks works to enhance the Hadoop core and write new code that improves the open-source product. By actively collaborating with customers and ecosystem partners, Hortonworks aims to deliver its next-generation big data capabilities through Open and Connected Data Platforms.
The firm provides actionable intelligence that helps leaders across academia, government, and the corporate sector manage data effectively. Hortonworks offers enterprises an out-of-the-box modern data architecture to help them manage the full lifecycle of data-in-motion and data-at-rest in any environment. For enterprises looking to stream operational data in real time from their transactional databases into Hadoop big data platforms, Hortonworks provides a dedicated open platform, Hortonworks DataFlow (HDF). This integrated, data-source-agnostic collection platform enables fast streaming analytics by accelerating data collection, curation, analysis, and delivery in real time. Meanwhile, Hortonworks Data Platform (HDP), powered by Apache YARN and the Hadoop Distributed File System (HDFS), manages data-at-rest for all data types while delivering enterprise-grade governance, security, and operations. While YARN provides HDP users with a centralized architecture for processing multiple workloads simultaneously, HDFS offers scalable, fault-tolerant storage for their big data lake. HDP also features a broad range of processing engines that let customers build applications that interact effectively with data.
Hortonworks’ 100 percent open-source approach is designed to help data-driven enterprises benefit from a global open-source developer community. The approach also ensures zero vendor lock-in and delivers best-of-breed interoperability with their current IT investments and infrastructure.
Additionally, Hortonworks has the expertise to integrate Open and Connected Data Platforms while maximizing the value of data-in-motion generated through the Internet of Anything (IoAT). “Hortonworks has always placed a tremendous emphasis on delivering our customers, world class technology innovation and proactive support,” says Bearden.
Taking Open Data Governance a Notch Higher
As a leading contributor to the open-source Hadoop project, Hortonworks has created the Data Governance Initiative (DGI) to address clients’ need for an open-source governance solution. DGI helps organizations with data lifecycle management, as well as data classification, data lineage, and security. This joint engineering initiative, which involves DGI founding partners Aetna, Merck, and Target along with Hortonworks’ technology partner SAS, extends the firm’s open-source development model. The firm caters to the metadata and data governance needs of customers keen on moving their Hadoop-based data architecture into corporate data processing environments. The contribution of SAS has strengthened the initiative by enabling the integration of SAS data management, analytics, and visualization into the HDP environment, which in turn benefits the Apache Hadoop project.
For organizations across diverse lines of business, HDP provides a centralized architecture for running batch, interactive, and real-time applications simultaneously across a shared dataset. As a dedicated Hadoop platform, HDP helps organizations eliminate administrative complexity, boost developer productivity, strengthen security and data governance, and monitor clusters proactively. For instance, ZirMed, a leading provider of healthcare information management solutions, used HDP for Windows 2.0 to build its production Hadoop cluster. The client began its transition by deploying a small proof-of-concept Hadoop cluster running HDP for Windows 1.3 on eight reasonably low-end machines and two master nodes. Within two weeks of deploying the cluster, ZirMed’s data team had finished loading nine years’ worth of the company’s electronic remittance data into Hadoop.
After the successful proof-of-concept pilot, the team at ZirMed built a production Hadoop cluster, a 29-node cluster with 1.2 petabytes of raw storage, running HDP for Windows 2.0. Queries that took 25 minutes on the proof-of-concept cluster now take five minutes in the production environment. Following the success of the project, code-named ‘Analytics 3.0’, Kentucky-based ZirMed gained the analytics capabilities to envision the next generation of health management.
Reaching Out to the Apache Community
Driven by a future-centric ideology, “open source spurs innovation,” Hortonworks continues to contribute through community-based open innovation. With a dedicated engineering team rich in open-source talent, Hortonworks boasts the largest concentration of Apache Hadoop committers. Its Hadoop engineering team has contributed notable amounts of source code to core Apache Hadoop projects, including HDFS. Additionally, the firm runs a global partner support program, Partnerworks, that assists partners with sales, implementation, and innovation and allows them to deliver a holistic enterprise data architecture backed by best-of-breed technologies. Last year, the firm launched its turnkey, Hadoop-powered Business Intelligence (BI) solution for enterprise data warehouse (EDW) optimization. The EDW optimization solution helps customers overcome the risks and costs of introducing new solutions into a legacy infrastructure while extending the value of their existing EDW investments. According to Scott Gnau, CTO of Hortonworks, “I predict we will quickly move from post-event and even real-time to pre-emptive analytics that can drive transactions instead of just modifying or optimizing them.” He adds, “This will have a transformative impact on the ability of a data-centric business to identify new revenue streams, save costs, and improve their customer intimacy.”
SmartSense for Proactive Support
Given the critical nature of data applications and the pace at which big data is generated, Hortonworks enables organizations to proactively address and resolve their cluster issues. With a support model built on proactive monitoring and actionable recommendations, the firm ensures that customers are served before they actually experience an issue. Hortonworks SmartSense is a collection of tools and services that helps enterprise operations teams capture diagnostic information and resolve issues before they escalate. By offering actionable recommendations customized for individual clusters and workloads, SmartSense improves cluster performance, security, and operations for each cluster node.
Hortonworks’ drive to establish Hadoop as the foundational technology of the modern enterprise data architecture is evident in its commitment to keep all core code in the Hadoop framework open source. The firm’s development focus is directed toward addressing the big data use cases of data-driven organizations. The leadership at Hortonworks foresees a future where secure, unfettered data access and seamless movement among technologies, applications, and platforms will help enterprises outperform their competitors. In a forward-looking statement, Bearden emphasizes, “The combination of enterprise customers’ accelerating business transformations fueled by data and our continued financial discipline positions Hortonworks extremely well for growth and success in 2017.”