The Role Big Data Plays in Changing Data Archiving Strategies

By CIOReview | Monday, July 25, 2016

Data archiving is the process of moving datasets that are no longer actively used to a separate storage device, both for future reference and to support audits. Archiving helps organizations retain, manage, and leverage their information assets effectively. Data warehouses, which are regarded as the pillars of business intelligence and analytics systems, often integrate data from multiple sources in an organization to provide historical, current, or even predictive analysis of the business. This cumulative data, and the analytics systems that leverage it, provide the technology and methodology that help organizations discover and develop meaningful insights.
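To make the idea of moving inactive data to separate storage concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table names (orders, orders_archive), the last_used column, and the cutoff date are all hypothetical and chosen only for illustration; a real archive would typically live on a separate storage tier rather than in the same database.

```python
import sqlite3


def archive_inactive_rows(conn, cutoff_date):
    """Move rows last used before cutoff_date from orders to orders_archive.

    Table and column names are hypothetical. Dates are ISO strings
    (YYYY-MM-DD), so plain string comparison orders them correctly.
    Returns the number of rows moved.
    """
    with conn:  # one transaction: the copy and the delete succeed or fail together
        conn.execute(
            "INSERT INTO orders_archive (id, customer, last_used) "
            "SELECT id, customer, last_used FROM orders WHERE last_used < ?",
            (cutoff_date,),
        )
        cur = conn.execute(
            "DELETE FROM orders WHERE last_used < ?", (cutoff_date,)
        )
        return cur.rowcount


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, last_used TEXT)"
    )
    conn.execute(
        "CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, customer TEXT, last_used TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [
            (1, "acme", "2012-03-01"),
            (2, "globex", "2015-11-20"),
            (3, "initech", "2016-06-30"),
        ],
    )
    conn.commit()
    return conn
```

Calling archive_inactive_rows(build_demo_db(), "2014-01-01") moves the single row last used in 2012 into the archive table and leaves the other two in the active table. Wrapping both statements in one transaction is a design choice that keeps a row from being deleted without having been copied first.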

Key Drivers for Archiving

The importance of archiving differs from one role to another. For IT managers, archiving often refers to the placement of Electronically Stored Information (ESI) throughout the information’s lifecycle. Compliance officers, on the other hand, use archives to hold ESI that can be stored, indexed, and controlled authentically while retention policies are applied. Attorneys use archives to quickly search for information that supports their legal challenge. These three perspectives are considered the main drivers for archiving.

Benefits of Data Archiving

By delaying hardware upgrades in production and disaster recovery environments, archiving helps companies make the most effective use of their existing infrastructure while keeping data growth under control. This reduces the company’s total cost of ownership.

Archiving also improves performance by reducing the amount of data, the number of indexes, and the number of table scans that have to be processed. It makes periodic maintenance easier and faster, and it streamlines restoration from backups in the event of a failure, which improves system uptime and user productivity.

Data archiving also helps organizations comply with data retention and purge policies while providing archives for audit or electronic discovery requests involving historical data.
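A retention and purge policy can be thought of as a rule that decides, for each archived record, whether it must still be kept. The sketch below is one possible way to express such a rule in Python. The record categories, the retention periods, and the legal_hold flag are assumptions made for illustration and are not taken from any particular regulation or product.

```python
from datetime import date

# Hypothetical retention periods in years, keyed by record category.
RETENTION_YEARS = {"invoice": 7, "email": 3, "log": 1}


def retention_action(category, created, today, legal_hold=False):
    """Return 'retain' or 'purge' for a single archived record.

    A record under legal hold (an assumed flag, e.g. for a pending
    discovery request) is always retained, whatever its age.
    """
    if legal_hold:
        return "retain"
    years = RETENTION_YEARS[category]
    try:
        expires = created.replace(year=created.year + years)
    except ValueError:
        # Created on Feb 29 and the expiry year is not a leap year.
        expires = created.replace(year=created.year + years, day=28)
    return "purge" if today >= expires else "retain"
```

For example, with these assumed periods an email created on July 25, 2012 would be eligible for purging by July 25, 2016, while an invoice created on the same day would still be retained.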

How Big Data is Changing Data Archiving Strategies

Archiving strategies mostly address the past and what we already know about our data. They have been far less successful at anticipating the events, trends, and changes that will shape future archive requirements, both in terms of the data that needs to be preserved and the tools required to make it accessible to future users and systems. This has hampered many strategies in helping organizations contain costs, reduce risk, and improve productivity. Most archive projects have also been treated as ‘standalone’ efforts, each with its own staff, processors, network, and storage infrastructure.

The emergence of big data analytics, together with server-side and software-defined storage infrastructures, has put the very model of the standalone archive on the brink of extinction. Big data analytics, which applies a set of technologies to examine ongoing trends across multiple, unrelated datasets, does not treat any data as archival. It views all data as active, adding value to day-to-day business decision making. The traditional notion of archiving is being challenged by a move away from a centralized repository towards a series of discrete, server-side, direct-attached storage configurations connected to individual server nodes in a cluster. This has forced organizations to move away from the standalone archive practice and adopt the new infrastructure.


With information becoming harder to manage, an effective archiving strategy can help organizations use their information assets effectively. It also protects the organization from falling victim to rising costs, compliance challenges, and legal risks.