Imagine a typical business success story: a mid-sized company built from the ground up starts by making an impressive mark on the industry. Over the years, the business improves operations while profits steadily increase. With only a few relatively minor setbacks, the company flourishes, and business continues to grow. This growth brings some important related changes. Specifically, in order to account for an expanding roster of clients, more and more database storage is required. And, in an effort to stay current technologically, new enterprise applications are brought on board while others are retired. Finally, discussions of compliance and compatibility with these new information technology components become an important part of infrastructure and development.

Everyone celebrates growth. Businesses need it. And, overall, commerce relies upon it. After all, growth is perhaps the primary goal of any business. What organization wouldn’t welcome expanded operations, broader market influence, and substantial growth in its customer base? With such growth, though, businesses also see a corresponding amount of change. Not only new clients and new geographic markets but also shifts in technology platforms and IT infrastructure are expected parts of every business success story.

Most firms are constantly acquiring new clients and business partners. Along with this steady accumulation of data, a question arises: where should it go for long-term storage? What happens to important archival documents when a company needs to significantly upgrade its IT infrastructure? One cannot simply discard such data, even data from clients the company no longer works with. What happens, for instance, when such information is needed for litigation or an audit?

This is where structured archiving comes in. Structured archiving refers to the ability to store and catalog application data in secondary databases or standalone files for long-term retention, often by using less costly storage [1]. Such enterprise-level archival processes have a number of specific applications and related benefits that this blog post will explore in depth. Taken together, these applications help explain the growing importance of structured archiving for businesses across the board: it supports the management of data growth, application decommissioning, and compliance with both enterprise IT infrastructure and legal requirements.

Data Growth Management

Increases in data are a common part of the development of every business. Sales data, customer records, employee and HR information, client contracts, patents, project data, data from Enterprise Resource Planning (ERP) systems, and analytics are just a few of the many points of information that are crucial to businesses today. As the sheer volume of data increases, some have suggested an updated framework for conceiving of the element that many consider to be the bedrock of modern commerce.

This framework is known, of course, as big data. Big data refers to extremely large sets of data that can be used to provide predictive behavioral analysis and other kinds of analytics and metrics. The regular use of such large volumes of data has become increasingly important to businesses. Big data can improve internal communications, customer relations, and customer experiences while providing better overall market intelligence[2]. And the reliance upon big data will only increase. As Forbes recently noted, “It doesn’t matter what field you operate in or the size of your business; as data collection, analysis, and interpretation become more readily accessible, they will have an impact on every business in several important ways”[3].

How, then, does one approach archiving such extensive sets of data as a business matures? What happens, for instance, when a business needs to store different kinds of data, such as structured data alongside document-centric records? Thankfully, there are different archival approaches suited to different purposes. The first, and perhaps most relevant, archiving method is known as full schema archiving. Full schema archiving takes the complex relational database format of structured data and transforms it into a structure that fits the archival system to be used[4]. This means that the data remains “searchable” using existing business queries, which may deliver either structured or unstructured records. An important part of this archival method is that it archives the meaning of the data (that is, how the data is structured), not just the data itself.[5]
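
To make the idea concrete, here is a minimal sketch of full schema archiving in Python, using the standard-library sqlite3 module. It is an illustration under stated assumptions, not a description of any particular archiving product: the function name archive_full_schema and the customers and orders tables are hypothetical, and because the archive here is simply another SQLite database, the schema is copied as-is rather than transformed into a different archival format.

```python
import sqlite3


def archive_full_schema(source: sqlite3.Connection,
                        archive: sqlite3.Connection) -> list[str]:
    """Copy every user table's definition and rows from source into archive."""
    tables = source.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for name, ddl in tables:
        # Recreate the table from its original definition so that column
        # names, types, and key relationships travel with the rows.
        archive.execute(ddl)
        rows = source.execute(f'SELECT * FROM "{name}"').fetchall()
        if rows:
            placeholders = ", ".join("?" for _ in rows[0])
            archive.executemany(
                f'INSERT INTO "{name}" VALUES ({placeholders})', rows
            )
    archive.commit()
    return [name for name, _ in tables]


# Hypothetical live database with two related tables.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Acme Corp');
    INSERT INTO orders VALUES (10, 1, 250.0);
""")

archive = sqlite3.connect(":memory:")
archived_tables = archive_full_schema(source, archive)

# An existing business query runs unchanged against the archive.
query = ("SELECT c.name, o.total FROM orders o "
         "JOIN customers c ON c.id = o.customer_id")
result = archive.execute(query).fetchall()
```

Because the table definitions are archived along with the rows, the join between orders and customers still works in the archive, which is the sense in which the data stays searchable.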

With table or partial schema archiving, only a specific table or part of the full database is archived. Partial schema archiving can be a result of partitioning, which limits the size of what would otherwise be a database inflated with unnecessary or redundant entries. Paragon gives this example: “In supply chain manufacturing you might have an instrument calibration and maintenance system—Maximo, ProCal and other type systems. Perhaps the record is defined in terms of plant floor equipment and work orders that can be extracted from the whole and defined in a partial schema or cut of the database whole.”[6] As long as the records and the data models are well-defined, the partial schema method will suffice.
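
A partial schema archive can be sketched along the same lines as the work-order example. The snippet below is a hedged illustration only: archive_partial, the work_orders table, and its columns are hypothetical names, and the choice to delete rows from the live database once they are safely archived is an assumption of this sketch rather than a requirement of the method.

```python
import sqlite3


def archive_partial(source, archive, table, where, params=()):
    """Move rows of one table that match `where` into the archive database.

    `table` and `where` are interpolated into SQL, so they must come from
    trusted configuration, never from user input.
    """
    ddl = source.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()[0]
    exists = archive.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (table,),
    ).fetchone()
    if not exists:
        # Only this one table definition is carried over, not the full schema.
        archive.execute(ddl)
    rows = source.execute(
        f'SELECT * FROM "{table}" WHERE {where}', params
    ).fetchall()
    if rows:
        placeholders = ", ".join("?" for _ in rows[0])
        archive.executemany(
            f'INSERT INTO "{table}" VALUES ({placeholders})', rows
        )
    archive.commit()
    # Assumption of this sketch: archived rows leave the live database,
    # and only after the archive copy has been committed.
    source.execute(f'DELETE FROM "{table}" WHERE {where}', params)
    source.commit()
    return len(rows)


# Hypothetical maintenance system with a single work-order table.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE work_orders (
        id INTEGER PRIMARY KEY,
        equipment_id TEXT NOT NULL,
        status TEXT NOT NULL,
        closed_on TEXT
    );
    INSERT INTO work_orders VALUES (1, 'PUMP-7', 'closed', '2018-03-01');
    INSERT INTO work_orders VALUES (2, 'PUMP-7', 'closed', '2019-11-15');
    INSERT INTO work_orders VALUES (3, 'VALVE-2', 'open', NULL);
""")
archive = sqlite3.connect(":memory:")

moved = archive_partial(
    source, archive, "work_orders",
    "status = 'closed' AND closed_on < ?", ("2020-01-01",),
)
```

Here the two closed work orders move to the archive while the open one stays in the live system, so the working database shrinks without losing the older records.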

The last archiving method we’ll look at here is print streaming, or report-based archiving. This technique works by archiving only the kinds of reports that a business runs most often. These can be common reports run by managers or technical queries for audit or litigation purposes.[7] A benefit of this approach is a potentially drastic reduction in storage size. On the one hand, reducing a large database to its essential or most commonly used elements can dramatically shrink an otherwise unwieldy dataset. On the other hand, this technique can leave a substantial amount of data out of the archive, since critical data often consists of entries that were rarely, if ever, queried in the past. As always, such techniques must be weighed in a cost-benefit analysis to determine which is right for a specific business and a given application.
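
Report-based archiving can be sketched as running a fixed set of report queries and keeping only their output. The example below is an assumption-laden illustration: the REPORTS dictionary, the archive_reports function, the invoices table, and the choice of CSV as the output format are all hypothetical.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical catalog of the reports a business runs most often.
REPORTS = {
    "revenue_by_customer": (
        "SELECT customer, SUM(total) AS revenue "
        "FROM invoices GROUP BY customer ORDER BY customer"
    ),
}


def archive_reports(conn, reports, out_dir):
    """Run each named report query and store its result as a CSV file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, query in reports.items():
        cursor = conn.execute(query)
        path = out_dir / f"{name}.csv"
        with path.open("w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            # Header row comes from the column names of the query result.
            writer.writerow([column[0] for column in cursor.description])
            writer.writerows(cursor)
        written.append(path)
    return written


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer TEXT, total REAL);
    INSERT INTO invoices VALUES (1, 'Acme Corp', 100.0);
    INSERT INTO invoices VALUES (2, 'Acme Corp', 200.0);
    INSERT INTO invoices VALUES (3, 'Globex', 75.5);
""")

out_dir = Path(tempfile.mkdtemp())
report_files = archive_reports(conn, REPORTS, out_dir)
report_lines = report_files[0].read_text(encoding="utf-8").splitlines()
```

The archived file holds one summary row per customer, which is far smaller than the source table. It also shows the trade-off described above: the individual invoices cannot be recovered from the report.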

Application Decommissioning

During the lifespan of nearly every business, the need will arise to retire certain applications as programs are superseded by comprehensive upgrades and other important changes to the IT environment. This process, known as application decommissioning, is a notable part of the structured archiving process. An example would be when a company adopts a new Enterprise Resource Planning (ERP) solution. Typically, with such an upgrade the older system would be completely replaced. However, there remains a storehouse of important data accumulated through the legacy ERP system.[8] Application decommissioning deals with how this data is preserved. Because the process has important consequences for the life and integrity of business data, it is crucial to the overall goals of a business.

There is an important set of questions that any business should ask itself when considering the process of decommissioning one or more legacy applications. The first, of course, is: precisely which applications should be retired and which should remain in operation?[9] The answer to this question involves a number of variables and can often become a point of controversy for companies that have invested years of time and energy learning an IT platform. At the end of the day, though, a cost-benefit analysis can resolve most reservations about retiring a specific piece of legacy software. Such an analysis will reveal how much will be gained by implementing the new system versus what will be lost by retiring the current application or environment.

Another important consideration regarding application decommissioning is preservation of the actual data accumulated through the usage of the application in question. For many platforms, not all of the data created will need to be stored. For instance, many applications, in addition to actual document data and database entries, will create preference documents along with various technical log files and operating system records. These data, for the most part, do not need to be stored. Most businesses will want to separate out the intentionally created documents, especially those deemed important data sources, from the more trivial data during the process of archiving. This way, once the legacy system is replaced by a newer version, the business will still have access to the legacy system’s most pertinent data.
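
One simple way to picture this separation is a triage pass over the files a legacy application leaves behind. The sketch below is illustrative only: the suffix lists, the triage function, and the example paths are hypothetical, and a real decommissioning project would likely classify data by business rules rather than by file extension alone.

```python
from pathlib import PurePath

# Hypothetical rules: which file types count as business records
# and which are technical by-products of the application.
ARCHIVE_SUFFIXES = {".pdf", ".docx", ".csv", ".xml"}
SKIP_SUFFIXES = {".log", ".tmp", ".ini", ".plist"}


def triage(paths):
    """Sort legacy file paths into archive, skip, and manual-review lists."""
    plan = {"archive": [], "skip": [], "review": []}
    for path in paths:
        suffix = PurePath(path).suffix.lower()
        if suffix in ARCHIVE_SUFFIXES:
            plan["archive"].append(path)
        elif suffix in SKIP_SUFFIXES:
            plan["skip"].append(path)
        else:
            # Anything unrecognized is flagged for a person to decide.
            plan["review"].append(path)
    return plan


legacy_files = [
    "exports/contracts/acme_2014.PDF",
    "app/debug.log",
    "config/preferences.ini",
    "exports/notes",
]
plan = triage(legacy_files)
```

Routing unrecognized files to a review list, rather than silently skipping them, is a deliberately cautious choice, since discarded data may later be needed for an audit or litigation.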

Structured archiving can offer a measured solution to many of the problems that arise during application decommissioning. Of course, some of these problems may require improvised responses if a full plan of action has not been drafted in advance. For instance, permissions for the archived data will need to be sorted out, along with the duration of the data storage. Additionally, a business may need to consider how specific queries will look when accessing data from legacy applications. Thankfully, the majority of these issues can be decided before the application decommissioning process begins. If these and other issues are carefully considered ahead of time, the structured archiving process will be a smooth and productive one with lasting impact.
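
Decisions such as access permissions and retention periods can be written down as explicit policy before decommissioning starts. The following is a minimal sketch under stated assumptions: the ArchivePolicy class, the role names, the dataset name, and the seven-year retention period are all hypothetical and are not drawn from any specific regulation or product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ArchivePolicy:
    """Hypothetical policy record agreed on before decommissioning."""
    dataset: str
    retention_years: int
    reader_roles: frozenset

    def retain_until(self, archived_on: date) -> date:
        target_year = archived_on.year + self.retention_years
        try:
            return archived_on.replace(year=target_year)
        except ValueError:
            # Archived on Feb 29 and the target year is not a leap year.
            return archived_on.replace(year=target_year, day=28)

    def can_read(self, role: str) -> bool:
        return role in self.reader_roles

    def is_expired(self, archived_on: date, today: date) -> bool:
        return today > self.retain_until(archived_on)


policy = ArchivePolicy(
    dataset="legacy_erp_invoices",
    retention_years=7,
    reader_roles=frozenset({"auditor", "legal"}),
)

archived_on = date(2016, 2, 29)
deadline = policy.retain_until(archived_on)
auditor_allowed = policy.can_read("auditor")
intern_allowed = policy.can_read("intern")
expired = policy.is_expired(archived_on, today=date(2023, 3, 1))
```

Keeping the policy in one immutable record means the questions of who may read the archive and how long it must be kept are answered once, in advance, rather than case by case.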

Compliance with Enterprise IT Infrastructure and Legal

This last section covers some of the issues that arise around compatibility between structured archiving and IT infrastructure, along with compliance with emerging legal requirements. There are several reasons to be concerned with compliance and compatibility in this context. In fact, IT infrastructure compatibility is both a reason to retire legacy applications and an important consequence once such changes are put into place. Beyond software compatibility, similar problems arise from changes in legislation that affects the IT industry. In any structured archival process, therefore, these issues will inevitably play a key role in an effective solution.

Take a common example of IT infrastructure compatibility: a new Customer Relationship Management (CRM) system is implemented. As a result of this change, many existing CRM documents can no longer be read by the new system. Therefore, a comprehensive structured archival process must be initiated in order to account for this data. This is, indeed, another way of thinking about decommissioning legacy applications. However, in this instance, we’re considering structured archiving as a response to compatibility problems caused by an upgrade, rather than as part of a broader plan to phase out legacy applications. Either way one views it, an archival approach that offers the best value for the data in question will need to be chosen.

Regulatory compliance is an area that relates closely to the problem of infrastructure compatibility. Indeed, concerns about compliance will only increase as more applications are added to an IT infrastructure, and they also reflect shifts in legal structures and cultural norms. As Jeroen van Rotterdam writes, “Regulatory compliance requirements are increasing and undergoing rapid change with new legislation,” going on to cite legislation such as the Markets in Financial Instruments Directive (MiFID), the “right to be forgotten,” and new legislation around privacy issues.[10] Taken together, such changes can result in the same kinds of compatibility issues that arise from software and version problems.

We’ve now seen three different but related contexts in which structured archiving plays a crucial role for businesses today. Whether a company is managing significant data growth, decommissioning legacy applications, or working to maintain compliance with IT infrastructure or legal mandates, understanding effective approaches to structured archiving is invaluable.