Working in the information governance space, it’s important to keep up with both the trends and the disagreements among professionals in the field. It’s been refreshing to see that the discourse has become more multifaceted. One conversation that caught my attention recently was Ralph Losey’s e-Discovery Team® blog post “Information Governance v Search: The Battle Lines Are Redrawn.” It’s an intriguing read that thoroughly describes how technological advancements are upending the traditional information governance and records management paradigms. At the heart of Losey’s post is the idea that, for large corporations, the falling cost of storage has made the strategy of “store and search” an effective method for managing information. However, he goes a step further, arguing that the increasing volume and complexity of data within the enterprise make governing this data impossible. This is where I begin to diverge, respectfully, in opinion.
Losey makes a valid point that “old-school IG” personnel need to evolve and stop attempting to treat electronic data as they would paper records, but I disagree with the claim that information governance is impossible, or a losing battle, in today’s data landscape. While many traditional governance solutions fail to keep up with today’s massive data volumes, this is largely an architectural problem, not an intrinsic fault of information governance itself. These solutions were not built with modern information governance in mind; instead, they were built for specific data management functions such as eDiscovery, compliance, or records management. Although these functions can often be “integrated” into one system and packaged as a full-service IG solution, the patchwork that holds the individual parts together cannot handle the massive and highly complex volumes of data in the modern enterprise. However, this does not mean that information governance is impossible. A system designed from day one with information governance as the end goal, and built from the ground up to handle these massive loads of data in a unified way, can in fact give companies complete control over their enterprise information through effective information governance.
As the post suggests, an increasing number of companies believe that the “data lake” is the unified solution they have been looking for, using this pool of data as the source for a “save and search” strategy, and claiming that, “Unlike real lakes, they cannot flood.” We’d like to take it a step further: although the concept of the data lake is a powerful one, there are additional issues that have not been adequately addressed. One thing that is routinely omitted from the conversation about data lakes is the absence of a critical management layer. Let’s draw out the analogy a bit more for the sake of explanation. Just as real manmade lakes and reservoirs require management (levees and water processing keep them from flooding or becoming chemically unbalanced), so does the enterprise data lake. The data lake is a manmade construct, and like any other manmade construct, it requires sustained maintenance. Information governance and management can help provide these protections, ensuring that data is clean when it enters the system and accounted for throughout its lifecycle, from birth to burial. Clean, managed data leads to purer insights via analytics. We’re just now entering an era of analytics for unstructured data, and the lack of management will soon become glaringly evident as businesses gradually try to move away from the narrow-scope, stand-alone analytics platforms that currently dominate the market.
Lastly, and of particular relevance, Losey states that, “Governing information is hostile to individual privacy rights and liberties.” This is a valid argument, but it isn’t the entire picture: there is another side to the coin. Information governance, by nature, increases security and gives businesses the ability to offer more data privacy rather than less; the problems of privacy rights and liberty are policy issues, not inherent issues of governance architecture. A thorough and flexible information governance strategy provides the structure needed to assure employees that their information is as secure as possible. In today’s swirling data landscape, where the NSA, hacks, and data leaks dominate the headlines, giving employees and shareholders that assurance is vastly important. In fact, poor or ineffective information governance efforts by a corporation are irresponsible, if not arguably negligent.
While we could go off on a multi-volume philosophical tangent on the ethics and meaning of privacy for data that employees generate while at work, it’s not necessary in this case. Collection and analysis of work product is not the biggest boogeyman yet, because there are already more pressing concerns. The enterprise, by nature, will ALREADY handle data that is objectively considered sensitive and in need of strong privacy controls. Social Security numbers, home addresses, tax forms, bank account info, requests for medical leave, insurance information, and other such data are all handled by the business simply as a byproduct of employing and paying workers. These items aren’t really work product, and they say nothing about an employee’s value or productivity to the company. But whether because of existing legal protections or plain common sense, they absolutely must be protected.
Traditionally, the approach to securing these sensitive items has been small, high-security “silos” with stricter access rights and less connectivity. But with the fluidity of information in the modern business, it’s too easy for something to slip out of the closed system: too easy to attach a document with sensitive content to an email by mistake, or to forward the addresses and personal contact info found in a job candidate’s application. A strong information governance strategy takes these issues into account and minimizes the “weak points” that are endemic to highly interconnected systems. In essence, to protect information within the enterprise, you must control it. Unmanaged content is content that potentially violates the most basic privacy needs and expectations of workers.
In the end, increasing volumes and complexity of data should not give us an excuse to throw our hands up in defeat, accepting that modern volumes of data are too massive to be managed and that save and search is the only feasible way to leverage enterprise data. Instead, companies need to realize that IG can provide the management capabilities they have been searching for, but only through a unified system that won’t come apart at the seams as data volumes continue to rise. In addition, IG must continue to evolve with the data it is meant to manage, becoming more thorough and integrating new technologies to improve efficiency and create a digitally integrated IG 2.0.
The conversation around IG has become more nuanced, and that’s great; it’s a sign that the business world is beginning to look more critically at the potential value of unstructured data rather than just the immediate costs associated with its management. But I think we’ve hardly scratched the surface here. I look forward to seeing the discussion expand further as the volume of data itself continues to grow.