Gone are the days of searching through filing cabinets to find relevant documentation for legal discovery. Digitalization has revolutionized enterprise operations, dramatically complicating the discovery process. Consequently, modern organizations are left to figure out how to manage massive data volumes, conduct enterprise-wide searches, and abide by privacy requirements.
Data volumes are growing at a rapid rate
With over 2.5 quintillion bytes of information created each day, the amount of data organizations are responsible for combing through during eDiscovery is rapidly growing. However, around 60% of enterprise data does not need to be retained; ideally, organizations should only keep data that has business value, is subject to a legal requirement, or is part of institutional history. Without governance policies in place to police which documents are stored and which are discarded, organizations end up with massive amounts of unregulated data that hold little to no value. This low-value data is typically stored for one of two reasons: either the organization does not have the ability to filter it, or it abides by the “better safe than sorry” philosophy. Regardless of why, this rapid expansion of stored data means that organizations need to be able to search through their ever-growing databases.
One way to reduce the burden of eDiscovery is to clean up enterprise databases. For example, organizations can use lifecycle policies to delete data as it ages out of regulatory requirements. Another solution is to aim for data singularity by getting rid of redundant, obsolete, and trivial (ROT) information. Not only does this reduce overall data volume, but it also prevents discovery teams from repeatedly reviewing the same documentation. In general, any defensible method of deleting data from company storage will greatly assist the eDiscovery process by reducing the burden of search and review.
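As a rough illustration of the two cleanup ideas above, the sketch below groups documents by a SHA-256 hash of their contents to surface exact duplicates, and checks whether a record has aged past a retention period. The function names, the in-memory dictionary of documents, and the retention length are assumptions made for this example only; a real policy would also need to account for legal holds before anything is deleted.

```python
import hashlib
from datetime import date, timedelta


def content_hash(data: bytes) -> str:
    """Return a SHA-256 fingerprint of a document's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def find_duplicates(documents: dict[str, bytes]) -> dict[str, list[str]]:
    """Group document names by content hash and keep only groups with 2+ names."""
    groups: dict[str, list[str]] = {}
    for name, data in documents.items():
        groups.setdefault(content_hash(data), []).append(name)
    return {h: names for h, names in groups.items() if len(names) > 1}


def past_retention(created: date, retention_days: int, today: date) -> bool:
    """True if the record is older than its (assumed) retention period."""
    return today > created + timedelta(days=retention_days)


# Hypothetical sample data for illustration only.
sample_docs = {
    "contract_v1.txt": b"Terms and conditions",
    "contract_copy.txt": b"Terms and conditions",
    "memo.txt": b"Quarterly update",
}
duplicate_groups = find_duplicates(sample_docs)
```

Exact hashing only catches byte-for-byte copies; near-duplicates, such as the same email with a different footer, would need a different technique. The output here is a list of candidates for review, not an instruction to delete.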
Searching across data formats and repositories can prove challenging
From these massive repositories, organizations have to meticulously search for all information relevant to the case. Not only is the size of the search challenging, but processing a variety of file types across numerous data repositories also proves difficult.
Starting with file types, relevant information may be found in file shares, email, texts, and messages in collaboration apps. These various data formats pose a challenge because the eDiscovery platform has to ingest each one and transform it into a common format for searchability. Consequently, modern eDiscovery platforms need a wide range of APIs to connect to these data sources. Failing to do so could result in potentially important information being excluded from legal discovery, which, if discovered by the opposition, can seriously damage your case.
In terms of searching across multiple repositories, eDiscovery requires that all relevant data be brought forth, regardless of where it lies. To do so, organizations need to either virtually merge data repositories or perform repetitive searches. Virtually merging information is ideal because it is easier on the searcher, reduces the risk of missing information, and is considerably faster than independent searches. Thus, consideration should go into not only what information can be searched but also the methodology used for searching.
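To make the idea of a virtually merged search more concrete, here is a minimal sketch in which each repository supplies its own small adapter that converts items into one common record shape, so a single query runs across all of them. The `Record` type, the adapter functions, and the field names such as `message_id` and `content` are hypothetical and do not reflect any particular product's API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Record:
    """Common shape that every source is normalized into (assumed for this sketch)."""
    source: str
    doc_id: str
    text: str


def normalize_email(msg: dict) -> Record:
    # Hypothetical email fields: message_id, subject, body.
    return Record("email", msg["message_id"], f'{msg["subject"]}\n{msg["body"]}')


def normalize_chat(msg: dict) -> Record:
    # Hypothetical chat fields: id, content.
    return Record("chat", msg["id"], msg["content"])


Repository = tuple[Iterable[dict], Callable[[dict], Record]]


def federated_search(term: str, repositories: list[Repository]) -> list[Record]:
    """Run one case-insensitive substring query across every repository."""
    needle = term.casefold()
    hits = []
    for items, normalize in repositories:
        for item in items:
            record = normalize(item)
            if needle in record.text.casefold():
                hits.append(record)
    return hits


# Hypothetical sample data for illustration only.
emails = [{"message_id": "e1", "subject": "Merger timeline", "body": "Draft attached."}]
chats = [
    {"id": "c1", "content": "Did you see the MERGER draft?"},
    {"id": "c2", "content": "Lunch at noon?"},
]
results = federated_search("merger", [(emails, normalize_email), (chats, normalize_chat)])
```

The searcher issues one query and gets back records tagged with their source, rather than repeating the search in each system. A production platform would rely on indexing rather than a linear scan; the sketch only shows the normalize-then-search structure.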
Once relevant findings are isolated, sensitive and privileged information should be filtered
Further complicating the search process is that not every ‘relevant’ document should be brought forward for presentation, at least not without protective measures. Notably, some documents contain personally identifiable information (PII) or are deemed privileged; presenting or sharing either with opposing counsel could result in penalties.
Privileged and sensitive information can be manually excluded or censored during the review phase, but there are also strategies to isolate it en masse. One such approach to data privacy is to use pattern recognition software; for example, an organization could choose to flag any document in eDiscovery that contains a 3-2-4 (###-##-####) digit combination, as it is likely to be a Social Security number. Additional searches could also be conducted within already identified relevant information to find keywords frequently associated with sensitive or privileged documents. Once these documents are isolated, reviewers can encrypt, redact, or remove them as needed.
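The pattern-matching approach described above can be sketched with a regular expression for the 3-2-4 digit format and a simple keyword check. The keyword list, flag names, and function names below are assumptions chosen for illustration; a match means only that a document deserves a closer look, not that it definitely contains a Social Security number or privileged content.

```python
import re

# Three digits, two digits, four digits, separated by hyphens and not
# embedded in a longer run of digits.
SSN_PATTERN = re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)")

# Hypothetical keywords often associated with privileged material.
PRIVILEGE_KEYWORDS = ("attorney-client", "privileged", "legal advice")


def flag_document(text: str) -> set[str]:
    """Return the set of review flags raised by a document's text."""
    flags = set()
    if SSN_PATTERN.search(text):
        flags.add("possible_ssn")
    lowered = text.casefold()
    if any(keyword in lowered for keyword in PRIVILEGE_KEYWORDS):
        flags.add("possible_privilege")
    return flags


def redact_ssns(text: str) -> str:
    """Replace every 3-2-4 digit match with a fixed mask."""
    return SSN_PATTERN.sub("###-##-####", text)
```

For example, `flag_document("Employee SSN: 123-45-6789")` raises the `possible_ssn` flag, while a ten-digit phone number such as 555-123-4567 does not match the 3-2-4 pattern. Flagged documents would still go to a reviewer, who decides whether to encrypt, redact, or remove them.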
These challenges to eDiscovery are not new, nor will they go away soon: data volumes will continue to grow, searches will get more complex, and new regulations will come into play. However, technology is advancing to meet them. For our last post in this series, we will look at how ZL for eDiscovery addresses these challenges and assists users throughout their eDiscovery process.
Follow the rest of our eDiscovery blog series:
- Modern eDiscovery challenges (this post)