Some of us more seasoned attorneys remember spending long days in conference rooms sifting through boxes of musty paper documents. Because of the explosion of e-mail and other electronically stored information (“ESI”), practicing attorneys now understand “discovery” to include e-discovery, and “document” to include information stored and exchanged in electronic format on servers and personal computers. The sheer volume of ESI in today’s practice calls into question the cost, accuracy and even feasibility of in-person document review. For many years now, attorneys have had to consider how they can review all of the available documents to comply with discovery requests without sacrificing the accuracy and consistency of the process.
Some years ago, attorneys began applying Boolean “keyword” searches as a way to prioritize the pertinent ESI for review; however, these searches still produced voluminous results. In response, a number of companies have developed advanced software programs, including predictive coding tools, to electronically identify and cull potentially useful ESI. Going beyond keyword searches, these programs apply algorithms to help identify and categorize the electronic documents that need to be reviewed based on relevance, privilege or issue.
What is Predictive Coding?
As Wallis Hampton of Skadden Arps explains in Predictive Coding: It’s Here to Stay, predictive coding refers to using a software program to identify documents that are relevant to a particular case or issue. Much like an email spam filter, it relies on a machine learning process and a combination of algorithmic tools. To identify documents, predictive coding uses techniques such as concept, contextual, and metadata searches; probability theory; relevance ranking; clustering; and sorting and filtering by issue.
Predictive coding involves training the software program to identify a set of relevant documents from a broader set of potentially relevant documents. Scott Kane, in Predictive Coding Technology Continues to Gain Traction in Legal Document Review, provides an overview of the training and coding process. To train the software, a knowledgeable attorney who is familiar with the case reviews documents from the broad set and codes them for relevance, privilege or a specific issue in the case at hand. The software program then analyzes the coded documents and applies algorithms to learn how to identify relevant documents.
After this initial coding, the software generates a subsequent set of documents for review. At this point, the attorney can accept or reject the software’s classifications, allowing it to incorporate the feedback into future coding decisions. Although this training can take several cycles, the goal is to have the program agree with the attorney’s coding for a predetermined percentage of the documents. Once the training process is complete, the program applies its coding across the entire set of documents and ranks the results based on perceived relevance to the discovery requests.
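For readers curious about what this train, review and rank cycle looks like in practice, the following is a minimal Python sketch. It is an illustration only: the simple word-count model stands in for the proprietary algorithms that commercial products use, and every name in it (RelevanceModel, train, attorney_codes, target_agreement) is hypothetical rather than taken from any actual product.

```python
import math
from collections import Counter


class RelevanceModel:
    """Toy word-count classifier (an assumption, not any vendor's method)."""

    def __init__(self):
        self.counts = {True: Counter(), False: Counter()}
        self.totals = {True: 0, False: 0}

    def learn(self, text, relevant):
        # Record the attorney's coding decision for each word in the document.
        for word in text.lower().split():
            self.counts[relevant][word] += 1
            self.totals[relevant] += 1

    def score(self, text):
        # Sum of smoothed log-odds; a positive score leans "relevant".
        vocab = len(set(self.counts[True]) | set(self.counts[False])) or 1
        total = 0.0
        for word in text.lower().split():
            p_rel = (self.counts[True][word] + 1) / (self.totals[True] + vocab)
            p_not = (self.counts[False][word] + 1) / (self.totals[False] + vocab)
            total += math.log(p_rel / p_not)
        return total

    def predict(self, text):
        return self.score(text) > 0


def train(model, documents, attorney_codes, batch_size=2,
          target_agreement=0.9, max_rounds=10):
    """Run review cycles until the model agrees with the attorney often enough.

    attorney_codes is a callable standing in for the reviewing attorney.
    Returns the agreement rate of the last batch reviewed (None if no batch).
    """
    queue = list(documents)
    # Initial coding: the attorney codes a first batch with no model input.
    seed, queue = queue[:batch_size], queue[batch_size:]
    for doc in seed:
        model.learn(doc, attorney_codes(doc))

    agreement = None
    for _ in range(max_rounds):
        if not queue:
            break
        batch, queue = queue[:batch_size], queue[batch_size:]
        agreed = 0
        for doc in batch:
            predicted = model.predict(doc)   # the software's classification
            actual = attorney_codes(doc)     # attorney accepts or rejects it
            agreed += predicted == actual
            model.learn(doc, actual)         # feedback informs later rounds
        agreement = agreed / len(batch)
        if agreement >= target_agreement:
            break
    return agreement


def rank(model, documents):
    # Apply the trained model to the whole set, most relevant first.
    return sorted(documents, key=model.score, reverse=True)
```

In this sketch, a call such as train(model, documents, attorney_codes) would be followed by rank(model, documents) to produce the prioritized review list. The 0.9 agreement threshold is an arbitrary placeholder for the predetermined percentage the parties or the reviewing attorney would set.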
Predictive Coding is not Just for Big Law
Although the impetus to develop technology-based e-discovery software may have originated in big law, it also affects solo and small firm practitioners. Within the last two years, federal and state courts around the country have issued decisions endorsing the use of predictive coding and accepting it as the norm. Small firm practitioners or organizations may object to discovery requests based on undue burden. Increasingly, however, courts are overruling such objections. In addition, the time and cost savings that predictive coding software can provide to attorneys and clients are invaluable.
Courts Rule Discovery Requests are not Overly Burdensome
Courts are pointing to the availability of predictive coding in overruling undue burden objections. Adam Losey of Foley & Lardner in Lawyers: Objecting to Predictive Coding Is Futile highlights many court decisions around the country that have endorsed the use of predictive coding. He concludes that, “The majority of our judiciary has proved aware, at least conceptually, of predictive coding and potential application in litigation.”
For example, two recent decisions issued in the Southern and Northern Districts of New York rejected undue burden objections to subpoenas issued to third parties. In their decisions, both Judge Lewis Kaplan and Magistrate Judge Randolph Treece recognized that the availability of predictive coding could reduce the burden and effort required to comply.
In his opinion, Magistrate Judge Treece specifically noted that, “With the advent of software, predictive coding, spreadsheets, and similar advances, the time and cost to produce large reams of documents can be dramatically reduced . . . the Court is more convinced than ever that [the subpoena] is not . . . an overwhelming and incomprehensible burden.”
Time and Cost Savings
Predictive coding can also save attorneys’ time and clients’ money. With predictive coding, documents can be coded, screened and prioritized faster than in-person review, which can sometimes take months to complete. The availability of predictive coding also reduces the number of attorneys needed to conduct document review. In turn, this technology may level the playing field for solo or small firm practitioners who historically have not been able to hire legions of lawyers to conduct document review. Additionally, since the predictive coding program relies on the judgment of the senior attorney who trains it, the outcomes are likely to be more consistent and accurate than if an attorney hires multiple individuals who are less familiar with the facts to review the documents.
Predictive coding also provides cost savings to clients. A timely and accurate discovery screening process allows clients’ funds to be spent on attorneys’ time reviewing the documents that are actually relevant and useful to the litigation. Judge Kaplan noted these time and cost savings: “Predictive coding is an automated method that credible sources say has been demonstrated to result in more accurate searches at a fraction of the cost of human reviewers.”
Beyond Predictive Coding: Artificial Intelligence
Predictive coding may just be the beginning of the possible uses of technology software in the legal industry. Writing in Law Technology News, Tam Harbert reports that IBM has started developing and commercializing Watson, its “cognitive platform” for the legal industry, specifically targeting large law firms, legal service providers, and corporate law departments.
According to Rich Holada, vice president of transformations for the legal industry in the IBM Watson Group, “Watson [would] be ‘trained in the legal corpus,’ essentially by reading an unstructured document set of legal knowledge.” Three possible uses for Watson’s technology within the legal industry are: 1) serving as an automated associate; 2) legal research; and 3) work focusing on the process of law, using quantitative predictive and descriptive analytics.
Although there is no word on when (or if) Watson will make its legal debut, the growing use of technology in the legal field seems inevitable. With the emergence, and apparent judicial acceptance, of predictive coding and the certainty of greater technological advancements on the horizon, attorneys will be well served to understand and embrace these changes as they come. Their ability to effectively serve clients depends on it.
The Commission’s intern Lauren McGee of Chicago Kent Law School contributed to this post.