3 Critical Functions AI Performs That Predictive Coding Can’t

“Machines should be able to sift through the noise, help us overcome our biases…not [provide] shallow predictions,” says David Ferrucci (Machine Learning vs Machine Understanding, 2020). Ferrucci, the IBM Watson team founder made famous for besting the Jeopardy! champions in 2011, goes on to point out that another serious limitation of traditional machine learning (ML) approaches is that they deliver their results, right or wrong, without explanation.

This award-winning artificial intelligence (AI) researcher may have never heard of Technology-Assisted Review (TAR), TAR 2.0, privilege review, or privilege review logs. But he could easily be describing why TAR (or predictive coding) is ill-suited to assisting attorneys with the most laborious and costly step in the document review process: review for privilege and other sensitive information and the creation of privilege logs.

This is not to disparage the efficacy of predictive coding techniques as a tool for sorting and prioritizing review of relevant (R) and not relevant (NR) documents. “But when it comes to privilege, it’s a different story,” notes Reed Smith LLP Partner and Records & E-Discovery Group Chair, David Cohen. “Privilege is much more nuanced. It has a lot more to do with roles and relationships…And I think that virtually all of the folks that make the predictive coding software will tell you that it doesn’t work very well for privilege” (A Better Way for Privilege Review: Harnessing AI for Accuracy & Productivity, 2021).

Ultimately, for AI to be considered for privilege review, it must meet three significant requirements: 1) near-perfect recall rates, 2) human relationship-based analytics, and 3) explicability.

Using AI to Solve the 3 Core Challenges of Privilege Review 

Predictive coding techniques like TAR rely on mathematical models of text occurrences in documents and on the structured data provided by sample coding to get at the “right answer.” As a result, they cannot overcome these privilege review hurdles.

The solution requires something deeper. Advanced AI employing deep learning (DL) that leverages multiple layers of neural networks (patterned on the human brain) has proven capable of meeting this challenge. Importantly, “Deep learning models are capable of unsupervised learning. They can detect previously undetected features or patterns in data that aren't labeled…” (IBM, 2020, emphasis added). TAR, by contrast, requires the “labeling” provided through supervised learning regimens (i.e., training).

Because this kind of AI does not require supervised training, it removes the bias introduced through trainer interactions, and continuous learning is built into the AI brain. That said, “Deep machine learning can leverage labeled datasets to inform its algorithm,” but it doesn’t require them and can “train itself” (Kavlakoglu, 2020).

Challenge 1: Near-perfect recall rates

While we are past arguing the “myth of perfection” regarding eyes-on-paper document review versus predictive coding, privilege review is another matter entirely. As Munger Tolles & Olson LLP eDiscovery Counsel Bobby Malhotra puts it, “In a responsiveness review 80% recall rate may be acceptable, reasonable, and defensible…When it comes to privilege the margin of error is razor-thin” (Trust, But Verify, 2021). As such, privilege review will always require both strong quality control regimens and review by expert counsel.
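To make the recall figures concrete, here is a minimal Python sketch. Recall is the share of truly privileged documents that a review catches. The function names and the 1,000-document population are assumptions chosen for illustration, not figures from any real matter.

```python
def recall(caught: int, missed: int) -> float:
    """Share of truly privileged documents that the review caught."""
    total = caught + missed
    if total == 0:
        raise ValueError("the set contains no privileged documents")
    return caught / total


def documents_missed(total_privileged: int, recall_rate: float) -> int:
    """Privileged documents that slip through at a given recall rate."""
    return round(total_privileged * (1 - recall_rate))


# Hypothetical population of 1,000 privileged documents (illustration only).
missed_at_80_percent = documents_missed(1000, 0.80)
missed_at_99_9_percent = documents_missed(1000, 0.999)
```

Under these assumed numbers, an 80% recall rate leaves 200 privileged documents unflagged, while 99.9% leaves one. That gap is why a rate that is defensible for responsiveness is not defensible for privilege.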

The stakes are far higher with privilege than with relevance reviews and include not only the disclosure of privileged attorney communications, advice, and work product, but possibly the exposure of corporate IP, as well as PII, and “sensitive personal data” (as recently defined by California’s CPRA).

Given their reliance on text-based analysis, it is unlikely that predictive coding techniques would even approach 80% accuracy in a review for potentially privileged documents. Privilege is far too nuanced. And much of that nuance is a function not of text but of roles and relationships.

Challenge 2: Analysis based on roles & relationships, not text

As predictive coding algorithms are designed to analyze text (what is being said), they are a poor fit for privilege, which is very much a function of who is saying it. A model that grasps how attorneys speak and comprehends roles and relationships addresses this challenge.

Consider an exchange between Bill and John (neither is an attorney) that reads, ‘Mary spoke with Dave and we are good to go.’ Is this potentially privileged? It may be so if Dave or Mary is counsel. Not so, if Mary and Bill work with Dave and they’re talking about going to lunch. Social context analysis is capable of comprehending individual roles, their relationships, and how people typically interact, and thus recognizes the potential privilege in this exchange. Text-based analytics, including traditional privilege screens using keywords, will not.
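The Bill and John exchange can be sketched in a few lines of Python. This is a toy illustration only, not how Text IQ or any commercial product works: the keyword list, the Message class, and both screening functions are hypothetical, and a real social context model would infer roles rather than be handed a list of attorneys.

```python
from dataclasses import dataclass
from typing import Set, Tuple

# Hypothetical keyword list; a real privilege screen would be much longer.
PRIVILEGE_KEYWORDS = ("attorney", "counsel", "privileged", "legal advice")


@dataclass(frozen=True)
class Message:
    sender: str
    recipients: Tuple[str, ...]
    body: str


def keyword_screen(msg: Message) -> bool:
    """Flag a message only if its text contains a privilege keyword."""
    text = msg.body.lower()
    return any(keyword in text for keyword in PRIVILEGE_KEYWORDS)


def role_aware_screen(msg: Message, attorneys: Set[str]) -> bool:
    """Flag a message if a known attorney sent it, received it, or is named in it."""
    known = {name.lower() for name in attorneys}
    participants = {msg.sender.lower(), *(r.lower() for r in msg.recipients)}
    mentioned = {word.strip(".,;:!?'\"").lower() for word in msg.body.split()}
    return bool(known & (participants | mentioned))


msg = Message("Bill", ("John",), "Mary spoke with Dave and we are good to go.")

keyword_hit = keyword_screen(msg)  # the text contains no privilege keyword
flag_if_dave_is_counsel = role_aware_screen(msg, {"Dave"})
flag_if_no_counsel = role_aware_screen(msg, set())
```

The keyword screen never flags this message, because nothing in the text looks legal. The role-aware check flags it only when Dave is known to be counsel, which is the distinction the example turns on.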

Safeguarding PII and sensitive personal information, too, is a human-centric issue and not a text-based challenge. Regular expression search (RegEx) using pattern matching can fail to identify PII or to properly associate the information with the correct individual.
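A short Python sketch shows both failure modes. The pattern below is an assumed, simplified expression for U.S. Social Security numbers written as NNN-NN-NNNN, and the sample sentences are invented for illustration.

```python
import re
from typing import List

# Illustrative pattern for SSNs written as NNN-NN-NNNN (an assumption, not a
# production-grade PII detector).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def find_ssn_like(text: str) -> List[str]:
    """Return every substring that matches the SSN-style pattern."""
    return SSN_PATTERN.findall(text)


# Matches, but the result says nothing about whether the number is Jane's or Tom's.
standard = find_ssn_like("Jane's SSN is 123-45-6789, per Tom.")

# Same information with spaces instead of hyphens: the pattern misses it.
spaced = find_ssn_like("Jane's social is 123 45 6789.")

# A part number in the same shape is flagged even though it is not PII.
part_number = find_ssn_like("Ship part no. 123-45-6789 today.")
```

The pattern sees only the shape of the digits. It cannot tell whose number it found, it misses the same number in a different format, and it flags look-alike strings that are not personal data at all.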

Challenge 3: Explicability

TAR has come a long way since Judge John M. Facciola opined that search methodology in producing e-discovery must be scrutinized under Rule 702 and, as such, ordered Daubert hearings (Deutchman, 2008). So, while all e-discovery procedures must still be properly documented, defending the validity of using algorithms, or explaining why the chosen solution or technique produces the results it does, is no longer at issue.

With privilege, however, results must be defended via their reason coding in the privilege log. Counsel cannot point to an algorithmic black box and say, ‘these documents are privileged because it told me so.’ Explicability is a core requirement of privilege review. 

While predictive coding methods cannot “explain” their results in comprehensible terms (apart from an expert detailing the statistical methodology), advanced AI can, including scoring its results, when purpose-built to do so. This also helps to streamline the otherwise arduous and time-consuming creation of the privilege log.
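One way to picture an explainable result is as a record that carries its own reasons alongside a score. The Python sketch below is purely hypothetical: the PrivilegeCall class, its fields, the document ID, the score, and the log format are all assumptions for illustration and do not describe any vendor's output.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PrivilegeCall:
    doc_id: str
    score: float  # hypothetical confidence between 0.0 and 1.0
    reasons: List[str] = field(default_factory=list)


def draft_log_entry(call: PrivilegeCall) -> str:
    """Build a draft privilege log line from a call and its stated reasons."""
    if not call.reasons:
        # A call with no stated reasons is the "black box" counsel cannot defend.
        raise ValueError(f"{call.doc_id}: no reasons given for the privilege call")
    return f"{call.doc_id} | score {call.score:.2f} | " + "; ".join(call.reasons)


call = PrivilegeCall(
    doc_id="DOC-001",
    score=0.93,
    reasons=["communication with in-house counsel", "requests legal advice"],
)
entry = draft_log_entry(call)
```

Here the draft entry reads "DOC-001 | score 0.93 | communication with in-house counsel; requests legal advice", which gives the reviewing attorney something to verify and edit rather than a bare yes or no.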

Proof

Reed Smith’s David Cohen piloted this AI approach, developed by Text IQ and integrated with Relativity®, testing it against a traditional potentially-privileged review. He discussed the results during the March 16, 2021 EDRM webinar, A Better Way for Privilege Review: Harnessing AI for Accuracy & Productivity. The results are significant.

“If we only relied on a privileged screen, we would have inadvertently produced to the other side 50 responsive, privileged documents,” says David. He notes: 

    1. “We found that more than 99.9% of the privileged documents were actually identified by Text IQ. There were two true misses in terms of documents that got through Text IQ. After that, they adjusted the platform so that it would have caught in those documents as well.”

    2. “On the magnitude of 25 times more privileged documents would have been produced using the traditional privileged screen.”

    3. “Not only could it identify the privileged documents, but it also prepared draft privileged descriptions for each of the documents withheld as privileged [and] those descriptions were actually much better than our boilerplate in terms of describing the privileged documents. So, it reduces editing time and cost.”

Of course, given the nature of privilege, senior attorneys still (and likely always will) review the results. But as David notes, cost savings were achieved through fewer documents to review and less editing of the privilege log. Risk was also reduced, because the AI identified privileged documents that the typical privilege screen would miss.

Ultimately, true AI purpose-built to analyze social context (not mathematically represent text occurrence in documents) can move beyond the four corners of the document and accomplish what supervised machine learning (i.e., predictive coding) cannot: performing a first-pass review for potentially privileged documents with near-perfect recall rates, results scoring, and even explicability.

You can learn more about how AI transforms privilege review in our guide, How Enterprises Can Leverage AI to Achieve 75% Time Savings in Privilege Review