How To Measure AI: Recall vs. Precision

Artificial intelligence (AI) is transforming nearly every industry. As AI becomes more widespread, you may be wondering how to gauge its effectiveness. AI is often used for pattern recognition, information retrieval, and classification. Two metrics are key to evaluating AI performance in these tasks: precision and recall.

Recall refers to the fraction of relevant items that an AI search returns out of the total number of relevant items in the original population. If there are 18 relevant documents in the whole population and the search returns 9 relevant items, recall is 50%. Recall tells you how well a search finds relevant items.

Precision refers to the fraction of the items a search returns that are actually relevant. If a search returns 12 items from the total population, 9 of the items are relevant, and 3 are irrelevant, the precision is 9 out of 12, or 75%. Precision tells you how well a search avoids false positives. Both precision and recall are important to the success of a search.
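The two definitions can be checked with a few lines of Python. This is a minimal sketch, not a production evaluation tool: the function name precision_recall and the document IDs are made up for illustration, and the counts are the ones used in the examples above (18 relevant documents in the population; 12 returned, of which 9 are relevant).

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a search result.

    returned: IDs of the items the search returned
    relevant: IDs of all truly relevant items in the population
    """
    returned = set(returned)
    relevant = set(relevant)
    # Items that were both returned and relevant (true positives).
    true_positives = len(returned & relevant)
    # Precision: share of returned items that are relevant.
    precision = true_positives / len(returned) if returned else 0.0
    # Recall: share of relevant items that were returned.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs: 0..17 are the 18 relevant documents.
relevant_docs = set(range(18))
# The search returns 9 relevant documents (0..8) plus 3 irrelevant ones.
returned_docs = set(range(9)) | {100, 101, 102}

precision, recall = precision_recall(returned_docs, relevant_docs)
print(f"precision = {precision:.0%}, recall = {recall:.0%}")
# prints "precision = 75%, recall = 50%"
```

Note that the same 9 relevant hits feed both numbers; only the denominator changes (items returned for precision, relevant items in the population for recall).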

What are good precision and recall results?
The threshold for acceptable precision and recall will vary by context. One common use of AI in the legal field is the classification of responsive documents. A good AI search in this context is inclusive enough to catch 80-100% of all relevant documents (recall) but exclusive enough to avoid returning a high number of false positives (precision). Without high recall, a second review of the total population is necessary to catch enough relevant documents. Without high precision, the documents that the search returned have to be manually reviewed to remove false positives. Either case involves extra review hours and extra cost. While 80% recall is acceptable for relevance review, a search would need nearly perfect recall (99-100%) in situations involving sensitive information, such as privilege review, identification of personally identifiable information and personal health information, or data breach response.
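As a rough illustration of how such thresholds might be applied, the sketch below compares a measured recall against a minimum target per review type. The names RECALL_TARGETS and meets_recall_target are hypothetical, and the cutoffs (80% for relevance review, 99% for sensitive information) are simply the lower ends of the ranges mentioned above, not an industry standard.

```python
# Illustrative minimum recall per review type, taken from the ranges in the text.
RECALL_TARGETS = {
    "relevance": 0.80,  # relevance (responsiveness) review
    "sensitive": 0.99,  # privilege, personal information, data breach
}


def meets_recall_target(relevant_found, total_relevant, review_type):
    """Return True if the measured recall reaches the target for this review type."""
    if total_relevant <= 0:
        raise ValueError("total_relevant must be positive")
    recall = relevant_found / total_relevant
    return recall >= RECALL_TARGETS[review_type]


# 9 of 18 relevant documents found: 50% recall, too low for relevance review.
print(meets_recall_target(9, 18, "relevance"))    # prints False
# 85 of 100 found: acceptable for relevance review, not for sensitive data.
print(meets_recall_target(85, 100, "relevance"))  # prints True
print(meets_recall_target(85, 100, "sensitive"))  # prints False
```

In practice the total number of relevant documents is not known in advance and is usually estimated from a reviewed sample, so a check like this applies to an estimate of recall rather than an exact value.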

With a firm grasp of these metrics, you should be well equipped to interpret the output of an AI product.

To learn more about AI in the legal field and how Text IQ achieves 100% recall on sensitive information, visit www.textiq.com.