Revisiting Automatic Redaction: HIPAA Redactions

Text IQ Series | Revisiting Automatic Redaction: HIPAA Redactions Can Be a Tough Pill to Swallow

Life Sciences companies - and their legal representation - feel the burden of redacting sensitive information such as PII and PHI to comply with HIPAA regulations during clinical trials, adverse report filings, litigation, and more. We spoke with Leeanne Mancari, Litigator and Co-Chair of eDiscovery and Information Management Platform at DLA Piper, on the pain points she faces while representing clients in the Life Sciences sector and potential solutions to streamline the discovery process. 

A Quick Intro to HIPAA

Since 1996, the Health Insurance Portability and Accountability Act (HIPAA) has defined patient privacy and security rights, along with the rules that life sciences organizations and healthcare professionals must follow when handling personally identifiable information (PII) and protected health information (PHI). As an individual, you’ve likely signed a HIPAA consent document when visiting a doctor’s office. For organizations, it details what information is protected, how it must be handled, and how long it must be kept on file. This adds up to a lot of time, money, and effort for even a small doctor’s office, let alone a hospital system or a research facility conducting clinical trials with hundreds of thousands of patients.

So what constitutes PII and PHI according to HIPAA? The HIPAA Privacy Rule lists 18 identifiers, ranging from the obvious, such as names, Social Security numbers, and account numbers, to the very nuanced. For example, every element of a birth date except the year must be removed, and for individuals over 89 the year must be removed as well. Additional details on the 18 HIPAA identifiers can be found here. The extent and specificity of this list can make protecting and securing PII and PHI extremely challenging.
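The birth date rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the placeholder text, and the choice to measure age against a caller-supplied reference date are all hypothetical, not part of any HIPAA tooling or any vendor's product.

```python
from datetime import date


def age_on(birth: date, as_of: date) -> int:
    """Return the age in whole years on the reference date."""
    years = as_of.year - birth.year
    # Subtract one year if the birthday has not yet occurred in the reference year.
    if (as_of.month, as_of.day) < (birth.month, birth.day):
        years -= 1
    return years


def redact_birth_date(birth: date, as_of: date) -> str:
    """Keep only the year, unless the individual is over 89 (hypothetical helper)."""
    if age_on(birth, as_of) > 89:
        # Over 89: the year itself could identify the person, so drop it too.
        return "[REDACTED: age 90 or older]"
    return str(birth.year)
```

For a patient born on May 17, 1980, the sketch keeps "1980" and drops the month and day. For a patient born in 1925 and evaluated in 2020, it drops the year as well.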

In addition to these challenges, life sciences organizations also face steep fines for any HIPAA violations. Violations, usually discovered through complaints or HIPAA audits, are enforced by the US Department of Health and Human Services. Civil penalties range from $100 to $50,000 per violation, and criminal penalties range from $50,000 to $250,000 in fines and 1 to 10 years of imprisonment. A HIPAA violation can often affect several thousand patients and continue for several years before it is discovered. In one of the largest cases, Memorial Healthcare System paid $5.5 million in 2017 over a violation that affected 115,143 patients.

The Pain Points

Preferring to keep their reputations and bank accounts intact, organizations aim to comply with HIPAA guidelines, and, as you can imagine, many legal issues can arise from the use of HIPAA data. Beyond instituting security, privacy, and training measures, companies in the life sciences industry must also follow HIPAA rules during clinical trials, adverse reaction filings with the Food & Drug Administration (FDA), litigation, and other legal and regulatory proceedings. Often, this compliance takes the form of redaction, or “blacking out” PII and PHI in documents.

A Very Manual Process

From a legal perspective, an enormous part of the discovery process for life sciences organizations is spent redacting PII and PHI. It may seem like a simple step, but it is often at the center of health care litigation and HIPAA violations, making accuracy crucial.

In her role representing many corporations in the life sciences sector at DLA Piper, Leeanne deals with these HIPAA regulations regularly and knows the process intimately. She explained that working through sensitive documents, which often run to thousands of pages, is a very manual process. In some cases, she is able to file for a protective order, or HIPAA safe harbor, which would allow companies to minimize or skip the redaction process. However, skipping HIPAA redactions is very rare given the sensitivity of the data. While technology has stepped in to improve the process by automating portions of discovery, it is nowhere near 100% automatic, as current software has many limitations. It is common practice to use offshore teams to go through documents and redact sensitive information by hand.

Time is of the Essence

Leeanne generally employs a mix of software and manual labor during the discovery process. She recounted a recent discovery experience, exclaiming, “The amount of time that goes into redactions is a huge pain point!” This particular discovery job was so large that even with a team of over 100 working offshore for months, meeting deadlines was extremely difficult.

Accuracy is Key

Part of the reason the redaction process takes so long is that accuracy is crucial. Because HIPAA data is so specific and nuanced, reviewing every document to ensure that the correct information, and only that information, is redacted is slow and tedious. This need for exact accuracy is also why the process has historically been very difficult to automate completely. Software can take a first pass at the documents, but human review is then needed to complete the process. And even after software and human review, rework is often still necessary to get it right.
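To make the "software first pass, then human review" workflow concrete, here is a minimal Python sketch. It assumes a simple pattern-matching approach with two illustrative patterns (US Social Security numbers and email addresses). The pattern set, the labels, and the `first_pass` function are hypothetical and do not represent how any particular redaction product works.

```python
import re

# Illustrative patterns only; a real system would need to cover all 18 identifiers.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def first_pass(text):
    """Redact pattern matches and return (redacted_text, hits) for human review."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), label))
    hits.sort()

    pieces = []
    pos = 0
    for start, end, label in hits:
        if start < pos:
            # Skip a match that overlaps one already redacted.
            continue
        pieces.append(text[pos:start])
        pieces.append(f"[{label}]")
        pos = end
    pieces.append(text[pos:])
    return "".join(pieces), hits
```

Given "Patient SSN 123-45-6789, contact jane.doe@example.com.", this returns "Patient SSN [SSN], contact [EMAIL]." along with the list of hits a reviewer can check. Note that a patient name in the same sentence would pass through untouched, which is exactly the kind of gap that keeps human review in the loop.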


With reputations built on accuracy, safety, and trust, most life sciences organizations take pride in complying with HIPAA regulations, both to protect the privacy of patients and clinical trial candidates and to secure their patient data and proprietary research. If this sensitive information is not protected, or if documents are not properly redacted, these organizations run the risk of data breaches, violations, and reputation-damaging leaks. Each risk comes with its own set of unfavorable outcomes. This is why time and care are taken to prevent damaging breaches through proper compliance and accurate redaction.


The frequency of breaches, violations, and leaks reinforces the importance of handling these documents correctly. Data breaches are no longer a once-in-a-blue-moon event: in 2019 they rose 17%, with healthcare being the most targeted sector. PII was targeted in 98% of these breaches. HIPAA violations were also on the rise in 2019. These stats make the need for redaction indisputable, and stories like Leeanne’s make the need for automated redaction evident.

Enter artificial intelligence. 

HIPAA Redaction Solution 

Given the time and effort involved, and the accuracy needed to keep documents containing HIPAA data compliant and secure, the current process is ripe for improvement. “Anywhere we can reduce that burdensome process would be extremely helpful,” stated Leeanne of readying documents for legal proceedings. “The more we can accurately automate, the better.”

Advances in artificial intelligence (AI) now give us the ability to almost fully automate this process. The result is a more efficient, accurate, and secure process than ever before. Software that can accurately auto-redact all 18 HIPAA identifiers, regardless of complexity, and nothing more, in a fraction of the time is now available from Text IQ with its new Auto-Redact product. “Hearing things like auto-redactions gets me very excited. Even if it's just partial auto-redaction, even that's a huge help,” stated Leeanne.

For more information on automatic redaction with AI, and how it can help your organization, please reach out.