Operationalizing Machine Learning for GDPR

With its General Data Protection Regulation, Europe is calling on United States companies to rethink some of their core business models and data processes. And as organizations contend with life after GDPR, many of them are finding that the only thing they need to change about their data strategy is... everything. 

At the heart of GDPR, there’s a basic disagreement. In the U.S., we generally trust corporations over the government, and we will sacrifice privacy for convenience. Consider the big American idea to productize location data and build Google Maps. Europeans, however, generally trust the government over corporations. That’s why they’re willing to leave their passports at a hotel overnight, so the hotel can report on guest information to the government. It’s a concept that horrifies many Americans.

This cross-Atlantic tension is an unstable system, and an unstable system will eventually tend toward equilibrium. When the pendulum of GDPR tension finally comes to rest between Europe and the United States, the global approach to privacy will be fundamentally changed.

The first recital of GDPR states that "the protection of natural persons in relation to the processing of personal data is a fundamental right." The European regulators asserting this right have broadened its scope to cover people in Europe wherever their data resides. The result is that American companies are subject to sanctions under the EU regime on an ongoing basis.

The broad scope is also built into the language of the regulation. Every possible data action is governed: "basic principles for processing;" "data subjects' rights," set out in 11 separate articles, from the "right to object" to the "right to be forgotten;" and all "transfers of personal data." And the fines are as enormous as the scope: up to 4% of annual global turnover or 20 million euros, whichever is higher.

It is difficult to overstate the impact this regulation will have on American businesses, many of which are built on the premise that privacy can be sacrificed for convenience. Take away the premise, and the business model goes with it.

Enterprise anxiety is high. In earnings calls at companies around the world, "GDPR" went from an average of 7 mentions in Q1 2017 to 177 mentions in Q1 2018, making it the most-mentioned regulation. Publishers are removing ad-related software. 80% of marketers have expressed concern that their tech vendors put them at risk of violating GDPR. Some business intelligence systems, which use personal information as fuel for analytics, are also at risk.

An unspoken agreement sparked two decades of digital growth: as consumers, we gave away our data in exchange for free and hyper-targeted services. As the famous saying goes: "When the product is free, you're the product." Now this digital bargain has broken down, and new privacy regulations are entering the scene, like the California Consumer Privacy Act (CCPA) and the NY SHIELD Act.

From reactive to proactive

There are two basic ways for an enterprise to respond to these new regulations. One is reactive, and the other is proactive.

In the reactive mode, businesses are bringing old data processes to new data challenges, like addressing Data Subject Requests at speed and at scale, and responding to data breaches within the GDPR's "long weekend" reporting window of 72 hours. They are combining manual review teams with decades-old search technology to do something search was never built to do: reach into disparate databases and consolidate and correlate all the personal information (PI) they contain.
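To make the reactive pattern concrete, here is a minimal sketch of what fulfilling a Data Subject Request can look like in code. The data sources, field names, and the collect_personal_information helper are hypothetical illustrations, not any particular vendor's API; a real program would have to reach into far messier, and often unstructured, stores.

```python
# Minimal sketch of fulfilling a Data Subject Request (DSR): find every record
# tied to one person across disparate systems. The sources and field names
# below are hypothetical placeholders, not a real product's API.

CRM_RECORDS = [
    {"email": "jane@example.com", "name": "Jane Doe", "phone": "+31 6 1234 5678"},
    {"email": "sam@example.com", "name": "Sam Lee"},
]
SUPPORT_TICKETS = [
    {"email": "jane@example.com", "subject": "Please delete my account"},
]

def collect_personal_information(subject_email: str) -> dict:
    """Correlate every record belonging to one data subject across sources."""
    sources = {"crm": CRM_RECORDS, "support": SUPPORT_TICKETS}
    found = {}
    for name, records in sources.items():
        matches = [r for r in records if r.get("email") == subject_email]
        if matches:
            found[name] = matches
    return found

if __name__ == "__main__":
    # A DSR response would report (or erase) everything returned here.
    print(collect_personal_information("jane@example.com"))
```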

While they’re building these new reactive processes, organizations are also building proactive processes. It is like rewiring the building while the electricity is still on.

In the proactive mode, organizations are seeking next-generation technology that can add an intelligence layer for understanding all of this human-generated, unstructured data. These tools leverage artificial intelligence to normalize PI into a standardized, interpretable structure, so users can both query and explore their previously dark data.
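As an illustration only, the sketch below shows the underlying idea: scan unstructured text, pull out PI, and conform it to one standardized, queryable structure. The PIRecord fields and the regular expressions are assumptions made for the example; the tools described here rely on machine learning rather than simple pattern matching, and this is not any vendor's method.

```python
# Illustrative sketch: conform PI found in unstructured text into a
# standardized, queryable structure. Real systems use machine learning
# (e.g. named-entity recognition) rather than the regular expressions
# used here; all field names are assumptions.

import re
from dataclasses import dataclass, asdict

@dataclass
class PIRecord:
    pi_type: str   # e.g. "email", "phone"
    value: str     # the raw value found in the text
    source: str    # where it was found (file, message, database row, ...)

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.\w+")
PHONE_RE = re.compile(r"\+?\d[\d\s()-]{7,}\d")

def extract_pi(text: str, source: str) -> list[PIRecord]:
    """Scan unstructured text and emit standardized PI records."""
    records = [PIRecord("email", m, source) for m in EMAIL_RE.findall(text)]
    records += [PIRecord("phone", m, source) for m in PHONE_RE.findall(text)]
    return records

if __name__ == "__main__":
    email_body = "Contact jane@example.com or call +31 6 1234 5678 after 5pm."
    for rec in extract_pi(email_body, source="inbox/message-42.eml"):
        print(asdict(rec))  # structured rows that can now be queried
```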

Rethinking privacy in the age of AI

One of the AI companies working with highly regulated enterprises to rethink privacy is Text IQ, where I serve as an advisor. AI companies like Text IQ are thriving because AI has moved from the lab to the mainstream: the advent of big data, rapidly increasing compute power, new processing techniques, and better algorithms have come together in a perfect storm, and mainstream adoption is the result.

Late last year, McKinsey surveyed 2,000 executives across 10 industries, and found that 47% of companies have embedded AI in their business processes. This represents a rapid increase in adoption: the 2017 study found that just 20% of respondents were using AI in a core part of their business.

As the privacy-convenience tension moves toward equilibrium, organizations are sorting through a lot of generalized excitement around AI and seeking to "bring it in." With the right capabilities, these companies can embrace GDPR as an opportunity rather than a threat, and build a new data strategy from the ground up.