Submitted by: Maria Sherlyn
Not enough regulation exists for personal data protection. Even the largest companies have proven themselves incapable of self-discipline in protecting their consumers’ data, both in the face of enormous data breaches and when sharing data with others. The regulatory response has been inadequate: only a small minority of companies are required to protect personal data through de-identification and anonymization. While no safeguard is perfect, these data obfuscation techniques make it much more challenging for an attacker to extract useful information about individuals.
Companies have not learned their lessons in protecting their consumers’ data. In the past six years, 11 separate data breaches have each involved over 100 million consumer records, averaging half a billion records apiece, and nearly all of the targeted companies are household names. Yahoo suffered the largest breach, involving 3 billion customers, and then another large breach the following year. These attacks demonstrated not only inadequate cybersecurity but also the absence of specific measures to protect personal data, such as encrypting data at rest and de-identifying it.
The companies waited weeks or even years before disclosing the attacks to consumers, so the perpetrators had plenty of time to do harm before consumers could protect themselves with countermeasures such as changing passwords and monitoring their credit reports. The companies faced enormous loss of reputation, huge fines, lawsuits and regulatory sanctions, and even criminal charges against their officers.
There was great concern over the potential malicious misuse of this stolen data, and every “victim” company had its reputation stained. Equifax and Capital One tried to diminish the problem by reporting that most of the stolen accounts did not include particularly sensitive information, such as account numbers and Social Security numbers. Still, even without this information, the culprits are able to commit identity theft and other forms of financial fraud.
Even more companies share sensitive production data with “trusted” third parties, such as outsourced software developers, data processing services and academic researchers; in these scenarios, quantifying unauthorized data usage is virtually impossible.
Of course, the worst cases of data sharing occur when social media and other internet companies sell personal data with identities unmasked, allowing individual users to be bombarded with junk surface mail, junk email and pinpointed web advertising. The worst of the worst may be a handful of large credit card processors, such as Alliance Data, which amass not only their own data about customers but also public records and retail point-of-sale data to create comprehensive consumer profiles, selling this information to enable very focused target marketing.
Despite all this, the industry has not responded by providing enhanced personal data protection.
The impact of such breaches and unauthorized usage would have been considerably less had these companies used data de-identification and anonymization techniques.
The considerable risk of re-identifying persons from only the attributes of birth date, sex and ZIP code was published almost 20 years ago by Dr. Latanya Sweeney. The National Institute of Standards and Technology (NIST) published an extensive paper, NISTIR 8053, “De-identification of Personal Information,” in October 2015, before seven of the 11 breaches cited above occurred.
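Dr. Sweeney’s finding can be illustrated with a small sketch (this is not her code, and the toy records are invented): count how many rows in a dataset are uniquely identified by the quasi-identifier combination (birth date, sex, ZIP code), before and after simple generalization.

```python
# Toy demonstration: how many records are unique on the quasi-identifiers
# (birth date, sex, ZIP code), before and after generalization.
from collections import Counter

records = [
    {"birth": "1975-03-14", "sex": "F", "zip": "16801"},
    {"birth": "1975-03-14", "sex": "F", "zip": "16802"},
    {"birth": "1982-11-02", "sex": "M", "zip": "16801"},
    {"birth": "1982-07-09", "sex": "M", "zip": "16801"},
]

def unique_count(rows, key):
    """Number of rows whose key value occurs exactly once."""
    counts = Counter(key(r) for r in rows)
    return sum(1 for r in rows if counts[key(r)] == 1)

# Full precision: every record is unique on (birth, sex, zip).
exact = unique_count(records, lambda r: (r["birth"], r["sex"], r["zip"]))

# Generalized: birth date reduced to year, ZIP truncated to 3 digits.
coarse = unique_count(records, lambda r: (r["birth"][:4], r["sex"], r["zip"][:3]))

print(exact, coarse)  # 4 0
```

With exact values, all four toy records are re-identifiable; after generalization, none stands alone, which is the intuition behind k-anonymity-style protections.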
The best-known regulations requiring data de-identification are the EU General Data Protection Regulation (GDPR, 2016) and the Health Insurance Portability and Accountability Act (HIPAA, 1996). For instance, HIPAA requires entities that handle protected health information (PHI) to anonymize 18 identifiers, or attribute types, in that data. Other lesser-known US federal regulations apply to educational records, foodborne illness, drug safety data and aviation safety reports; California recently introduced a law to protect consumers that involves these techniques.
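A hedged sketch of what such a transformation looks like for a few of HIPAA’s 18 identifier types (names removed, dates reduced to year, ZIP codes truncated, ages over 89 aggregated). The field names here are illustrative, not taken from the regulation, and a real Safe Harbor implementation covers many more identifier types:

```python
# Illustrative Safe Harbor-style de-identification of one record.
# Field names ("name", "birth_date", "zip", "age") are assumptions.
def deidentify(record: dict) -> dict:
    out = dict(record)
    out.pop("name", None)                         # direct identifiers removed
    out["birth_date"] = record["birth_date"][:4]  # dates reduced to year only
    out["zip"] = record["zip"][:3] + "00"         # ZIP truncated to 3 digits
    out["age"] = "90+" if record["age"] >= 90 else record["age"]  # ages over 89 aggregated
    return out

patient = {"name": "Jane Doe", "birth_date": "1930-05-21", "zip": "16801", "age": 91}
print(deidentify(patient))
# {'birth_date': '1930', 'zip': '16800', 'age': '90+'}
```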
Data de-identification and anonymization, like all security and control tools, are not foolproof, and hackers can use many techniques to re-identify individuals in stolen databases. However, according to NISTIR 8053, depending on the nature of the original data set and the effectiveness of the de-identification technique chosen, de-identification at least raises the bar: the attacker needs greater technical skill, more resources, and additional data sources that can be linked with the de-identified data to re-identify it.
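The linkage attack described here can be sketched in a few lines (all data below is invented): a “de-identified” table that still retains quasi-identifiers is joined against a public auxiliary dataset, such as a voter roll, to recover identities.

```python
# Toy linkage attack: join a de-identified table with a public dataset
# on shared quasi-identifiers to re-identify individuals.
deidentified = [
    {"zip": "16801", "birth": "1975-03-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "16801", "birth": "1982-11-02", "sex": "M", "diagnosis": "diabetes"},
]
voter_roll = [
    {"name": "Alice Smith", "zip": "16801", "birth": "1975-03-14", "sex": "F"},
    {"name": "Bob Jones",   "zip": "16823", "birth": "1982-11-02", "sex": "M"},
]

quasi = ("zip", "birth", "sex")
reidentified = [
    (v["name"], d["diagnosis"])
    for d in deidentified
    for v in voter_roll
    if all(d[k] == v[k] for k in quasi)
]
print(reidentified)  # [('Alice Smith', 'asthma')]
```

Only the record whose quasi-identifiers match the auxiliary data is exposed, which is why generalizing or suppressing those attributes forces the attacker to find richer auxiliary sources.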
Bio: Maria Sherlyn is a top student and a junior Security and Risk Analysis (Homeland Cybersecurity) major at Penn State World Campus, and an idea generator with a penchant for analyzing network vulnerabilities.
She has also been a high-level student leader: event chair and charter officer in the Technology Club; three years on the Student Advisory Board, chairing several committees and contributing to the leadership development training program; and one of the few student representatives on the One Penn State 2025 Taskforce, providing student insights while working directly with the University’s faculty and staff. Additionally, she has been a vocal advocate for combatting human trafficking, having spoken at an international research conference on “Traffickers’ Use of the Dark Web.” Finally, she has honed her penetration testing skills as a passionate competitor in and judge of Capture the Flag events.
Maria Sherlyn (Sept 10, 2021), De-identification: The Risk of Un-regulation