A new study completed at UC Berkeley has revealed that the current US legal framework is ill-prepared to legislate on how Artificial Intelligence impacts people’s digital privacy.
The study focuses on how AI can use vast repositories of data to identify individuals and link them to their personal health data. According to lead researcher Anil Aswani, AI can take step data collected from activity trackers, smartphones, and smartwatches and cross-reference it against demographic data in order to identify individuals.
During the study, Berkeley researchers used data mined from 15,000 US individuals to successfully demonstrate that current laws and regulations do not adequately protect people’s private health data. The research, which was published on 21 December last year in JAMA Network Open, reveals that the privacy standards set out in the current Health Insurance Portability and Accountability Act (HIPAA) are in urgent need of reassessment if people’s health data is to be properly protected.
Reattribution of personal data
A major finding from the study involves the reattribution of anonymized or pseudonymized data. According to Aswani, stripping all the identifying information from health-related datasets does not properly protect individuals. This is because it is possible for firms to use AI to reattribute data that was previously anonymized. Aswani explains:
"HIPAA regulations make your health care private, but they don't cover as much as you think. Many groups, like tech companies, are not covered by HIPAA, and only very specific pieces of information are not allowed to be shared by current HIPAA rules. There are companies buying health data. It's supposed to be anonymous data, but their whole business model is to find a way to attach names to this data and sell it."
Aswani has described how firms like Facebook make it their business to put sensitive data back together. Unfortunately, current US laws do not stop firms from reattributing previously scrubbed data, which is putting people’s private health data at risk:
"In principle, you could imagine Facebook gathering step data from the app on your smartphone, then buying health care data from another company and matching the two. Now they would have health care data that's matched to names, and they could either start selling advertising based on that or they could sell the data to others."
The implications are obvious: for people struggling with potential health problems, this data can lead to discrimination. Any health data that can be successfully attributed to an individual can be used by health insurers, for example, in their decision-making. In the case of step data, a more sedentary lifestyle (something health insurers shouldn’t automatically know about) could lead to higher premiums.
Easy access
The UC Berkeley study demonstrates that a rise in the efficacy of AI will greatly increase the private sector’s ability to gather health-related data about individuals. The researchers believe this will inevitably create the temptation for firms to use the data in unethical or clandestine ways.
As AI improves, individuals could find their health data being turned against them by employers, mortgage lenders, credit card companies, and insurance companies. Aswani's team is troubled because this could lead to firms discriminating on the basis of factors such as pregnancy or disability.
Common problem
This is not the first time that anonymized or pseudonymized data has been successfully attributed to individuals. Research performed by MIT in 2015 revealed that previously scrubbed credit card data could be successfully reattributed.
MIT used de-identified data from 10,000 shops containing the details of 1.1 million credit card customers. According to lead researcher Yves-Alexandre de Montjoye, an individual could easily be singled out if specific markers were successfully uncovered. These markers could be discovered using data from as few as three or four transactions.
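To illustrate the idea, here is a minimal, purely hypothetical sketch in Python of how a handful of known transactions can single out one record in a de-identified dataset. The data, field names, and matching rule are invented for illustration; the MIT work measured this "unicity" property over real card metadata at far larger scale.

```python
# Illustrative sketch only: knowing a few transactions can isolate one card
# in a de-identified dataset. Data and field names are hypothetical.
import pandas as pd

transactions = pd.DataFrame({
    "card_id": ["c1", "c1", "c2", "c2", "c3", "c3", "c3"],
    "shop":    ["bakery", "gym", "bakery", "cinema", "bakery", "gym", "cinema"],
    "day":     [1, 2, 1, 3, 1, 2, 3],
})

# The adversary knows the target visited the bakery on day 1, the gym on
# day 2, and the cinema on day 3 (three outside observations).
known = [("bakery", 1), ("gym", 2), ("cinema", 3)]

# Keep only the cards consistent with every known observation.
candidates = set(transactions["card_id"])
for shop, day in known:
    match = transactions[(transactions["shop"] == shop) & (transactions["day"] == day)]
    candidates &= set(match["card_id"])

print(candidates)  # prints {'c3'}: only one card remains, so the target is singled out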
The research demonstrates that data pseudonymization processes are far from foolproof. This is concerning because even in the EU, where the GDPR has massively improved people’s data privacy rights, pseudonymization is touted as a method for firms to process "special category" or sensitive data without breaking the law. Special category data includes genetic data and data concerning health.
Both the new UC Berkeley study and the previous MIT research demonstrate that pseudonymizing data may not be enough to secure it indefinitely. This means even the most forward-thinking data privacy bills may not adequately protect citizens against jigsaw attacks, in which an individual is re-identified by piecing together several otherwise innocuous datasets.
Legislative updates needed
Aswani and his team of researchers have urged the US government to "revisit and rework" existing HIPAA legislation in order to protect people against the dangers created by AI. New regulations that protect health data are needed, but the Berkeley researchers are worried that US policymakers appear to be going in the wrong direction:
"Ideally, what I'd like to see from this are new regulations or rules that protect health data. But there is actually a big push to even weaken the regulations right now. For instance, the rule-making group for HIPAA has requested comments on increasing data sharing. The risk is that if people are not aware of what's happening, the rules we have will be weakened. And, the fact is, the risks of us losing control of our privacy when it comes to health care are actually increasing and not decreasing."
If this story has made you reconsider your own online security, why not take a look at our best VPN services page for ways you can stay secure online.