Can Personal Data ever be Anonymous?

Artificial Intelligence (AI) appears to be developing faster than legislators can keep up with. The General Data Protection Regulation ((EU) 2016/679) (GDPR) raises difficult questions for AI technology, which requires huge and constantly refreshed amounts of data, and which often repurposes that data and draws conclusions from it in ways its users cannot predict.

Some suggest the solution is to “anonymise” data, making it non-personal so that it falls outside the scope of the GDPR. But does truly non-personal data exist?

The Difficult Relationship Between the GDPR and AI

Machine learning, a key branch of AI, can gain insights from data available online which could not easily be gained through human investigation alone. The technology relies on the analysis of “big data”: huge datasets, often including personally identifiable data, which it analyses to reveal patterns, trends and associations, especially relating to human behaviour and interactions.

Many SaaS products and online analytics technologies need to keep identifiable data for extended periods in order to achieve the best possible results, which sits uneasily with the GDPR’s data minimisation obligation (Article 5(1)(c)). It also creates problems around the ability to obtain clear, specific and informed consent as required by the GDPR (Article 7).

AI processes are complex and understandably difficult to comprehend for lay-persons who are not up to date with the technologies driving them. The GDPR’s requirement that data subjects understand the processing of their personal data and its possible consequences is therefore a tough one to meet.

AI can use profiling and automated decision-making without human intervention, which, under the GDPR, brings much stricter obligations for controllers and processors (Article 22). The right to be forgotten (Article 17) can also be at odds with AI processes, which often retain and repurpose data without the data subject even being aware.

Anonymising Data

One suggested solution, which would allow for the development of AI whilst remaining legally compliant, is to find a way to “anonymise” data.

The GDPR applies only to data which is personally identifiable, meaning that if all personally identifiable elements of the data could somehow be removed, the GDPR and its strict obligations would no longer apply.

A number of different techniques were suggested by the Article 29 Working Party in its 2014 opinion on anonymisation techniques. That opinion stated that anonymisation can be compatible with the original purposes of data processing, so long as the anonymisation process reliably produces only anonymised information.

The Article 29 Working Party explained that an effective anonymisation solution prevents any party from singling out an individual in a dataset, from linking records within a dataset or between separate datasets, and from drawing conclusions which could identify a single person.

The Working Party identified three risks which any anonymisation technique must guard against:

  • Singling out: the isolation of records which identify an individual in a dataset

  • Linkability: the ability to link two records concerning the same data subject, or a group of data subjects, whether within one dataset or across separate datasets

  • Inference: deducing the value of an attribute from the values of a set of other attributes

It also assessed techniques designed to mitigate those risks, including:

  • Randomisation: altering the veracity of data so as to weaken the link between the data and the individual

  • Noise addition: a form of randomisation in which data is recorded only to a given margin of accuracy
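By way of illustration only (the dataset, margin and function below are hypothetical, not taken from the Working Party’s opinion), noise addition can be sketched in a few lines of Python: each value is deliberately perturbed so that it is accurate only to within a stated margin, weakening the link between a record and a specific individual while keeping aggregate statistics roughly useful.

```python
import random

def add_noise(ages, margin=5, seed=42):
    """Noise addition: perturb each value so it is accurate only to
    within +/- margin, weakening the link to a specific individual."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [age + rng.randint(-margin, margin) for age in ages]

ages = [23, 37, 41, 58]
noisy = add_noise(ages)
print(noisy)  # each value is within 5 years of the original
```

As the Working Party noted, such techniques only reduce identifiability; whether the result is truly anonymous depends on what other data an attacker holds.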

Issues with “Anonymised” Data

Whilst data must be anonymised so that no person can be identified from it if it is to fall outside the GDPR, there is also a requirement that data is maintained in such a way that its personally identifiable aspects can be recovered, in case a data subject requests them.

In College van burgemeester en wethouders van Rotterdam v M.E.E. Rijkeboer, the Court of Justice of the European Union ruled that any time-limit fixed by Member States on the storage of information about data processing must still allow data subjects to exercise their right of access to the data held on them, past and present.

There is also difficulty in deciding what constitutes personal data. The Article 29 Working Party warned data controllers that it is their responsibility to delete the original identifiable data at event level. Its example was the documentation of the travel movements of a particular person: even if personal details such as name and address were removed, their travel movements could still be used to identify them.

Pseudonymisation has been ruled out as a route to anonymity: even if personal information is replaced with random, false information, conclusions are likely to be drawn from the remaining data which allow for re-identification.
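The risk is easy to demonstrate with a toy example. In the hypothetical sketch below (the records, names and attributes are all invented), a “pseudonymised” dataset is re-identified simply by linking the attributes it still contains, postcode, birth year and sex, to a public register:

```python
# Hypothetical illustration: a "pseudonymised" dataset is re-identified by
# linking shared attributes (quasi-identifiers) to a public register.
pseudonymised = [
    {"id": "user_8f3a", "postcode": "SW1A", "birth_year": 1984, "sex": "F", "diagnosis": "asthma"},
    {"id": "user_c21d", "postcode": "EC2M", "birth_year": 1971, "sex": "M", "diagnosis": "diabetes"},
]
public_register = [
    {"name": "A. Smith", "postcode": "SW1A", "birth_year": 1984, "sex": "F"},
    {"name": "B. Jones", "postcode": "EC2M", "birth_year": 1971, "sex": "M"},
]

def link(records, register, keys=("postcode", "birth_year", "sex")):
    """Linkability in action: match records across datasets on shared attributes."""
    index = {tuple(p[k] for k in keys): p["name"] for p in register}
    return {r["id"]: index.get(tuple(r[k] for k in keys)) for r in records}

print(link(pseudonymised, public_register))
# {'user_8f3a': 'A. Smith', 'user_c21d': 'B. Jones'}
```

Replacing names with random identifiers did nothing here: the combination of remaining attributes was enough to single each person out.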

There is also the simple issue of cost: the time, effort and money which must be poured into developing and implementing a genuine anonymisation solution can be huge.

But does anonymous data really exist?

Researchers at Imperial College London have developed an algorithm which can evaluate the likelihood that personally identifiable information (date of birth, sex, hometown, etc.) can be recovered from anonymised data using machine learning. They found that the methods suggested by the Article 29 Working Party are far from adequate.

Even anonymised data sets can be traced back to individuals using machine learning!

Imperial College London has published an online tool which allows people to see how accurately the model can guess who they are. It shows that, once bought, “anonymised” data can be reverse-engineered to re-identify individuals: the researchers found that 99.98% of Americans could be correctly re-identified using a handful of demographic attributes, even though their data had been anonymised!
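The intuition behind such findings can be sketched with a toy example (the records below are invented): the more attributes are combined, the more likely each record is to be unique in the dataset, and a unique record can be matched back to a real person.

```python
from collections import Counter

# Hypothetical sketch: what fraction of records are unique on a
# combination of demographic attributes (birth date, sex, hometown)?
people = [
    ("1984-03-01", "F", "London"),
    ("1984-03-01", "F", "Leeds"),
    ("1971-07-12", "M", "London"),
    ("1971-07-12", "M", "London"),
]
counts = Counter(people)
unique_fraction = sum(1 for p in people if counts[p] == 1) / len(people)
print(unique_fraction)  # 0.5 -- half the records are uniquely identifiable
```

In real datasets with many more attributes, the unique fraction rapidly approaches 100%, which is why “anonymised” datasets so often remain identifiable.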

AI is a precious tool and it is important for the law to allow for its development and innovation. However, it seems the law could be falling behind, placing obligations on tech entrepreneurs that seem almost impossible to meet given the technologies they use.

Maybe this time, technology doesn’t have the answer?

For more information or for any legal inquiries, don’t hesitate to contact us.

Article by Lily Morrison @ Gerrish Legal, August 2019
