Certified Information Privacy Technologist (CIPT) — Question 74

Aadhaar is a 12-digit unique identity number issued to all Indian residents based on their biometric and demographic data. The data is collected by the Unique Identification Authority of India. The Aadhaar database contains the Aadhaar number, name, date of birth, gender, and address of over 1 billion individuals.
Which of the following datasets derived from that data would be considered the most de-identified?

Answer options

Correct answer: D

Explanation

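Option D is the most de-identified because it generalizes the date of birth to a very broad category (century) and hashes only a small, non-distinctive part of the identifier (the last 3 digits of the Aadhaar number). Three digits can take at most 1,000 distinct values, so in a database of over 1 billion individuals each value is shared by roughly a million people on average, which makes it hard to single out any one of them. In contrast, the other options retain more specific information, such as the month or day of birth or parts of names, which can be linked back to individuals more easily.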
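
The reasoning above can be illustrated with a minimal Python sketch. It is not part of the exam material: the function names (`century_of`, `hash_last3`, `deidentify`), the choice of SHA-256, and the two sample records are all assumptions made up for illustration. It shows how generalizing to a century and hashing only the last 3 digits makes two different people produce the same output record.

```python
import hashlib
from datetime import date


def century_of(dob: date) -> int:
    """Generalize a date of birth to its century (1987 -> 20, 2001 -> 21)."""
    return (dob.year - 1) // 100 + 1


def hash_last3(aadhaar: str) -> str:
    """Return the SHA-256 hex digest of the last 3 digits of a 12-digit number.

    SHA-256 is an assumption; the source does not name a hash function.
    """
    digits = aadhaar.replace(" ", "")
    if len(digits) != 12 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a 12-digit number")
    return hashlib.sha256(digits[-3:].encode("ascii")).hexdigest()


def deidentify(aadhaar: str, dob: date) -> dict:
    """Build a de-identified record: century plus hash of the last 3 digits."""
    return {"century": century_of(dob), "id_hash": hash_last3(aadhaar)}


# Two made-up records (not real Aadhaar numbers) for different people.
person_a = deidentify("1234 5678 9012", date(1987, 6, 14))
person_b = deidentify("9999 0000 1012", date(1990, 1, 2))

# Both share the last 3 digits and the century, so they are indistinguishable.
same_record = person_a == person_b

# The hashed field can take at most 1,000 distinct values (000 to 999).
possible_hashes = {
    hashlib.sha256(f"{i:03d}".encode("ascii")).hexdigest() for i in range(1000)
}
```

Note that a hash over only 1,000 possible inputs is easy to reverse by trying every candidate, so in this sketch the protection comes from truncating the identifier to 3 digits rather than from the hash itself.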