Certified Information Privacy Technologist (CIPT) — Question 74
Aadhaar is a 12-digit unique identity number issued to all Indian residents based on their biometric and demographic data. The data is collected by the Unique Identification Authority of India. The Aadhaar database contains the Aadhaar number, name, date of birth, gender and address of over 1 billion individuals.
Which of the following datasets derived from that data would be considered the most de-identified?
Answer options
- A. A count of the years of birth and hash of the person's gender.
- B. A count of the month of birth and hash of the person's first name.
- C. A count of the day of birth and hash of the first initial of the person's first name.
- D. A count of the century of birth and hash of the last 3 digits of the person's Aadhaar number.
Correct answer: D
Explanation
Option D is the most de-identified because it combines the broadest date category (century of birth) with a hash of a less distinctive element: the last 3 digits of the Aadhaar number, which can take only 1,000 values. Across more than 1 billion individuals, each combination is shared by a very large number of people, which makes it harder to single anyone out. In contrast, the other options involve more specific information, such as the month or day of birth or parts of a name, which can be linked back to individuals more easily.
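
The reasoning behind option D can be illustrated with a minimal Python sketch. It is not part of the exam material. The helper names `hash_last3` and `century_of_birth`, the choice of SHA-256, and the sample numbers are all assumptions made for illustration; the sample numbers are made up and are not real Aadhaar numbers. The average at the end assumes exactly 1 billion records spread evenly across suffixes, which is only a rough approximation.

```python
import hashlib


def hash_last3(aadhaar_number: str) -> str:
    """Return the SHA-256 hex digest of the last 3 digits of a 12-digit number."""
    digits = aadhaar_number.replace(" ", "")
    if len(digits) != 12 or not digits.isdigit():
        raise ValueError("expected a 12-digit number")
    return hashlib.sha256(digits[-3:].encode("ascii")).hexdigest()


def century_of_birth(year: int) -> int:
    """Return the century as an ordinal, e.g. 1987 -> 20, 2004 -> 21."""
    return (year - 1) // 100 + 1


# Made-up numbers for illustration only; both end in "012".
a = hash_last3("1234 5678 9012")
b = hash_last3("9999 0000 1012")
same_bucket = a == b  # different people, identical hash value

# Every possible value of this field: one hash per 3-digit suffix.
distinct_hashes = len(
    {hashlib.sha256(f"{i:03d}".encode("ascii")).hexdigest() for i in range(1000)}
)

# Rough average number of people sharing one hash value (assumed even spread).
avg_per_hash = 1_000_000_000 // distinct_hashes

print(same_bucket, distinct_hashes, avg_per_hash, century_of_birth(1987))
```

Because the hashed field has so few possible values, the hash itself adds little protection (all 1,000 inputs can be enumerated). Under these assumptions, the de-identification comes from how many individuals share each century and suffix combination.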