Google Cloud Professional Machine Learning Engineer — Question 205
You are analyzing customer data for a healthcare organization. The data is stored in Cloud Storage and contains personally identifiable information (PII). You need to perform data exploration and preprocessing while ensuring the security and privacy of sensitive fields. What should you do?
Answer options
- A. Use the Cloud Data Loss Prevention (DLP) API to de-identify the PII before performing data exploration and preprocessing.
- B. Use customer-managed encryption keys (CMEK) to encrypt the PII data at rest, and decrypt the PII data during data exploration and preprocessing.
- C. Use a VM inside a VPC Service Controls security perimeter to perform data exploration and preprocessing.
- D. Use Google-managed encryption keys to encrypt the PII data at rest, and decrypt the PII data during data exploration and preprocessing.
Correct answer: A
Explanation
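The correct answer is A. The Cloud Data Loss Prevention (DLP) API can detect PII and de-identify it (for example by masking, replacing, or tokenizing values) before anyone explores or preprocesses the data, so the sensitive fields are never exposed during analysis. Options B and D only encrypt the data at rest: as soon as the data is decrypted for exploration, the PII is fully visible again, so the choice of customer-managed or Google-managed keys makes no difference to privacy during analysis. Option C limits where the data can be accessed from and helps prevent exfiltration, but the PII remains readable in the clear to anyone working inside the perimeter.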
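The de-identification step in option A can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: it assumes the `google-cloud-dlp` client library is installed and application default credentials are configured. The function names, the project ID, and the chosen infoTypes are hypothetical choices for the example. It replaces each detected value with its infoType name; masking or tokenization would use a different primitive transformation.

```python
def build_deidentify_request(project_id, text,
                             info_types=("EMAIL_ADDRESS", "PHONE_NUMBER", "PERSON_NAME")):
    """Build a DLP deidentify_content request as a plain dict.

    The infoTypes listed here are an assumption for illustration; a real
    healthcare dataset would likely need a broader, reviewed list.
    """
    return {
        "parent": f"projects/{project_id}/locations/global",
        # Which kinds of sensitive data to look for.
        "inspect_config": {"info_types": [{"name": name} for name in info_types]},
        # How to transform each finding: replace it with its infoType name.
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    {"primitive_transformation": {"replace_with_info_type_config": {}}}
                ]
            }
        },
        "item": {"value": text},
    }


def deidentify_text(project_id, text):
    """Send the request to the DLP API and return the de-identified text."""
    # Imported lazily so the request builder above works without the library.
    from google.cloud import dlp_v2

    client = dlp_v2.DlpServiceClient()
    response = client.deidentify_content(
        request=build_deidentify_request(project_id, text)
    )
    return response.item.value
```

Calling `deidentify_text("my-project", "Contact Jane Doe at jane@example.com")` (with a real project ID in place of the placeholder) would return the text with detected names and email addresses replaced by their infoType labels, and exploration then proceeds on that output rather than on the raw records.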