Google Cloud Professional Data Engineer — Question 218
You are preparing an organization-wide dataset. You need to preprocess customer data stored in a restricted bucket in Cloud Storage. The data will be used to create consumer analyses. You need to comply with data privacy requirements.
What should you do?
Answer options
- A. Use Dataflow and the Cloud Data Loss Prevention API to mask sensitive data. Write the processed data in BigQuery.
- B. Use customer-managed encryption keys (CMEK) to directly encrypt the data in Cloud Storage. Use federated queries from BigQuery. Share the encryption key by following the principle of least privilege.
- C. Use the Cloud Data Loss Prevention API and Dataflow to detect and remove sensitive fields from the data in Cloud Storage. Write the filtered data in BigQuery.
- D. Use Dataflow and Cloud KMS to encrypt sensitive fields and write the encrypted data in BigQuery. Share the encryption key by following the principle of least privilege.
Correct answer: A
Explanation
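The correct answer is A. Dataflow can read the files from the restricted bucket and call the Cloud Data Loss Prevention API to mask sensitive values before the records are written to BigQuery. Masking de-identifies the data while keeping every field in place, so the dataset stays usable for consumer analyses and can be shared across the organization. Option B uses CMEK, which protects the data at rest but does not de-identify it: anyone with access to the key still sees the raw customer data, and no preprocessing takes place. Option D encrypts the sensitive fields, but consumers who receive the key can decrypt the original values, and consumers without the key cannot use those fields at all. Option C is close to A, but it removes the sensitive fields instead of masking them, which may discard columns the analyses depend on.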
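To make the masking step in option A more concrete, here is a minimal sketch of the request body that a Dataflow step might send to the Cloud DLP `deidentify_content` method. The function name `build_deidentify_request`, the project ID, the masking character, and the choice of `EMAIL_ADDRESS` and `PHONE_NUMBER` as info types are illustrative assumptions, not part of the question. The block only builds the request dictionary; the actual API call is shown in comments because it needs the `google-cloud-dlp` package and valid credentials.

```python
def build_deidentify_request(project_id: str, text: str) -> dict:
    """Build a Cloud DLP deidentify_content request that masks detected values.

    Assumption: the info types and masking character are examples only.
    """
    return {
        "parent": f"projects/{project_id}/locations/global",
        # What to look for in the input text.
        "inspect_config": {
            "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "PHONE_NUMBER"}],
        },
        # How to transform each finding: replace its characters with "#".
        # Leaving number_to_mask unset masks every character of the finding.
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    {
                        "primitive_transformation": {
                            "character_mask_config": {"masking_character": "#"}
                        }
                    }
                ]
            }
        },
        "item": {"value": text},
    }


request = build_deidentify_request("my-project", "Contact: jane@example.com")

# With credentials configured, the request could be sent like this:
#   from google.cloud import dlp_v2
#   client = dlp_v2.DlpServiceClient()
#   response = client.deidentify_content(request=request)
#   masked_text = response.item.value
```

In a Dataflow pipeline, a call like this would typically sit inside a transform applied to each record or batch of records before the write to BigQuery. The surrounding text and non-sensitive fields pass through unchanged, which is what distinguishes masking from the field removal described in option C.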