AWS Certified Data Engineer – Associate (DEA-C01) — Question 150
A company stores customer data in an Amazon S3 bucket. Multiple teams in the company want to use the customer data for downstream analysis. The company needs to ensure that the teams do not have access to personally identifiable information (PII) about the customers.
Which solution will meet this requirement with LEAST operational overhead?
Answer options
- A. Use Amazon Macie to create and run a sensitive data discovery job to detect and remove PII.
- B. Use S3 Object Lambda to access the data, and use Amazon Comprehend to detect and remove PII.
- C. Use Amazon Data Firehose and Amazon Comprehend to detect and remove PII.
- D. Use an AWS Glue DataBrew job to store the PII data in a second S3 bucket. Perform analysis on the data that remains in the original S3 bucket.
Correct answer: B
Explanation
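Option B is correct because S3 Object Lambda transforms data on the fly: a Lambda function attached to an Object Lambda Access Point intercepts each GET request and can call Amazon Comprehend to detect and redact PII before the object is returned to the teams. The original objects stay unchanged in the bucket, no redacted copy has to be stored or kept in sync, and AWS provides a prebuilt Lambda function for Comprehend-based PII redaction, which keeps operational overhead low. Option A is incorrect because Macie discovers and reports sensitive data in S3 but does not remove or redact it. Option C is incorrect because Amazon Data Firehose is a streaming delivery service, so data already stored in S3 would have to be re-ingested through a new pipeline. Option D requires building and maintaining DataBrew jobs and a second bucket, which adds more overhead than redacting at read time.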
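
To make the on-the-fly redaction in option B concrete, here is a minimal Python sketch of what such a Lambda function might look like. It is an illustration under stated assumptions, not AWS's prebuilt function: the names `redact` and `handler` are hypothetical, objects are assumed to be UTF-8 English text small enough for a single Comprehend call, and error handling is omitted. The event fields (`getObjectContext`, `inputS3Url`, `outputRoute`, `outputToken`) and the boto3 calls `detect_pii_entities` and `write_get_object_response` are the ones S3 Object Lambda and Comprehend expose.

```python
import urllib.request


def redact(text, entities):
    """Replace each PII span with its entity type, e.g. [EMAIL].

    `entities` is assumed to have the shape of the "Entities" list returned
    by Comprehend's detect_pii_entities (Type, BeginOffset, EndOffset).
    """
    # Work from the end of the string so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = (
            text[: ent["BeginOffset"]]
            + "[" + ent["Type"] + "]"
            + text[ent["EndOffset"]:]
        )
    return text


def handler(event, context):
    # Imported here so redact() can be tried locally without AWS access.
    import boto3

    ctx = event["getObjectContext"]

    # S3 Object Lambda supplies a presigned URL for the original object.
    with urllib.request.urlopen(ctx["inputS3Url"]) as resp:
        original = resp.read().decode("utf-8")

    # Assumption: English text within Comprehend's per-request size limit.
    found = boto3.client("comprehend").detect_pii_entities(
        Text=original, LanguageCode="en"
    )

    # Return the redacted body to the caller; the stored object is untouched.
    boto3.client("s3").write_get_object_response(
        Body=redact(original, found["Entities"]).encode("utf-8"),
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
    )
    return {"statusCode": 200}
```

Teams would read through the Object Lambda Access Point rather than the bucket itself, so every GET passes through this function and only the redacted text leaves S3. In practice, deploying the AWS-provided redaction function avoids writing and maintaining even this much code.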