AWS Certified Solutions Architect – Associate (SAA-C03) — Question 713
A company stores text files in Amazon S3. The text files include customer chat messages, date and time information, and customer personally identifiable information (PII).
The company needs a solution to provide samples of the conversations to an external service provider for quality control. The external service provider needs to randomly pick sample conversations up to the most recent conversation. The company must not share the customer PII with the external service provider. The solution must scale when the number of customer conversations increases.
Which solution will meet these requirements with the LEAST operational overhead?
Answer options
- A. Create an Object Lambda Access Point. Create an AWS Lambda function that redacts the PII when the function reads the file. Instruct the external service provider to access the Object Lambda Access Point.
- B. Create a batch process on an Amazon EC2 instance that regularly reads all new files, redacts the PII from the files, and writes the redacted files to a different S3 bucket. Instruct the external service provider to access the bucket that does not contain the PII.
- C. Create a web application on an Amazon EC2 instance that presents a list of the files, redacts the PII from the files, and allows the external service provider to download new versions of the files that have the PII redacted.
- D. Create an Amazon DynamoDB table. Create an AWS Lambda function that reads only the data in the files that does not contain PII. Configure the Lambda function to store the non-PII data in the DynamoDB table when a new file is written to Amazon S3. Grant the external service provider access to the DynamoDB table.
Correct answer: A
Explanation
Amazon S3 Object Lambda Access Points run custom AWS Lambda code during S3 GET requests, so PII can be redacted on the fly as each object is read. No redacted copies need to be stored or kept in sync, the external service provider can sample any conversation up to the most recent one, and the solution scales with request volume without servers to manage. The options built on Amazon EC2 instances require managing, patching, and scaling servers, which increases operational burden. Extracting data to Amazon DynamoDB introduces unnecessary architectural complexity, data duplication, and additional storage costs compared with reading the files from S3 directly.
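
As a rough illustration of option A, the following is a minimal sketch of a Lambda function behind an Object Lambda Access Point. It assumes the files are UTF-8 text and that PII is limited to email addresses and US-style phone numbers. The regular expressions and the `redact_pii` helper are hypothetical stand-ins, and a real deployment would need far more thorough PII detection. The handler reads the original object through the presigned URL that S3 passes in `getObjectContext`, then returns the redacted body with `write_get_object_response`.

```python
import re
import urllib.request

# Hypothetical, deliberately simple patterns (assumption: PII is only
# email addresses and US-style phone numbers). Timestamps are left intact.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def redact_pii(text: str) -> str:
    """Replace matched emails and phone numbers with placeholder tokens."""
    text = EMAIL_RE.sub("[REDACTED_EMAIL]", text)
    return PHONE_RE.sub("[REDACTED_PHONE]", text)


def handler(event, context):
    """Entry point invoked by S3 Object Lambda on each GET request."""
    import boto3  # imported here so redact_pii can be tested without boto3

    ctx = event["getObjectContext"]

    # Fetch the original object via the presigned URL supplied by S3.
    with urllib.request.urlopen(ctx["inputS3Url"]) as resp:
        original = resp.read().decode("utf-8")

    # Send the transformed object back to the caller of the access point.
    boto3.client("s3").write_get_object_response(
        Body=redact_pii(original).encode("utf-8"),
        RequestRoute=ctx["outputRoute"],
        RequestToken=ctx["outputToken"],
    )
    return {"statusCode": 200}
```

Because the redaction happens at read time, the external service provider only ever receives the transformed response, and the original files in the bucket stay unchanged.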