AWS Certified Data Engineer – Associate (DEA-C01) — Question 199
A company has a JSON file that contains personally identifiable information (PII) data and non-PII data. The company needs to make the data available for querying and analysis.
The non-PII data must be available to everyone in the company. The PII data must be available only to a limited group of employees.
Which solution will meet these requirements with the LEAST operational overhead?
Answer options
- A. Store the JSON file in an Amazon S3 bucket. Configure AWS Glue to split the file into one file that contains the PII data and one file that contains the non-PII data. Store the output files in separate S3 buckets. Grant the required access to the buckets based on the type of user.
- B. Store the JSON file in an Amazon S3 bucket. Use Amazon Macie to identify PII data and to grant access based on the type of user.
- C. Store the JSON file in an Amazon S3 bucket. Catalog the file schema in AWS Lake Formation. Use Lake Formation permissions to provide access to the required data based on the type of user.
- D. Create two Amazon RDS PostgreSQL databases. Load the PII data and the non-PII data into the separate databases. Grant access to the databases based on the type of user.
Correct answer: C
Explanation
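Option C is the best choice. Once the file's schema is cataloged, AWS Lake Formation can grant permissions at the column level, so a single copy of the data in Amazon S3 serves both audiences: everyone is granted the non-PII columns, and only the limited group is also granted the PII columns. Options A and D carry more operational overhead. Option A requires building and maintaining an AWS Glue job that splits the file, plus two buckets with separate access policies. Option D requires provisioning, loading, and managing two databases. Option B does not meet the requirement because Amazon Macie discovers and classifies sensitive data in Amazon S3 but does not grant or restrict access to it.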
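To make the column-level idea in option C concrete, here is a minimal Python sketch that builds the request parameters for the Lake Formation `GrantPermissions` API (`grant_permissions` in boto3). It is an illustration under stated assumptions, not part of the original question: the database name, table name, PII column names, and role ARNs are all hypothetical, and the helper names `build_column_grant` and `apply_grants` are invented for this example.

```python
from typing import Any, Dict, List, Optional

# Hypothetical names, chosen only for illustration.
DATABASE = "company_data"
TABLE = "customers"
PII_COLUMNS = ["ssn", "email"]
EVERYONE_ROLE = "arn:aws:iam::111122223333:role/AllEmployees"
PII_ROLE = "arn:aws:iam::111122223333:role/PiiAnalysts"


def build_column_grant(
    principal_arn: str,
    database: str,
    table: str,
    *,
    include: Optional[List[str]] = None,
    exclude: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a column-level SELECT grant.

    Pass `include` to grant only the listed columns, `exclude` to grant
    every column except the listed ones, or neither to grant all columns.
    """
    if include and exclude:
        raise ValueError("pass either include or exclude, not both")

    table_with_columns: Dict[str, Any] = {"DatabaseName": database, "Name": table}
    if include:
        table_with_columns["ColumnNames"] = list(include)
    else:
        # An empty ExcludedColumnNames list means "all columns".
        table_with_columns["ColumnWildcard"] = {
            "ExcludedColumnNames": list(exclude or [])
        }

    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {"TableWithColumns": table_with_columns},
        "Permissions": ["SELECT"],
    }


# Everyone sees all columns except the PII ones; the limited group sees all.
GRANTS = [
    build_column_grant(EVERYONE_ROLE, DATABASE, TABLE, exclude=PII_COLUMNS),
    build_column_grant(PII_ROLE, DATABASE, TABLE),
]


def apply_grants(grants: List[Dict[str, Any]]) -> None:
    """Send each grant to Lake Formation (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builders above work without AWS access

    client = boto3.client("lakeformation")
    for grant in grants:
        client.grant_permissions(**grant)
```

Calling `apply_grants(GRANTS)` would issue the two grants against a real account. The sketch keeps request building separate from the API call so the permission layout can be reviewed without touching AWS. Both groups query the same table, which is why no file splitting or second data store is needed.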