AWS Certified Data Engineer – Associate (DEA-C01) — Question 148
A company needs to load customer data that comes from a third party into an Amazon Redshift data warehouse. The company stores order data and product data in the same data warehouse. The company wants to use the combined dataset to identify potential new customers.
A data engineer notices that one of the fields in the source data includes values that are in JSON format.
How should the data engineer load the JSON data into the data warehouse with the LEAST effort?
Answer options
- A. Use the SUPER data type to store the data in the Amazon Redshift table.
- B. Use AWS Glue to flatten the JSON data and ingest it into the Amazon Redshift table.
- C. Use Amazon S3 to store the JSON data. Use Amazon Athena to query the data.
- D. Use an AWS Lambda function to flatten the JSON data. Store the data in Amazon S3.
Correct answer: A
Explanation
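The correct answer is A. The SUPER data type in Amazon Redshift stores semi-structured data such as JSON in a regular table column without flattening it first, so the JSON field can be loaded as-is and queried alongside the order and product data already in the warehouse. Options B and D both add a processing step (an AWS Glue job or an AWS Lambda function) to flatten the JSON, which means more effort; option D also stores the result in Amazon S3 rather than in Redshift. Option C keeps the data in Amazon S3 and queries it with Amazon Athena, so it never loads the data into the Redshift data warehouse as the question requires.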
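As a rough illustration of option A, the sketch below builds the kind of Redshift SQL that stores a JSON value in a SUPER column and then reads a nested attribute back with dot notation. The table name, column names, and the shape of the JSON (an `address.city` attribute) are hypothetical and do not come from the question. The helper only assembles SQL strings; it does not connect to a cluster.

```python
import json


def build_super_statements(table, customer_id, attributes):
    """Return (create, insert, query) SQL strings for a SUPER column.

    All identifiers here are illustrative assumptions, not names from the
    exam question. The JSON is passed through unchanged: no flattening.
    """
    # Serialize the nested attributes to a JSON string.
    payload = json.dumps(attributes, sort_keys=True)
    # Escape single quotes so the JSON can sit inside a SQL string literal.
    literal = payload.replace("'", "''")

    # A SUPER column holds the semi-structured value as-is.
    create_sql = (
        f"CREATE TABLE {table} (customer_id INT, attributes SUPER);"
    )
    # JSON_PARSE converts the JSON text into a SUPER value on insert.
    insert_sql = (
        f"INSERT INTO {table} VALUES "
        f"({customer_id}, JSON_PARSE('{literal}'));"
    )
    # Nested attributes are navigated with dot notation via a table alias.
    query_sql = (
        f"SELECT c.customer_id, c.attributes.address.city "
        f"FROM {table} AS c;"
    )
    return create_sql, insert_sql, query_sql


if __name__ == "__main__":
    for statement in build_super_statements(
        "third_party_customers", 1, {"address": {"city": "Seattle"}}
    ):
        print(statement)
```

The point of the sketch is that the nested structure never has to be mapped to separate columns before loading, which is what makes this option lower effort than a Glue or Lambda flattening step.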