AWS Certified Big Data – Specialty — Question 18
An administrator needs to design the event log storage architecture for events from mobile devices. The event data will be processed by an Amazon EMR cluster daily for aggregated reporting and analytics before being archived.
How should the administrator recommend storing the log data?
Answer options
- A. Create an Amazon S3 bucket and write log data into folders by device. Execute the EMR job on the device folders.
- B. Create an Amazon DynamoDB table partitioned on the device and sorted on date, and write log data to the table. Execute the EMR job on the Amazon DynamoDB table.
- C. Create an Amazon S3 bucket and write data into folders by day. Execute the EMR job on the daily folder.
- D. Create an Amazon DynamoDB table partitioned on EventID, and write log data to the table. Execute the EMR job on the table.
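Correct answer: C
Explanation
Option C is correct because the EMR job runs daily: with one folder (key prefix) per day, each run reads only that day's prefix instead of scanning the whole bucket, and a processed day can then be archived as a unit, for example with an S3 lifecycle rule that transitions that prefix to Amazon S3 Glacier. Option A organizes the data by device, so a daily job has to read from every device folder to assemble one day's events, even though the reporting is aggregated across devices. Options B and D use DynamoDB, which is designed for low-latency key-value access rather than bulk storage of log data that is scanned once and then archived; reading a full day of events out of a table consumes read capacity and generally costs more than reading objects from S3. Option B would also require querying every device partition for each day, and Option D's EventID partition key gives no way to select one day's events without scanning the entire table.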
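To make the difference between the two S3 layouts concrete, here is a minimal sketch in Python that builds object keys both ways and lists the input prefixes a single daily run would have to read. The bucket name, the `logs/dt=.../` and `logs/device=.../` naming, and all function names are assumptions made for illustration; they are not part of the question.

```python
from datetime import date, datetime, timezone

BUCKET = "example-device-logs"  # hypothetical bucket name


def key_by_day(device_id: str, event_time: datetime, event_id: str) -> str:
    """Object key for a layout with one top-level folder per day."""
    return f"logs/dt={event_time:%Y-%m-%d}/{device_id}/{event_id}.json"


def key_by_device(device_id: str, event_time: datetime, event_id: str) -> str:
    """Object key for a layout with one top-level folder per device."""
    return f"logs/device={device_id}/{event_time:%Y-%m-%d}/{event_id}.json"


def daily_inputs_by_day(day: date) -> list[str]:
    """A daily job reads exactly one prefix in the per-day layout."""
    return [f"s3://{BUCKET}/logs/dt={day:%Y-%m-%d}/"]


def daily_inputs_by_device(day: date, device_ids: set[str]) -> list[str]:
    """A daily job reads one prefix per device in the per-device layout."""
    return [
        f"s3://{BUCKET}/logs/device={d}/{day:%Y-%m-%d}/"
        for d in sorted(device_ids)
    ]


if __name__ == "__main__":
    when = datetime(2024, 3, 5, 12, 0, tzinfo=timezone.utc)
    print(key_by_day("dev-42", when, "e1"))
    print(key_by_device("dev-42", when, "e1"))
    print(daily_inputs_by_day(when.date()))
    print(daily_inputs_by_device(when.date(), {"dev-42", "dev-7", "dev-99"}))
```

With the per-day layout the list of inputs stays at one prefix no matter how many devices report, whereas the per-device layout grows by one prefix for every device, and the job must first know the full set of device IDs.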