AWS Certified Big Data – Specialty — Question 3
A company hosts a portfolio of e-commerce websites across the Oregon, N. Virginia, Ireland, and Sydney AWS regions. Each site keeps log files that capture user behavior. The company has built an application that generates batches of product recommendations with collaborative filtering in Oregon. Oregon was selected because the flagship site is hosted there and provides the largest collection of data to train machine learning models against. The other regions do NOT have enough historic data to train accurate machine learning models.
Which set of data processing steps improves recommendations for each region?
Answer options
- A. Use the e-commerce application in Oregon to write replica log files in each other region.
- B. Use Amazon S3 bucket replication to consolidate log entries and build a single model in Oregon.
- C. Use Kinesis as a buffer for web logs and replicate logs to the Kinesis stream of a neighboring region.
- D. Use the CloudWatch Logs agent to consolidate logs into a single CloudWatch Logs group.
Correct answer: D
Explanation
The correct answer is D. With the CloudWatch Logs agent installed on the web servers in every region and pointed at one CloudWatch Logs group, the user-behavior logs from Oregon, N. Virginia, Ireland, and Sydney are consolidated in a single place. That makes the combined data easier to analyze and gives the recommendation application in Oregon more than the flagship site's logs to work with. The other options either do not effectively consolidate the data or rely on processes that do not improve the model's training capability across regions.
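
As a rough illustration of the consolidation described in the explanation, the Python sketch below renders the two INI-style files that the older CloudWatch Logs agent (awslogs) reads, so that an instance in any region ships its logs to one log group in Oregon. The region codes, the log file path, the log group name `ecommerce-user-behavior`, and the helper names `render` and `agent_files` are all assumptions made for this example; none of them come from the question.

```python
import configparser
import io

# Assumptions (not from the question): region codes, log path, group name.
SITE_REGIONS = {
    "Oregon": "us-west-2",
    "N. Virginia": "us-east-1",
    "Ireland": "eu-west-1",
    "Sydney": "ap-southeast-2",
}
CENTRAL_REGION = "us-west-2"                 # where the recommendation app runs
LOG_GROUP = "ecommerce-user-behavior"        # hypothetical shared log group
LOG_FILE = "/var/log/ecommerce/access.log"   # hypothetical log file path


def render(sections):
    """Serialize a dict of {section: {key: value}} to INI text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_dict(sections)
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def agent_files(site_region):
    """Return (awscli.conf, awslogs.conf) text for an instance in site_region."""
    # awscli.conf: the region the agent sends log events to (always Oregon here).
    awscli_conf = render({
        "plugins": {"cwlogs": "cwlogs"},
        "default": {"region": CENTRAL_REGION},
    })
    # awslogs.conf: which file to ship, and the shared group it lands in.
    # The stream name keeps the originating region visible per instance.
    awslogs_conf = render({
        "general": {"state_file": "/var/awslogs/state/agent-state"},
        LOG_FILE: {
            "file": LOG_FILE,
            "log_group_name": LOG_GROUP,
            "log_stream_name": site_region + "/{instance_id}",
        },
    })
    return awscli_conf, awslogs_conf


if __name__ == "__main__":
    for site, region in SITE_REGIONS.items():
        cli_text, logs_text = agent_files(region)
        print("# " + site)
        print(cli_text + logs_text)
```

Every site gets the same destination region and log group, and only the stream name prefix differs, so entries from all four sites end up in one group while remaining attributable to their source region.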