AWS Certified Machine Learning – Specialty — Question 67
A Machine Learning Specialist needs to move and transform data in preparation for training. Some of the data needs to be processed in near-real time, and other data can be moved hourly. There are existing Amazon EMR MapReduce jobs that clean the data and perform feature engineering on it.
Which of the following services can feed data to the MapReduce jobs? (Choose two.)
Answer options
- A. AWS DMS
- B. Amazon Kinesis
- C. AWS Data Pipeline
- D. Amazon Athena
- E. Amazon ES
Correct answer: B, C
Explanation
Amazon Kinesis ingests streaming data, which covers the near-real-time requirement; Amazon EMR can read Kinesis streams directly with Hadoop ecosystem tools, including MapReduce. AWS Data Pipeline orchestrates data movement and transformation on a schedule and can run EMR jobs as pipeline activities, which covers the hourly requirement. The other options are not designed to feed EMR MapReduce jobs: AWS DMS migrates and replicates databases, Amazon Athena runs interactive SQL queries over data in Amazon S3, and Amazon ES is a search and analytics service.
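To make the two ingestion paths concrete, here is a minimal Python sketch under stated assumptions. It only builds the payloads and sends nothing to AWS. The stream name, object ids, bucket paths, and jar name are hypothetical placeholders, not part of the question. The first helper prepares arguments in the shape expected by the boto3 Kinesis `put_record` call (`StreamName`, `Data`, `PartitionKey`); the second builds a Data Pipeline definition with an hourly `Schedule` and an `EmrActivity`.

```python
import json


def kinesis_record_kwargs(event: dict, stream_name: str = "example-events") -> dict:
    """Build keyword arguments for a Kinesis put_record call (nothing is sent).

    The stream name and the use of event["id"] as partition key are
    illustrative assumptions.
    """
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),  # Kinesis record payload is bytes
        "PartitionKey": str(event["id"]),
    }


def hourly_pipeline_definition() -> dict:
    """Build a Data Pipeline definition that runs an EMR step every hour.

    All ids, the start time, and the S3 paths are hypothetical placeholders.
    """
    return {
        "objects": [
            {
                "id": "HourlySchedule",
                "type": "Schedule",
                "period": "1 hours",
                "startDateTime": "2024-01-01T00:00:00",
            },
            {
                "id": "ExampleEmrCluster",
                "type": "EmrCluster",
                "schedule": {"ref": "HourlySchedule"},
            },
            {
                "id": "CleanAndFeaturize",
                "type": "EmrActivity",
                "schedule": {"ref": "HourlySchedule"},
                "runsOn": {"ref": "ExampleEmrCluster"},
                # Comma-separated jar and arguments for the existing MapReduce job.
                "step": "s3://example-bucket/jobs/clean-and-featurize.jar,"
                        "s3://example-bucket/input/,"
                        "s3://example-bucket/output/",
            },
        ]
    }


if __name__ == "__main__":
    print(kinesis_record_kwargs({"id": 7, "value": 1.5}))
    print(json.dumps(hourly_pipeline_definition(), indent=2))
```

In practice, the first dictionary could be passed to a boto3 Kinesis client as `client.put_record(**kwargs)`, and the second could be saved as JSON and supplied as a pipeline definition. Both steps need AWS credentials and real resource names, and a complete definition would typically also include a default object with roles and a log location.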