A company is building a predictive maintenance system using real-time data from devices o…

Question

A company is building a predictive maintenance system using real-time data from devices on remote sites. There is no AWS Direct Connect connection or VPN connection between the sites and the company's VPC. The data needs to be ingested in real time from the devices into Amazon S3. Transformation is needed to convert the raw data into clean .csv data to be fed into the machine learning (ML) model. The transformation needs to happen during the ingestion process. When transformation fails, the records need to be stored in a specific location in Amazon S3 for human review. The raw data before transformation also needs to be stored in Amazon S3. How should an ML specialist architect the solution to meet these requirements with the LEAST effort?

Accepted Answer

Correct answer: A. Use Amazon Data Firehose with Amazon S3 as the destination. Configure Firehose to invoke an AWS Lambda function for data transformation. Enable source record backup on Firehose.

Amazon Data Firehose natively supports inline data transformation with AWS Lambda and can back up the raw source records to an S3 bucket before they are transformed. When a transformation fails, Firehose delivers the failed records to a separate error prefix in S3, where they can be reviewed. Devices send data to the public Firehose endpoint, so no AWS Direct Connect or VPN connection is required. This covers every requirement with managed features and almost no custom infrastructure. Alternatives built on Amazon MSK, ECS workers, or Kinesis Data Streams add components that the team would have to build, run, and scale itself, which means more effort for the same result.
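To make the transformation step concrete, here is a minimal sketch of a Lambda function that Firehose could invoke. It follows the Firehose transformation contract: each incoming record carries a `recordId` and base64-encoded `data`, and each returned record must echo the `recordId` with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The JSON payload shape and the field names (`device_id`, `timestamp`, `temperature`) are assumptions for illustration only; the question does not say what the devices send.

```python
import base64
import csv
import io
import json

# Hypothetical field names; the question does not specify the device payload.
CSV_FIELDS = ["device_id", "timestamp", "temperature"]


def to_csv_line(payload):
    """Render one reading as a single CSV line.

    Raises KeyError if a field is missing and TypeError if the payload
    is not a JSON object.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([payload[field] for field in CSV_FIELDS])
    return buffer.getvalue()


def lambda_handler(event, context):
    """Transform each Firehose record from JSON (assumed) to CSV."""
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            line = to_csv_line(payload)
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(line.encode("utf-8")).decode("ascii"),
            })
        except (ValueError, KeyError, TypeError):
            # Marking the record as failed makes Firehose deliver it to the
            # error location in S3 instead of the main destination prefix.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

The function only decides whether each record succeeded or failed. Writing the clean .csv output, the failed records, and the raw backup to their S3 locations is handled by Firehose itself, which is why this option needs so little custom code.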

AWS Certified Machine Learning – Specialty — Question 352

Explanation