AWS Certified Machine Learning – Specialty — Question 96
A manufacturer is operating a large number of factories with a complex supply chain relationship where unexpected downtime of a machine can cause production to stop at several factories. A data scientist wants to analyze sensor data from the factories to identify equipment in need of preemptive maintenance and then dispatch a service team to prevent unplanned downtime. The sensor readings from a single machine can include up to 200 data points including temperatures, voltages, vibrations, RPMs, and pressure readings.
To collect this sensor data, the manufacturer deployed Wi-Fi and LANs across the factories. Even though many factory locations do not have reliable or high-speed internet connectivity, the manufacturer would like to maintain near-real-time inference capabilities.
Which deployment architecture for the model will address these business requirements?
Answer options
- A. Deploy the model in Amazon SageMaker. Run sensor data through this model to predict which machines need maintenance.
- B. Deploy the model on AWS IoT Greengrass in each factory. Run sensor data through this model to infer which machines need maintenance.
- C. Deploy the model to an Amazon SageMaker batch transformation job. Generate inferences in a daily batch report to identify machines that need maintenance.
- D. Deploy the model in Amazon SageMaker and use an IoT rule to write data to an Amazon DynamoDB table. Consume a DynamoDB stream from the table with an AWS Lambda function to invoke the endpoint.
Correct answer: B
Explanation
The correct answer is B. AWS IoT Greengrass runs the model on hardware inside each factory, so sensor data arriving over the local Wi-Fi and LAN can be scored on site, and near-real-time inference continues even when the internet connection is slow or unavailable. Option A hosts the model in Amazon SageMaker in the cloud, so every prediction requires a round trip over the internet, which many factories cannot rely on. Option C produces a daily batch report, which falls well short of near-real-time inference. Option D also depends on the cloud: each reading must pass through an IoT rule, a DynamoDB table, a DynamoDB stream, and a Lambda function before it reaches the SageMaker endpoint, so it inherits the connectivity problem of option A and adds extra hops and latency.
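
The edge pattern behind option B can be illustrated with a minimal Python sketch. It does not use the AWS IoT Greengrass API. The names `predict_needs_maintenance`, `EdgeInferenceNode`, and `uplink`, the field names `vibration_mm_s` and `temperature_c`, and the threshold values are all hypothetical stand-ins chosen for illustration. The sketch only shows the idea that inference runs locally on every reading, while alerts destined for the cloud are queued until the internet link is available.

```python
from collections import deque
from typing import Callable, Deque, Dict


def predict_needs_maintenance(reading: Dict[str, float]) -> bool:
    """Hypothetical stand-in for a trained model running on the edge device.

    A real deployment would load a trained model; this toy rule simply
    flags high vibration or high temperature.
    """
    return reading["vibration_mm_s"] > 7.0 or reading["temperature_c"] > 90.0


class EdgeInferenceNode:
    """Scores readings locally and queues alerts while the uplink is down."""

    def __init__(
        self,
        model: Callable[[Dict[str, float]], bool],
        uplink: Callable[[str], bool],
    ) -> None:
        self.model = model
        # uplink(machine_id) returns True if the alert reached the cloud.
        self.uplink = uplink
        self.pending: Deque[str] = deque()

    def process(self, machine_id: str, reading: Dict[str, float]) -> bool:
        # Inference happens on site, so it does not wait on the internet.
        flagged = self.model(reading)
        if flagged:
            self.pending.append(machine_id)
        self.flush()
        return flagged

    def flush(self) -> None:
        # Send queued alerts in order; stop at the first failed send.
        while self.pending:
            if not self.uplink(self.pending[0]):
                break
            self.pending.popleft()
```

With an `uplink` function that returns `False` while the connection is down, `process` still returns a prediction immediately, and the flagged machine IDs stay in `pending` until a later `flush` succeeds. Options A and D cannot behave this way, because their prediction step itself sits on the far side of the internet link.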