AWS Certified AI Practitioner (AIF-C01) — Question 2
A company uses Amazon SageMaker for its ML pipeline in a production environment. Input payloads can be as large as 1 GB, and processing a single request can take up to 1 hour. The company needs near real-time latency.
Which SageMaker inference option meets these requirements?
Answer options
- A. Real-time inference
- B. Serverless inference
- C. Asynchronous inference
- D. Batch transform
Correct answer: C
Explanation
Asynchronous inference is the only option that satisfies all three requirements. It queues incoming requests and processes them asynchronously, supports payloads up to 1 GB and processing times up to 1 hour, and still returns results with near real-time latency. Real-time inference and serverless inference are both built for small payloads (a few megabytes) and short processing times (about 60 seconds per request), so neither can handle a 1 GB input or an hour-long job. Batch transform can process large datasets, but it runs as an offline bulk job and is not intended for low-latency responses.
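
The elimination reasoning above can be sketched as a small filter over the four options. This is an illustrative sketch only: the `Option` class, the `candidates` function, and the numeric limits are assumptions made for this example (the real-time and serverless figures in particular are approximate and should be checked against the current AWS documentation), and nothing here calls a SageMaker API.

```python
from dataclasses import dataclass
from typing import List, Optional

MB = 1024 ** 2
GB = 1024 ** 3


@dataclass(frozen=True)
class Option:
    """Hypothetical summary of one SageMaker inference option."""
    name: str
    max_payload_bytes: Optional[int]  # None: no per-request cap modeled here
    max_processing_s: Optional[int]   # None: no per-request cap modeled here
    near_real_time: bool


# Assumed, approximate limits used only for illustration.
OPTIONS = [
    Option("Serverless inference", 4 * MB, 60, True),
    Option("Real-time inference", 6 * MB, 60, True),
    Option("Asynchronous inference", 1 * GB, 3600, True),
    Option("Batch transform", None, None, False),
]


def candidates(payload_bytes: int, processing_s: int,
               needs_near_real_time: bool) -> List[str]:
    """Return the names of the options that satisfy every requirement."""
    result = []
    for opt in OPTIONS:
        if needs_near_real_time and not opt.near_real_time:
            continue  # offline bulk processing cannot meet the latency need
        if opt.max_payload_bytes is not None and payload_bytes > opt.max_payload_bytes:
            continue  # payload too large for this option
        if opt.max_processing_s is not None and processing_s > opt.max_processing_s:
            continue  # request would exceed the processing time limit
        result.append(opt.name)
    return result


# The scenario in the question: 1 GB payload, 1 hour processing, near real-time.
print(candidates(1 * GB, 3600, True))
```

With the question's requirements, only asynchronous inference passes all three checks. Dropping the near real-time requirement would also admit batch transform, which is why the latency requirement is the detail that rules out option D.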