AWS Certified AI Practitioner (AIF-C01) — Question 3

A company wants to use language models to create an application for inference on edge devices. The inference must have the lowest latency possible.
Which solution will meet these requirements?

Answer options

Correct answer: A

Explanation

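The correct answer is A. Small language models (SLMs) have far fewer parameters than large language models (LLMs), so they need less memory and compute per generated token and can run directly on resource-constrained edge devices. Running inference on the device itself also avoids a network round trip to a remote endpoint, which keeps latency as low as possible. LLMs generally require more memory and processing power than edge hardware provides, making them less suitable for edge deployments where speed is critical.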
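
To make the resource argument concrete, here is a rough back-of-the-envelope sketch in Python. It estimates only the memory needed to hold model weights (parameter count times bytes per parameter) and checks it against a device's RAM. The model sizes (1 billion and 70 billion parameters), the 8 GiB device, and the function names are illustrative assumptions, not figures from the exam or from any specific AWS service. Real deployments also need memory for activations and the runtime.

```python
# Rough estimate of weight memory for a language model.
# All model sizes and the device RAM below are hypothetical examples.

GIB = 2**30  # bytes in one gibibyte


def weight_memory_gib(num_params: float, bytes_per_param: float) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


def fits_on_device(num_params: float, bytes_per_param: float,
                   device_ram_gib: float) -> bool:
    """True if the weights alone fit in the device's RAM."""
    return weight_memory_gib(num_params, bytes_per_param) <= device_ram_gib


if __name__ == "__main__":
    device_ram_gib = 8.0  # assumed edge device with 8 GiB of RAM
    models = {
        "hypothetical SLM (1B params)": 1e9,
        "hypothetical LLM (70B params)": 70e9,
    }
    precisions = {"16-bit": 2.0, "4-bit": 0.5}  # bytes per parameter

    for name, params in models.items():
        for label, nbytes in precisions.items():
            size = weight_memory_gib(params, nbytes)
            ok = fits_on_device(params, nbytes, device_ram_gib)
            print(f"{name}, {label}: {size:.1f} GiB, fits: {ok}")
```

Under these assumptions, the small model fits on the device at either precision, while the large model does not fit even when quantized to 4 bits, which is why it would have to be served remotely at the cost of added network latency.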