AWS Certified AI Practitioner (AIF-C01) — Question 3
A company wants to use language models to create an application for inference on edge devices. The inference must have the lowest latency possible.
Which solution will meet these requirements?
Answer options
- A. Deploy optimized small language models (SLMs) on edge devices.
- B. Deploy optimized large language models (LLMs) on edge devices.
- C. Incorporate a centralized small language model (SLM) API for asynchronous communication with edge devices.
- D. Incorporate a centralized large language model (LLM) API for asynchronous communication with edge devices.
Correct answer: A
Explanation
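The correct answer is A. Small language models (SLMs) have far fewer parameters than large language models (LLMs), so they need less memory and compute per request and can run within the constraints of edge hardware. Running inference on the device also removes the network round trip from every request. Option B is a poorer fit because LLMs, even when optimized, typically need more memory and processing power than edge devices provide, which slows inference. Options C and D send every request to a centralized API, which adds network latency, and asynchronous communication adds further delay before a response is returned.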
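The latency reasoning can be sketched as a simple additive model: end-to-end latency is roughly model compute time plus any network round trip plus any queueing delay from asynchronous handling. The Python sketch below is illustrative only. The `Option` class, the `lowest_latency` helper, and every millisecond value are hypothetical placeholders chosen to show the shape of the comparison, not measurements of any real model or device.

```python
from dataclasses import dataclass


@dataclass
class Option:
    name: str
    compute_ms: float      # assumed time to run the model on its host hardware
    network_rtt_ms: float  # assumed round trip to a central API; 0 when on-device
    queue_ms: float        # assumed wait added by asynchronous handling; 0 when on-device

    def total_ms(self) -> float:
        # Simplified model: the three components simply add up.
        return self.compute_ms + self.network_rtt_ms + self.queue_ms


# Hypothetical numbers for illustration only.
OPTIONS = [
    Option("A: optimized SLM on edge device", 40, 0, 0),
    Option("B: optimized LLM on edge device", 900, 0, 0),
    Option("C: centralized SLM API (async)", 15, 80, 50),
    Option("D: centralized LLM API (async)", 120, 80, 50),
]


def lowest_latency(options):
    """Return the option with the smallest modeled end-to-end latency."""
    return min(options, key=lambda o: o.total_ms())


if __name__ == "__main__":
    for option in OPTIONS:
        print(f"{option.name}: {option.total_ms():.0f} ms")
    print("Lowest:", lowest_latency(OPTIONS).name)
```

Under these assumed values, option A has the lowest total because it pays neither the network nor the queueing cost and its compute time stays small. Real numbers depend on the model, the hardware, and the network, so the sketch only illustrates why on-device SLM inference tends to win when latency is the priority.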