A company wants to use language models to create an application for inference on edge dev…

Question

A company wants to use language models to create an application for inference on edge devices. The inference must have the lowest latency possible. Which solution will meet these requirements?

Accepted Answer

Correct answer: A. Deploy optimized small language models (SLMs) on edge devices. Small language models are lightweight enough to run directly on edge hardware, so inference happens on the device itself with no network round trip. Large language models (LLMs) need far more memory and compute, which makes them slower or impractical on edge devices where latency is critical.
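To make the resource argument concrete, here is a rough back-of-the-envelope sketch. It assumes that generating each token requires reading every model weight from memory once, so the time per token is bounded below by weight size divided by memory bandwidth. The parameter counts, bit widths, and the 50 GB/s bandwidth figure are hypothetical values chosen for illustration, not measurements of any specific model or device.

```python
def weight_bytes(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the model weights, in bytes."""
    return num_params * bits_per_weight / 8


def min_seconds_per_token(num_params: float, bits_per_weight: int,
                          bandwidth_bytes_per_s: float) -> float:
    """Rough lower bound on decode time, assuming every weight is read once per token."""
    return weight_bytes(num_params, bits_per_weight) / bandwidth_bytes_per_s


# Hypothetical figures for illustration only.
EDGE_BANDWIDTH = 50e9  # assumed edge-device memory bandwidth: 50 GB/s

# Assumed "optimized SLM": 1 billion parameters stored at 4 bits each.
slm = min_seconds_per_token(1e9, 4, EDGE_BANDWIDTH)
# Assumed LLM: 70 billion parameters stored at 16 bits each.
llm = min_seconds_per_token(70e9, 16, EDGE_BANDWIDTH)

print(f"SLM weights: {weight_bytes(1e9, 4) / 1e9:.1f} GB, at least {slm * 1000:.0f} ms per token")
print(f"LLM weights: {weight_bytes(70e9, 16) / 1e9:.1f} GB, at least {llm * 1000:.0f} ms per token")
```

Under these assumptions the small model's weights take about 0.5 GB, while the large model's take about 140 GB, which is more memory than typical edge hardware provides. The same ratio carries over to the per-token latency bound.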

AWS Certified AI Practitioner (AIF-C01) — Question 3

Explanation