AWS Certified Generative AI – Professional (AIP-C01) — Question 3
A company is developing a customer support application that uses Amazon Bedrock foundation models (FMs) to provide real-time AI assistance to the company's employees. The application must display AI-generated responses character by character as the responses are generated. The application needs to support thousands of concurrent users with minimal latency. The responses typically take 15 to 45 seconds to finish.
Which solution will meet these requirements?
Answer options
- A. Configure an Amazon API Gateway WebSocket API with an AWS Lambda integration. Configure the WebSocket API to invoke the Amazon Bedrock InvokeModelWithResponseStream API and stream partial responses through WebSocket connections.
- B. Configure an Amazon API Gateway REST API with an AWS Lambda integration. Configure the REST API to invoke the Amazon Bedrock standard InvokeModel API and implement frontend client-side polling every 100 ms for complete response chunks.
- C. Implement direct frontend client connections to Amazon Bedrock by using IAM user credentials and the InvokeModelWithResponseStream API without any intermediate gateway or proxy layer.
- D. Configure an Amazon API Gateway HTTP API with an AWS Lambda integration. Configure the HTTP API to cache complete responses in an Amazon DynamoDB table and serve the responses through multiple paginated GET requests to frontend clients.
Correct answer: A
Explanation
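- Option A is correct. A WebSocket API keeps a persistent, two-way connection open to each client, so the Lambda function can call InvokeModelWithResponseStream and push each partial response over that connection as soon as Amazon Bedrock emits it. The client can render text incrementally instead of waiting 15 to 45 seconds for the full response, and API Gateway and Lambda scale with the number of concurrent users.
- Option B is incorrect. The standard InvokeModel API returns only after the model has generated the entire response, so there are no partial chunks for the client to poll for. Polling every 100 ms from thousands of clients adds heavy request load without making the text appear any sooner.
- Option C is incorrect. InvokeModelWithResponseStream does stream, but distributing IAM user credentials to frontend clients exposes long-term credentials, and without a gateway or proxy layer there is no central place to authenticate, throttle, or monitor thousands of connections.
- Option D is incorrect. Caching complete responses in DynamoDB and serving them through paginated GET requests is a request-response pattern, not streaming. Nothing reaches the client until the full response has been generated and stored, so it cannot provide a character-by-character display.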
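The flow in option A can be sketched as a Lambda handler that relays streamed chunks to the caller's WebSocket connection. This is a minimal illustration, not a reference implementation: the function names `relay_stream` and `handler`, and the `modelId` and `request` fields in the incoming message, are hypothetical. The sketch forwards each chunk's raw bytes unchanged, on the assumption that the client parses the model-specific JSON itself.

```python
import json


def relay_stream(bedrock_runtime, gateway, connection_id, model_id, request_body):
    """Forward each streamed Bedrock chunk to one WebSocket connection.

    bedrock_runtime: a boto3 "bedrock-runtime" client (or a test double).
    gateway: a boto3 "apigatewaymanagementapi" client (or a test double).
    Returns the number of chunks sent.
    """
    response = bedrock_runtime.invoke_model_with_response_stream(
        modelId=model_id,
        body=json.dumps(request_body),
    )
    sent = 0
    for event in response["body"]:
        chunk = event.get("chunk")
        if chunk is None:
            continue  # skip events that carry no chunk payload
        # Push the partial response to the client as soon as it arrives.
        gateway.post_to_connection(ConnectionId=connection_id, Data=chunk["bytes"])
        sent += 1
    return sent


def handler(event, context):
    # Imported here so relay_stream can be exercised without the AWS SDK installed.
    import boto3

    ctx = event["requestContext"]
    # The management endpoint is derived from the WebSocket API's domain and stage.
    gateway = boto3.client(
        "apigatewaymanagementapi",
        endpoint_url=f"https://{ctx['domainName']}/{ctx['stage']}",
    )
    bedrock_runtime = boto3.client("bedrock-runtime")

    # Hypothetical message shape: {"modelId": "...", "request": {...}}
    payload = json.loads(event["body"])
    sent = relay_stream(
        bedrock_runtime,
        gateway,
        ctx["connectionId"],
        payload["modelId"],
        payload["request"],
    )
    return {"statusCode": 200, "body": json.dumps({"chunksSent": sent})}
```

Passing the clients into `relay_stream` keeps the relay logic separate from AWS setup, so it can be checked with simple stand-in objects. A production handler would likely also handle the case where the client has disconnected (for example, a `GoneException` from `post_to_connection`) and validate the incoming message.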