A financial services company is developing a real-time generative AI (GenAI) assistant to…

Question

A financial services company is developing a real-time generative AI (GenAI) assistant to support human call center agents. The GenAI assistant must transcribe live customer speech, analyze context, and provide incremental suggestions to call center agents while a customer is still speaking. To preserve responsiveness, the GenAI assistant must maintain end-to-end latency under 1 second from speech to initial response display. The architecture must use only managed AWS services and must support bidirectional streaming to ensure that call center agents receive updates in real time.
Which solution will meet these requirements?

Accepted Answer

Correct answer: B. B. Use Amazon Transcribe streaming with partial results enabled to deliver fragments of transcribed text before customers finish speaking. Forward text fragments to Amazon Bedrock by using the InvokeModelWithResponseStream API. Stream responses to call center agents through an Amazon API Gateway WebSocket API. — Option B is the correct answer because it uses Amazon Transcribe streaming with partial results, enabling real-time transcription and immediate feedback to agents. The other options either do not meet the latency requirement, use batch processing which is not suitable for real-time needs, or involve architectures that do not support the necessary bidirectional streaming.

AWS Certified Generative AI – Professional (AIP-C01) — Question 40

Answer options

Correct answer: B

Explanation