AWS Certified Generative AI – Professional (AIP-C01) — Question 19
A financial services company uses an AI application to process financial documents by using Amazon Bedrock. During business hours, the application handles approximately 10,000 requests each hour, which requires consistent throughput.
The company uses the CreateProvisionedModelThroughput API to purchase provisioned throughput. Amazon CloudWatch metrics show that the provisioned capacity is unused while on-demand requests are being throttled. The company finds the following code in the application:
`response = bedrock_runtime.invoke_model(modelId="anthropic.claude-v2", body=json.dumps(payload))`
The company needs the application to use the provisioned throughput and to resolve the throttling issues.
Which solution will meet these requirements?
Answer options
- A. Increase the number of model units (MUs) in the provisioned throughput configuration.
- B. Replace the model ID parameter with the ARN of the provisioned model that the CreateProvisionedModelThroughput API returns.
- C. Add exponential backoff retry logic to handle throttling exceptions during peak hours.
- D. Modify the application to use the InvokeModelWithResponseStream API instead of the InvokeModel API.
Correct answer: B
Explanation
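The correct answer is B. Amazon Bedrock routes a request to provisioned throughput only when the modelId parameter contains the ARN (or name) of the provisioned model. The application passes the base model ID, "anthropic.claude-v2", so every request is served from on-demand capacity. That is why the purchased capacity sits unused while on-demand requests are throttled. Replacing the model ID with the provisioned model ARN sends the traffic to the purchased capacity.
Option A is incorrect because adding model units only enlarges capacity that the application never calls. Option C is incorrect because exponential backoff retries the throttled on-demand requests but still leaves the provisioned throughput unused. Option D is incorrect because InvokeModelWithResponseStream changes how the response is delivered, not which capacity serves the request.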
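The fix in option B can be sketched as follows. This is a minimal illustration, not the company's actual code: the helper name `invoke_with_provisioned_throughput` and its ARN check are hypothetical, and the client is passed in so that the function works with any object that exposes `invoke_model` (for example, a boto3 `bedrock-runtime` client).

```python
import json


def invoke_with_provisioned_throughput(bedrock_runtime, provisioned_model_arn, payload):
    """Call InvokeModel so that the request is served by provisioned capacity.

    Hypothetical helper for illustration. `bedrock_runtime` is assumed to be
    a boto3 "bedrock-runtime" client or any object with the same
    invoke_model(modelId=..., body=...) method.
    """
    # Illustrative guard (an assumption, not part of the Bedrock API): a base
    # model ID such as "anthropic.claude-v2" would be served on demand, so
    # reject anything that does not look like a provisioned model ARN.
    if ":provisioned-model/" not in provisioned_model_arn:
        raise ValueError(
            "modelId must be a provisioned model ARN, got: "
            + repr(provisioned_model_arn)
        )

    # The only change from the original code: modelId carries the provisioned
    # model ARN instead of the base model ID.
    return bedrock_runtime.invoke_model(
        modelId=provisioned_model_arn,
        body=json.dumps(payload),
    )
```

In a real application, the ARN would typically come from the `provisionedModelArn` field of the CreateProvisionedModelThroughput response and be stored in configuration, rather than hard-coded next to the call.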