Question

A company has built a solution by using generative AI. The solution uses large language models (LLMs) to translate training manuals from English into other languages. The company wants to evaluate the accuracy of the solution by examining the text generated for the manuals.
Which model evaluation strategy meets these requirements?

Accepted Answer

Correct answer: A. Bilingual Evaluation Understudy (BLEU). BLEU was designed to evaluate machine translation: it scores generated text by its n-gram overlap with one or more human reference translations, which makes it the most suitable choice for this scenario. The other options target different tasks: RMSE measures error in numeric predictions (regression), ROUGE is used mainly to evaluate summarization, and F1 score combines precision and recall for classification.
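To make the idea of n-gram overlap concrete, here is a minimal sketch of a sentence-level BLEU score in plain Python. It is an illustration under simplifying assumptions, not the reference implementation: it uses whitespace tokenization, a single reference translation, and no smoothing. The function names `ngrams` and `bleu` and the example sentences are hypothetical, chosen only for this example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against a single reference, without smoothing."""
    cand = candidate.split()  # assumption: simple whitespace tokenization
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clipped counts: a candidate n-gram is credited at most as many
        # times as it appears in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)


reference = "press the red button to stop the machine"
candidate = "press the red button to halt the machine"
print(round(bleu(candidate, reference), 3))
```

A candidate identical to the reference scores 1.0, and a candidate that shares no words with it scores 0.0. The example pair above differs by one word and lands at roughly 0.59. In practice an established library would normally be used instead of hand-rolled code, typically at the corpus level and with multiple references where they are available.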

AWS Certified AI Practitioner (AIF-C01) — Question 82
