AWS Certified AI Practitioner (AIF-C01) — Question 253

A company uses Amazon SageMaker AI to generate article summaries in multiple languages. The company needs a metric to evaluate the quality of the summary translations in multiple languages.

Which evaluation metric will meet these requirements?

Answer options

A. ROUGE
B. Bilingual evaluation understudy (BLEU)
C. AUC
D. Precision

Correct answer: B

Explanation

The correct answer is B, Bilingual Evaluation Understudy (BLEU). BLEU scores machine-translated text by measuring its n-gram overlap with one or more reference translations, so it directly targets translation quality and works for any language pair that has reference translations. ROUGE (A) also measures n-gram overlap, but it is recall-oriented and is mainly used to evaluate summarization rather than translation. AUC (C) and Precision (D) evaluate classification models and are not appropriate for evaluating translation quality.
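
To make the idea of n-gram overlap concrete, the following is a minimal sketch of a sentence-level BLEU score in Python. It is a simplified illustration, not the implementation used by SageMaker or any particular library: it assumes whitespace tokenization, a single reference, unigrams and bigrams only, and no smoothing. The function names `ngrams` and `bleu` and the example sentences are made up for this sketch.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of clipped n-gram precisions
    (n = 1..max_n) multiplied by a brevity penalty.
    Assumes one reference and whitespace tokenization (illustrative only)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # no smoothing: any zero precision gives a zero score
        log_precisions.append(math.log(overlap / total))

    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) >= len(ref):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref) / len(cand))

    return bp * math.exp(sum(log_precisions) / max_n)


reference = "the cat sits on the mat"
print(bleu("the cat sits on the mat", reference))  # identical translation
print(bleu("the cat is on the mat", reference))    # one word differs
print(bleu("a dog", reference))                    # no overlap
```

A candidate that matches the reference exactly scores 1.0, a candidate with no shared words scores 0.0, and partial matches fall in between. Production evaluations typically use up to 4-grams, corpus-level aggregation, and smoothing, which this sketch omits.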