A company wants to use automatic speech recognition (ASR) to transcribe messages that are…

Question

A company wants to use automatic speech recognition (ASR) to transcribe messages that are less than 60 seconds long from a voicemail-style application. The company requires the correct identification of 200 unique product names, some of which have unique spellings or pronunciations.
The company has 4,000 words of Amazon SageMaker Ground Truth voicemail transcripts it can use to customize the chosen ASR model. The company needs to ensure that everyone can update their customizations multiple times each hour.
Which approach will maximize transcription accuracy during the development phase?

Accepted Answer

Correct answer: C. Create a custom vocabulary file containing each product name with phonetic pronunciations, and use it with Amazon Transcribe to perform the ASR customization. Analyze the transcripts and manually update the custom vocabulary file to include updated or additional entries for those names that are not being correctly identified.

Option C is correct because a custom vocabulary with phonetic pronunciations directly targets the product names that have unusual spellings or pronunciations. The vocabulary file can also be edited and reapplied quickly, which fits the requirement to update customizations multiple times each hour. Options A and B do not address the phonetic detail needed to transcribe those names accurately. Option D builds a custom language model, a heavier process that is poorly suited to the development phase: 4,000 words is a small training corpus, and retraining a model is slower than editing a vocabulary file.
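As a rough illustration of the workflow in option C, the sketch below builds a tab-separated custom vocabulary table and, in a separate function, pushes it to Amazon Transcribe with boto3. The product names, pronunciations, bucket, key, and vocabulary name are all hypothetical. The column layout (Phrase, SoundsLike, IPA, DisplayAs) is assumed from the Amazon Transcribe table format, and support for the pronunciation columns should be checked against the current documentation before relying on it.

```python
# Column order assumed from the Amazon Transcribe custom vocabulary table format.
HEADER = ["Phrase", "SoundsLike", "IPA", "DisplayAs"]


def build_vocabulary_table(entries):
    """Return the vocabulary table as tab-separated text, one entry per row."""
    lines = ["\t".join(HEADER)]
    for entry in entries:
        lines.append("\t".join([
            entry["phrase"],                # term to recognize (hyphens, no spaces)
            entry.get("sounds_like", ""),   # hyphen-separated pronunciation hint
            entry.get("ipa", ""),           # left blank when SoundsLike is used
            entry.get("display_as", ""),    # spelling to show in the transcript
        ]))
    return "\n".join(lines) + "\n"


def push_vocabulary(table_text, bucket, key, vocabulary_name, create=False):
    """Upload the table to S3 and create or update the custom vocabulary.

    Requires boto3 and AWS credentials; bucket, key, and vocabulary_name
    are placeholders chosen by the caller.
    """
    import boto3  # imported here so the table builder works without boto3

    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=table_text.encode("utf-8")
    )
    transcribe = boto3.client("transcribe")
    call = transcribe.create_vocabulary if create else transcribe.update_vocabulary
    return call(
        VocabularyName=vocabulary_name,
        LanguageCode="en-US",
        VocabularyFileUri=f"s3://{bucket}/{key}",
    )


# Hypothetical product names; real entries would come from reviewing transcripts.
entries = [
    {"phrase": "Zyphora", "sounds_like": "zy-for-ah", "display_as": "Zyphora"},
    {"phrase": "Quik-Lynx", "sounds_like": "quick-links", "display_as": "QuikLynx"},
]
table = build_vocabulary_table(entries)
```

During development, the loop would be: transcribe a batch of messages, find the product names that were missed, add or adjust their entries, then call `push_vocabulary(table, ...)` again. Because only a small text file changes, this cycle can be repeated many times an hour.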

AWS Certified Machine Learning – Specialty — Question 168
