AWS Certified Machine Learning – Specialty — Question 321

A global company receives and processes hundreds of documents daily. The documents are in printed .pdf format or .jpg format.

A machine learning (ML) specialist wants to build an automated document processing workflow to extract text from specific fields from the documents and to classify the documents. The ML specialist wants a solution that requires low maintenance.

Which solution will meet these requirements with the LEAST operational effort?

Answer options

Correct answer: D

Explanation

Amazon Textract is a fully managed service that extracts text and structured data, such as form fields and tables, from scanned documents in PDF and JPEG format. Using it avoids the operational overhead of hosting and maintaining a custom OCR model such as PaddleOCR on Amazon SageMaker. Amazon Comprehend is a managed natural language processing service that can classify the extracted text, whereas Amazon Rekognition is built for computer vision tasks on images and videos, not for text classification. Combining Textract and Comprehend therefore gives a fully managed, low-maintenance workflow with the least operational effort.
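
To make the Textract plus Comprehend combination concrete, here is a minimal sketch of how the two calls might be chained. It is an illustration under stated assumptions, not a reference implementation: the helper names (extract_lines, process_document), the bucket, the object key, and the endpoint ARN are all hypothetical, and it assumes a Comprehend custom classifier has already been trained and deployed behind a real-time endpoint. The clients are passed in as arguments so the logic can be exercised without AWS credentials.

```python
from typing import Any, Dict, List, Optional


def extract_lines(textract_response: Dict[str, Any]) -> List[str]:
    """Collect the text of every LINE block in a Textract response."""
    return [
        block["Text"]
        for block in textract_response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def process_document(
    textract: Any,
    comprehend: Any,
    bucket: str,          # hypothetical S3 bucket holding the document
    key: str,             # hypothetical object key (.pdf or .jpg)
    endpoint_arn: str,    # assumed: ARN of a deployed custom classifier endpoint
) -> Dict[str, Optional[str]]:
    """Extract text with Textract, then classify it with Comprehend."""
    # FORMS asks Textract to also return key-value (form field) blocks;
    # this sketch only uses the LINE blocks as input for classification.
    response = textract.analyze_document(
        Document={"S3Object": {"Bucket": bucket, "Name": key}},
        FeatureTypes=["FORMS"],
    )
    text = "\n".join(extract_lines(response))

    # classify_document returns a list of candidate classes with scores.
    result = comprehend.classify_document(Text=text, EndpointArn=endpoint_arn)
    classes = result.get("Classes", [])
    top = max(classes, key=lambda c: c["Score"]) if classes else None

    return {"text": text, "label": top["Name"] if top else None}
```

In practice the two clients would come from boto3.client("textract") and boto3.client("comprehend"). The synchronous analyze_document call suits single-page inputs; multi-page PDFs would likely need Textract's asynchronous API instead, which this sketch does not cover.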