AWS Certified Machine Learning – Specialty — Question 328

A bank has collected customer data for 10 years in CSV format. The bank stores the data in an on-premises server. A data science team wants to use Amazon SageMaker to build and train a machine learning (ML) model to predict churn probability. The team will use the historical data. The data scientists want to perform data transformations quickly and to generate data insights before the team builds a model for production.

Which solution will meet these requirements with the LEAST development effort?

Answer options

Correct answer: B

Explanation

SageMaker Data Wrangler cannot import a large dataset directly from a local or on-premises file system, so the CSV files must first be placed in a supported storage service such as Amazon S3. Option B is correct because Data Wrangler provides built-in tools for data transformation, visualization, and insight generation. This removes the need for additional development work, such as creating Amazon QuickSight dashboards or writing custom visualization code in SageMaker Studio notebooks.
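As an illustration of the "place the data in Amazon S3 first" step, the following is a minimal sketch and not part of the exam answer. It assumes the CSV files sit in a local directory and that boto3 is installed and configured with credentials. The function names plan_uploads and upload_csvs, and any bucket name or prefix passed to them, are hypothetical. For ten years of data, a managed transfer service may be a better fit than a script like this.

```python
from pathlib import Path


def plan_uploads(local_dir, prefix):
    """Map each CSV file under local_dir to an S3 object key under prefix."""
    root = Path(local_dir)
    clean_prefix = prefix.strip("/")
    plan = []
    for path in sorted(root.rglob("*.csv")):
        relative = path.relative_to(root).as_posix()
        key = f"{clean_prefix}/{relative}" if clean_prefix else relative
        plan.append((path, key))
    return plan


def upload_csvs(local_dir, bucket, prefix, s3_client=None):
    """Upload every planned CSV and return the resulting S3 URIs.

    The returned URIs are the locations a Data Wrangler flow could then
    import from. bucket and prefix are placeholders chosen by the caller.
    """
    if s3_client is None:
        # Imported lazily so the key planning above works without boto3.
        import boto3
        s3_client = boto3.client("s3")

    uris = []
    for path, key in plan_uploads(local_dir, prefix):
        # upload_file(Filename, Bucket, Key) handles multipart uploads
        # for large files automatically.
        s3_client.upload_file(str(path), bucket, key)
        uris.append(f"s3://{bucket}/{key}")
    return uris
```

Calling upload_csvs("/data/customers", "example-bank-ml", "churn/raw") would, under these assumptions, copy each CSV while preserving its relative folder layout. The transformations and insights themselves are then produced in the Data Wrangler interface without further code.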