AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 177
A company needs to ingest data from data sources into Amazon SageMaker Data Wrangler. The data sources are Amazon S3, Amazon Redshift, and Snowflake. The ingested data must always be up to date with the latest changes in the source systems.
Which solution will meet these requirements?
Answer options
- A. Use direct connections to import data from the data sources into Data Wrangler.
- B. Use cataloged connections to import data from the data sources into Data Wrangler.
- C. Use AWS Glue to extract data from the data sources. Use AWS Glue also to import the data directly into Data Wrangler.
- D. Use AWS Lambda to extract data from the data sources. Use Lambda also to import the data directly into Data Wrangler.
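Correct answer: A
Explanation
The correct answer is A. Data Wrangler supports two kinds of connections: direct and cataloged. With a direct connection, Data Wrangler reads from the data source itself, so each import reflects the most recent data in that source. Amazon S3, Amazon Redshift, and Snowflake can all be used as direct connections. Option B is wrong because a cataloged connection reads the result of a data transfer (for example, data from a SaaS platform that was landed in Amazon S3 and registered in the AWS Glue Data Catalog). That copy is only as current as the last transfer, so it does not necessarily contain the latest changes. Options C and D are wrong because Data Wrangler imports data through its own connections, not from an AWS Glue job or a Lambda function that pushes data into it, and an extracted copy would go stale between runs while adding a pipeline that the requirement does not need.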