AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 191
A company stores user clickstream data in an Amazon S3 bucket in AWS Account A. The company needs to use the data to train an ML model in Amazon SageMaker AI in AWS Account B. The training will take 10 days.
The company needs to use only private IP addresses in the training. The company also must make sure that no training metadata is shared with AWS.
Which solution will meet these requirements?
Answer options
- A. Set up VPC peering between Account A and Account B. Contact AWS by email to opt out of metadata collection.
- B. Set up a VPC endpoint for the S3 bucket. Set the SageMaker AI OPT_OUT_TRACKING environment variable to 1 in the training job.
- C. Configure a security group policy that is assigned to the S3 bucket in Account A to allow access from only Account B. Create AI services opt-out policies.
- D. Generate presigned URLs with expiration times for the objects that are stored in the S3 bucket. Access the data by using the presigned URLs. Set the SageMaker AI OPT_OUT_TRACKING environment variable to 1 in the training job.
Correct answer: B
Explanation
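Option B is correct because it addresses both requirements. A VPC endpoint for Amazon S3 lets a training job that runs in a VPC in Account B reach the bucket in Account A over the AWS network without public IP addresses, provided that the bucket policy in Account A grants access to the training job's execution role. Setting the OPT_OUT_TRACKING environment variable to 1 in the training job opts the job out of training metadata collection.
Option A fails on both counts: VPC peering connects two VPCs, but an S3 bucket does not reside in a VPC, so peering does not provide private access to the data, and emailing AWS is not an opt-out mechanism. Option C is invalid because security groups attach to network interfaces, not to S3 buckets, and AI services opt-out policies are AWS Organizations policies that govern content used by AWS AI services, not a setting on a SageMaker AI training job. Option D includes the correct opt-out setting, but presigned URLs do not keep traffic on private IP addresses, and a presigned URL is valid for at most 7 days, which is shorter than the 10-day training run.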
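As a rough illustration of option B, the sketch below assembles the request for the SageMaker `CreateTrainingJob` API with a `VpcConfig` block, so the job runs in subnets that route to S3 through a VPC endpoint, and with `OPT_OUT_TRACKING` set in `Environment`. The helper name `build_training_request` and every ARN, URI, subnet ID, and security group ID are hypothetical placeholders. The variable name `OPT_OUT_TRACKING` is taken from the question itself. The sketch assumes that the S3 endpoint and the cross-account bucket policy already exist.

```python
# Sketch only: the helper name and all ARNs, IDs, and URIs passed to it are
# hypothetical placeholders, not values from the question.
TEN_DAYS_IN_SECONDS = 10 * 24 * 60 * 60  # the 10-day training run


def build_training_request(job_name, role_arn, image_uri, train_s3_uri,
                           output_s3_uri, subnet_ids, security_group_ids):
    """Return keyword arguments for the SageMaker CreateTrainingJob API."""
    return {
        "TrainingJobName": job_name,
        "RoleArn": role_arn,  # execution role in Account B
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,  # bucket owned by Account A
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m5.xlarge",  # placeholder instance type
            "InstanceCount": 1,
            "VolumeSizeInGB": 50,
        },
        # A runtime this long may require a service quota increase.
        "StoppingCondition": {"MaxRuntimeInSeconds": TEN_DAYS_IN_SECONDS},
        # Run the job inside the VPC whose subnets route to S3 through
        # the VPC endpoint, so data access stays on private IP addresses.
        "VpcConfig": {
            "Subnets": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
        },
        # Opt-out variable named in option B; values must be strings.
        "Environment": {"OPT_OUT_TRACKING": "1"},
    }
```

The returned dictionary could then be passed to `boto3.client("sagemaker").create_training_job(**request)`. Building the request separately keeps the sketch runnable without AWS credentials.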