AWS Certified Machine Learning – Specialty — Question 226

An online retail company wants to develop a natural language processing (NLP) model to improve customer service. A machine learning (ML) specialist is setting up distributed training of a Bidirectional Encoder Representations from Transformers (BERT) model on Amazon SageMaker. SageMaker will use eight compute instances for the distributed training.

The ML specialist wants to ensure the security of the data during the distributed training. The data is stored in an Amazon S3 bucket.

Which combination of steps should the ML specialist take to protect the data during the distributed training? (Choose three.)

Answer options

Correct answer: A, C, D

Explanation

Option A is correct because running the distributed training job in a private VPC with inter-container traffic encryption enabled protects the data that the eight training instances exchange with one another. Option C is correct because an S3 VPC endpoint, together with the associated access policies, lets the training instances reach the S3 bucket without the traffic traversing the public internet. Option D is correct because an IAM role that grants read-only access to the bucket follows the principle of least privilege and prevents the training job from modifying or deleting the source data. Options B, E, and F do not address the security of the data during distributed training.
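The three controls in the explanation can be sketched as plain Python dictionaries. This is a minimal illustration, not the exam's reference solution. The first function builds arguments in the shape accepted by the SageMaker CreateTrainingJob API, where `VpcConfig` and `EnableInterContainerTrafficEncryption` are real request fields. The other two build policy documents for the IAM role and the S3 VPC endpoint. All function names, bucket names, ARNs, subnet IDs, the instance type, and the image URI are hypothetical placeholders.

```python
def build_training_job_request(job_name, role_arn, image_uri,
                               input_bucket, output_s3_uri,
                               subnets, security_group_ids):
    """Build CreateTrainingJob arguments for an 8-instance job in a private VPC."""
    return {
        "TrainingJobName": job_name,
        "RoleArn": role_arn,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": f"s3://{input_bucket}/train/",
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.p3.2xlarge",  # assumed instance type
            "InstanceCount": 8,               # eight instances, as in the question
            "VolumeSizeInGB": 100,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 86400},
        # Run the training containers inside private subnets of a VPC.
        "VpcConfig": {"Subnets": subnets, "SecurityGroupIds": security_group_ids},
        # Encrypt traffic exchanged between the training instances.
        "EnableInterContainerTrafficEncryption": True,
    }


def build_read_only_role_policy(bucket):
    """IAM policy for the execution role: read-only access to the data bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["s3:GetObject"],
             "Resource": f"arn:aws:s3:::{bucket}/*"},
            {"Effect": "Allow", "Action": ["s3:ListBucket"],
             "Resource": f"arn:aws:s3:::{bucket}"},
        ],
    }


def build_s3_endpoint_policy(bucket):
    """S3 VPC endpoint policy: allow only reads of the data bucket through the endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
        }],
    }


# Hypothetical placeholder values, for illustration only.
request = build_training_job_request(
    job_name="bert-distributed-training",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-bert:latest",
    input_bucket="example-training-data",
    output_s3_uri="s3://example-model-artifacts/output/",
    subnets=["subnet-0example1", "subnet-0example2"],
    security_group_ids=["sg-0example"],
)
# With boto3 installed and credentials configured, the request could be sent with:
#   boto3.client("sagemaker").create_training_job(**request)
```

The sketch writes model artifacts to a separate output location, since a role that is read-only on the data bucket cannot write there. In a real setup the role would need write access to that output location only, and the endpoint policy above would also have to permit those writes.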