AWS Certified Machine Learning – Specialty — Question 205

A company is building an application that can predict spam email messages based on email text. The company can generate a few thousand human-labeled datasets that contain a list of email messages and a label of "spam" or "not spam" for each email message. A machine learning (ML) specialist wants to use transfer learning with a Bidirectional Encoder Representations from Transformers (BERT) model that is trained on English Wikipedia text data.

What should the ML specialist do to initialize the model to fine-tune the model with the custom data?

Answer options

Correct answer: D

Explanation

The correct answer is D because initializing the model with pretrained weights in all layers allows it to leverage the knowledge gained from the large dataset it was trained on, making it more effective when fine-tuning for the specific task. Replacing the last fully connected layer with a classifier is necessary to adapt the model for the new output labels. Options A and B do not fully utilize the pretrained model's capabilities, and option C starting with random weights would not be effective for transfer learning.