Google Cloud Professional Data Engineer — Question 326

You work for an advertising company, and you've developed a Spark ML model to predict click-through rates at advertisement blocks. You've been developing everything at your on-premises data center, and now your company is migrating to Google Cloud. Your data center will be closing soon, so a rapid lift-and-shift migration is necessary. However, the data you've been using will be migrated to migrated to BigQuery. You periodically retrain your Spark ML models, so you need to migrate existing training pipelines to Google Cloud. What should you do?

Answer options

Correct answer: C

Explanation

The correct answer, C, allows you to use Dataproc, which is designed for running Spark jobs on Google Cloud, and enables you to access data directly from BigQuery efficiently. Option A is incorrect because Vertex AI is not specifically designed for Spark ML models. Option B requires a complete rewrite of models, which is not necessary. Option D involves more overhead by spinning up a Spark cluster on Compute Engine, which is less efficient than using Dataproc.