Google Cloud Professional Data Engineer — Question 204

You have thousands of Apache Spark jobs running in your on-premises Apache Hadoop cluster. You want to migrate the jobs to Google Cloud. You want to use managed services to run your jobs instead of maintaining a long-lived Hadoop cluster yourself. You have a tight timeline and want to keep code changes to a minimum. What should you do?

Answer options

Correct answer: D

Explanation

The correct answer is D because moving data to Cloud Storage and running jobs on Dataproc allows for a seamless migration of Spark jobs with minimal code changes. Option A involves significant changes to the codebase by converting to SQL, while B requires rewriting jobs in Apache Beam, which is not ideal for a tight timeline. Option C does not utilize managed services and requires more maintenance.