Google Cloud Professional Data Engineer — Question 45

Your company is migrating its 30-node Apache Hadoop cluster to the cloud. The company wants to re-use Hadoop jobs it has already created and minimize the management of the cluster as much as possible. It also wants to be able to persist data beyond the life of the cluster. What should you do?

Answer options

Correct answer: D

Explanation

The correct answer is D. Cloud Dataproc is a managed Hadoop and Spark service, so existing Hadoop jobs can be reused and most cluster administration is handled for you. The Google Cloud Storage connector lets those jobs read and write gs:// paths in place of hdfs:// paths, so the data lives in Cloud Storage and persists after the cluster is terminated. Options A and C do not meet the requirements of minimizing management and persisting data beyond the cluster. Option B does not use the Google Cloud Storage connector. Option E uses Local SSDs, whose contents do not outlive the instances they are attached to, so they do not provide the persistence that Google Cloud Storage does.
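
To make the job-reuse point concrete: with the Cloud Storage connector, migrating a job's inputs and outputs is often mostly a matter of pointing it at gs:// URIs instead of hdfs:// URIs. The following is a minimal sketch of that path mapping, not part of any Google library. The function name to_gcs_uri and the bucket name example-bucket are hypothetical, and the sketch assumes the data has already been copied into the bucket under the same directory layout it had in HDFS.

```python
from urllib.parse import urlparse


def to_gcs_uri(hdfs_uri: str, bucket: str) -> str:
    """Map an hdfs:// URI (or a bare absolute path) to a gs:// URI.

    Hypothetical helper: assumes the HDFS directory layout was copied
    unchanged into `bucket`.
    """
    parsed = urlparse(hdfs_uri)
    # Accept "hdfs://namenode:8020/path" or a scheme-less "/path".
    if parsed.scheme not in ("", "hdfs"):
        raise ValueError(f"not an HDFS path: {hdfs_uri!r}")
    # Drop the namenode authority and the leading slash; keep the rest.
    path = parsed.path.lstrip("/")
    return f"gs://{bucket}/{path}"


if __name__ == "__main__":
    # "example-bucket" is a placeholder bucket name.
    print(to_gcs_uri("hdfs://namenode:8020/data/logs/2019", "example-bucket"))
```

A job that previously read from the HDFS path would then be given the returned gs:// URI as its input or output argument, and the data would remain in the bucket after the Dataproc cluster is deleted.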