Google Cloud Professional Data Engineer — Question 280

Your company currently runs a large on-premises cluster using Spark, Hive, and HDFS in a colocation facility. The cluster is designed to accommodate peak usage on the system; however, many jobs are batch in nature, and usage of the cluster fluctuates quite dramatically. Your company is eager to move to the cloud to reduce the overhead associated with on-premises infrastructure and maintenance and to benefit from the cost savings. They are also hoping to modernize their existing infrastructure to use more serverless offerings in order to take advantage of the cloud. Because of the timing of their contract renewal with the colocation facility, they have only 2 months for their initial migration. How would you recommend they approach their upcoming migration strategy so they can maximize their cost savings in the cloud while still executing the migration in time?

Answer options

Correct answer: B

Explanation

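The correct choice is B. Dataproc runs the existing Spark and Hive jobs with minimal changes, so the migration fits within the two-month window, and moving the data from HDFS to Cloud Storage decouples storage from compute. With the data in Cloud Storage, clusters can be sized per job and deleted when idle instead of being provisioned for peak load, which suits the fluctuating batch workload and is the main source of the cost savings. Option A is less optimal because it continues to rely on HDFS: a cluster that stores its data in HDFS has to keep running to retain that data, so it may not provide the same cost savings as Cloud Storage. Options C and D involve more complex migration and modernization work that may not be feasible within the two-month timeframe.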
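
As a rough illustration of why moving to Dataproc with Cloud Storage needs few job changes: Dataproc ships with the Cloud Storage connector, so Spark and Hive jobs can usually read `gs://` paths where they previously read `hdfs://` paths. The sketch below is a minimal helper, not an official tool, that rewrites HDFS URIs to Cloud Storage URIs. The function name `hdfs_to_gcs`, the bucket name `example-datalake`, and the assumption that the HDFS directory layout is copied unchanged into one bucket are all hypothetical.

```python
from urllib.parse import urlparse


def hdfs_to_gcs(uri: str, bucket: str) -> str:
    """Map an hdfs:// URI (or a bare absolute path) to gs://<bucket>/<path>.

    Assumes the HDFS directory layout was copied as-is into a single
    bucket; the bucket name is supplied by the caller.
    """
    parsed = urlparse(uri)
    # Accept hdfs://namenode:port/path, hdfs:///path, and bare /path.
    if parsed.scheme not in ("hdfs", ""):
        raise ValueError(f"not an HDFS path: {uri}")
    # Drop the namenode host/port and the leading slash; keep the path.
    path = parsed.path.lstrip("/")
    return f"gs://{bucket}/{path}"


if __name__ == "__main__":
    # Hypothetical paths and bucket, for illustration only.
    for old in (
        "hdfs://namenode:8020/warehouse/sales/part-0000.parquet",
        "hdfs:///data/logs",
        "/data/logs",
    ):
        print(old, "->", hdfs_to_gcs(old, "example-datalake"))
```

In practice the same substitution is often made directly in job arguments or Hive table locations; the point is that the job logic itself stays the same, which is what keeps option B within the two-month timeline.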