You are planning to migrate your current on-premises Apache Hadoop deployment to the clou…

Question

You are planning to migrate your current on-premises Apache Hadoop deployment to the cloud. You need to ensure that the deployment is as fault-tolerant and cost-effective as possible for long-running batch jobs. You want to use a managed service. What should you do?

Accepted Answer

Correct answer: A. A. Deploy a Dataproc cluster. Use a standard persistent disk and 50% preemptible workers. Store data in Cloud Storage, and change references in scripts from hdfs:// to gs:// — Option A is correct because deploying a Dataproc cluster with standard persistent disks and 50% preemptible workers optimizes for cost-effectiveness and fault tolerance in cloud environments. Options B and C do not offer the same balance of cost and fault tolerance, while option D uses HDFS, which is not as cost-efficient or suitable for cloud environments compared to Cloud Storage.

Google Cloud Professional Data Engineer — Question 98

Answer options

Correct answer: A

Explanation