Google Cloud Professional Data Engineer — Question 321

You have several Spark jobs that run on a Cloud Dataproc cluster on a schedule. Some of the jobs run in sequence, and some of the jobs run concurrently. You need to automate this process. What should you do?

Answer options

Correct answer: C

Explanation

The correct answer is C. A Directed Acyclic Graph (DAG) in Cloud Composer, Google Cloud's managed Apache Airflow service, runs on a schedule and declares the dependencies between jobs explicitly, so some jobs can run in sequence while independent jobs run concurrently. Option A is useful, but it does not provide the same level of automation for diverse job scheduling as a DAG. Option B is not appropriate because initialization actions are designed for configuring the cluster, not for scheduling jobs. Option D involves more manual steps and does not leverage a managed orchestration service like Cloud Composer.
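To make the idea of a DAG concrete, here is a minimal sketch in plain Python of how a dependency graph determines which jobs run in sequence and which can run concurrently. It is not Cloud Composer or Airflow code: it uses only the standard library's `graphlib` module (Python 3.9+), and the job names and the `execution_waves` helper are hypothetical, chosen for illustration. In Cloud Composer the same structure would be declared as tasks and dependencies in an Airflow DAG file, and the scheduler would handle the ordering.

```python
from graphlib import TopologicalSorter

# Hypothetical Spark job names. Each key maps to the set of jobs
# that must finish before it can start.
dependencies = {
    "ingest": set(),
    "clean_a": {"ingest"},
    "clean_b": {"ingest"},
    "report": {"clean_a", "clean_b"},
}


def execution_waves(deps):
    """Group jobs into waves.

    Jobs in the same wave have no unmet dependencies, so they could be
    submitted concurrently. Waves themselves run one after another.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)  # mark the whole wave as finished
    return waves


print(execution_waves(dependencies))
# prints [['ingest'], ['clean_a', 'clean_b'], ['report']]
```

In this sketch `clean_a` and `clean_b` land in the same wave because both depend only on `ingest`, while `report` waits for both. That is the mix of sequential and concurrent execution the question describes.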