Google Cloud Professional Data Engineer — Question 38

You are implementing several batch jobs that must be executed on a schedule. These jobs have many interdependent steps that must be executed in a specific order. Portions of the jobs involve executing shell scripts, running Hadoop jobs, and running queries in BigQuery. The jobs are expected to run for many minutes up to several hours. If the steps fail, they must be retried a fixed number of times. Which service should you use to manage the execution of these jobs?

Answer options

Correct answer: D

Explanation

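Cloud Composer, Google Cloud's managed Apache Airflow service, is built to orchestrate multi-step workflows. A workflow is defined as a directed acyclic graph (DAG) of tasks, so steps run on a schedule in the required order, and each task can be configured with a fixed number of retries. Airflow operators can run shell commands, submit Hadoop jobs (for example, to Dataproc), and run BigQuery queries, which covers every step type in this scenario. Cloud Scheduler triggers jobs at set times but does not manage dependencies between steps. Cloud Dataflow is a stream and batch data processing service, not a job orchestrator. Cloud Functions runs short-lived, single-purpose functions and is not suited to sequencing interdependent steps that run for hours.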
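
To make the two requirements in the question concrete (steps run in dependency order, and a failed step is retried a fixed number of times), here is a minimal standard-library Python sketch of what an orchestrator does. It is an illustration only and does not use the Cloud Composer or Airflow API; in Composer the same ideas would be expressed as an Airflow DAG with a retry count set on the tasks. The names `run_workflow`, `steps`, `depends_on`, `max_retries`, and the three step names are hypothetical, and the steps are stand-in functions rather than real shell, Hadoop, or BigQuery calls.

```python
from graphlib import TopologicalSorter  # Python 3.9+
from typing import Callable, Dict, List, Set


def run_workflow(
    steps: Dict[str, Callable[[], None]],
    depends_on: Dict[str, Set[str]],
    max_retries: int = 2,
) -> List[str]:
    """Run each step after its dependencies, retrying failures a fixed number of times."""
    completed: List[str] = []
    # static_order() yields each step only after all of its predecessors.
    for name in TopologicalSorter(depends_on).static_order():
        for attempt in range(max_retries + 1):
            try:
                steps[name]()
                break  # step succeeded, stop retrying
            except Exception:
                if attempt == max_retries:
                    raise  # retries exhausted: fail the whole workflow
        completed.append(name)
    return completed


# Hypothetical stand-ins for the three kinds of step in the question.
attempts = {"hadoop_job": 0}


def flaky_hadoop_job() -> None:
    """Fail on the first two calls, then succeed, to exercise the retry path."""
    attempts["hadoop_job"] += 1
    if attempts["hadoop_job"] < 3:
        raise RuntimeError("transient failure")


steps = {
    "shell_script": lambda: None,
    "hadoop_job": flaky_hadoop_job,
    "bigquery_query": lambda: None,
}
# Each key maps to the set of steps that must finish before it starts.
depends_on = {
    "shell_script": set(),
    "hadoop_job": {"shell_script"},
    "bigquery_query": {"hadoop_job"},
}

order = run_workflow(steps, depends_on, max_retries=2)
print(order)
```

With `max_retries=2` each step gets up to three attempts, so the stand-in Hadoop step succeeds on its last permitted attempt. A scheduler alone, such as Cloud Scheduler, would cover only the trigger time; the ordering and retry logic shown here is the part an orchestrator like Cloud Composer provides as a managed service.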