Google Cloud Associate Data Practitioner — Question 13

You have a Dataproc cluster that performs batch processing on data stored in Cloud Storage. You need to schedule a daily Spark job to generate a report that will be emailed to stakeholders. You need a fully-managed solution that is easy to implement and minimizes complexity. What should you do?

Answer options

Correct answer: B

Explanation

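The correct answer is B because Dataproc workflow templates are a managed, Dataproc-native way to define and run Spark jobs as a reusable workflow, with no separate orchestration environment to operate. Note that a workflow template has no scheduler or email feature of its own: each daily run is started by instantiating the template from a trigger such as Cloud Scheduler, and the report email has to be sent by the job or by a step in the workflow. Option A uses Cloud Composer, which can orchestrate the job but adds the overhead of running and maintaining an Airflow environment for a single daily job. Options C and D rely on Cloud Run functions, which are general-purpose compute rather than a job orchestrator, so they need custom code to submit and track the Spark job, work that workflow templates already handle for Dataproc.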
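
To make the workflow template idea concrete, here is a minimal sketch in Python that only builds a dictionary shaped like the Dataproc WorkflowTemplate REST resource; it does not call any Google Cloud API. The template ID, cluster label, main class, and JAR path are hypothetical placeholders, not values from the question. Scheduling the daily instantiation and sending the email are outside this sketch.

```python
import json


def build_daily_report_template(template_id, cluster_labels, main_class, jar_uri, args=None):
    """Return a dict shaped like a Dataproc WorkflowTemplate REST resource.

    The field names follow the REST JSON form (camelCase). All argument
    values are supplied by the caller; nothing here contacts Google Cloud.
    """
    return {
        "id": template_id,
        # Run on an existing cluster chosen by label, since the scenario
        # already has a Dataproc cluster.
        "placement": {
            "clusterSelector": {"clusterLabels": dict(cluster_labels)}
        },
        # A single Spark step that generates the report.
        "jobs": [
            {
                "stepId": "generate-report",
                "sparkJob": {
                    "mainClass": main_class,
                    "jarFileUris": [jar_uri],
                    "args": list(args or []),
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    template = build_daily_report_template(
        template_id="daily-report",
        cluster_labels={"env": "batch"},
        main_class="com.example.DailyReport",
        jar_uri="gs://example-bucket/jobs/daily-report.jar",
        args=["--output", "gs://example-bucket/reports/"],
    )
    print(json.dumps(template, indent=2))
```

A definition like this would normally be saved as a file and imported as a workflow template, then instantiated once per day by whatever trigger is chosen.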