Google Cloud Associate Data Practitioner — Question 4
You want to process and load a daily sales CSV file stored in Cloud Storage into BigQuery for downstream reporting. You need to quickly build a scalable data pipeline that transforms the data while providing insights into data quality issues. What should you do?
Answer options
- A. Create a batch pipeline in Cloud Data Fusion by using a Cloud Storage source and a BigQuery sink.
- B. Load the CSV file as a table in BigQuery, and use scheduled queries to run SQL transformation scripts.
- C. Load the CSV file as a table in BigQuery. Create a batch pipeline in Cloud Data Fusion by using a BigQuery source and sink.
- D. Create a batch pipeline in Dataflow by using the Cloud Storage CSV file to BigQuery batch template.
Correct answer: A
Explanation
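The correct answer is A. Cloud Data Fusion is a managed, visual (code-free) data integration service, so a batch pipeline with a Cloud Storage source and a BigQuery sink can be built quickly and runs on managed infrastructure that scales with the data. Its Wrangler transformation lets you clean and reshape the CSV data and surfaces data quality issues, such as missing or malformed values, before the data reaches BigQuery. Option B can transform the data with SQL, but scheduled queries run only after the raw file has been loaded and give no built-in view of data quality. Option C adds an unnecessary step: Cloud Data Fusion can read the file directly from Cloud Storage, so loading it into BigQuery first only adds work. Option D loads the file at scale, but the Dataflow template is geared toward moving data; transformations typically require custom code, and the template does not provide the data quality insights the question asks for.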
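To make "transforms the data while providing insights into data quality issues" concrete, here is a minimal plain-Python sketch of that pattern. It is not Cloud Data Fusion code (Data Fusion pipelines are built visually), and it only illustrates the kind of work a transformation stage does between the Cloud Storage source and the BigQuery sink. The column names (`order_id`, `sale_date`, `amount`), the function name, and the sample rows are all assumptions, since the question does not specify a schema.

```python
import csv
import io
from collections import Counter

# Hypothetical required columns for a daily sales file (not given in the question).
REQUIRED = ("order_id", "sale_date", "amount")


def transform_with_quality_report(csv_text):
    """Return (clean_rows, issues): transformed rows plus a count of each quality problem."""
    clean_rows = []
    issues = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Flag rows with empty or absent required fields instead of loading them.
        missing = [c for c in REQUIRED if not (row.get(c) or "").strip()]
        if missing:
            for c in missing:
                issues[f"missing_{c}"] += 1
            continue
        # Transform: parse the amount as a number; count rows where that fails.
        try:
            amount = float(row["amount"])
        except ValueError:
            issues["invalid_amount"] += 1
            continue
        clean_rows.append({
            "order_id": row["order_id"].strip(),
            "sale_date": row["sale_date"].strip(),
            "amount": round(amount, 2),
        })
    return clean_rows, issues


# Illustrative sample data: one valid row and three rows with different problems.
sample = (
    "order_id,sale_date,amount\n"
    "1001,2024-05-01,19.99\n"
    "1002,2024-05-01,\n"
    "1003,2024-05-01,abc\n"
    ",2024-05-01,5.00\n"
)
rows, issues = transform_with_quality_report(sample)
print(len(rows), dict(issues))
# prints: 1 {'missing_amount': 1, 'invalid_amount': 1, 'missing_order_id': 1}
```

Only the valid row would continue to the sink, while the issue counts play the role of the data quality insights. In option A this logic is configured in the pipeline rather than written and maintained by hand, which is what makes it the quickest route.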