Google Cloud Professional Cloud Architect — Question 95
For this question, refer to the TerramEarth case study. A new architecture that writes all incoming data to BigQuery has been introduced. You notice that the data is dirty, and want to ensure data quality on an automated daily basis while managing cost.
What should you do?
Answer options
- A. Set up a streaming Cloud Dataflow job that receives data from the ingestion process. Clean the data in a Cloud Dataflow pipeline.
- B. Create a Cloud Function that reads data from BigQuery and cleans it. Trigger the Cloud Function from a Compute Engine instance.
- C. Create a SQL statement on the data in BigQuery, and save it as a view. Run the view daily, and save the result to a new table.
- D. Use Cloud Dataprep and configure the BigQuery tables as the source. Schedule a daily job to clean the data.
Correct answer: D
Explanation
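The correct answer is D. Cloud Dataprep is a managed service built for exploring and cleaning data: it can use BigQuery tables as a source, the cleaning steps are defined as a recipe, and the recipe can be scheduled to run as a daily job. That covers both requirements in the question, since the cleaning is automated and compute is consumed only while the daily job runs. Option A keeps a streaming Cloud Dataflow job running continuously, which costs more than a once-a-day batch and is unnecessary when the requirement is daily rather than real-time. Option B is a poor fit because Cloud Functions are meant for short, event-driven work with limited execution time and memory, not for processing large BigQuery tables, and a Compute Engine instance used only as a trigger is an extra always-on component to pay for and maintain. Option C can work, but a view is only a saved query with no schedule of its own, so "running the view daily" still needs a separate scheduling mechanism, and the cleaning rules must be written and maintained by hand in SQL.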
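To make "cleaning dirty data" concrete, here is a minimal Python sketch of the kind of rules a daily cleaning job might apply: dropping rows with no identifier, normalizing values, and removing duplicates. It is illustrative only. The field names (`vehicle_id`, `ts`, `fuel_level`) and the function `clean_records` are hypothetical and do not come from the case study, and this is not how Cloud Dataprep is configured; in Dataprep, comparable steps would be defined as a recipe rather than written as code.

```python
from typing import Any


def clean_records(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Apply three simple cleaning rules to hypothetical telemetry rows.

    1. Drop rows whose vehicle_id is missing or blank.
    2. Normalize vehicle_id (trim whitespace, upper-case).
    3. Keep only the first row for each (vehicle_id, ts) pair.
    """
    seen: set[tuple[str, Any]] = set()
    cleaned: list[dict[str, Any]] = []
    for row in rows:
        # Hypothetical field name; treat None and "" the same way.
        vehicle_id = (row.get("vehicle_id") or "").strip().upper()
        if not vehicle_id:
            continue  # rule 1: no usable identifier
        key = (vehicle_id, row.get("ts"))
        if key in seen:
            continue  # rule 3: duplicate reading
        seen.add(key)
        # rule 2: store the normalized identifier
        cleaned.append({**row, "vehicle_id": vehicle_id})
    return cleaned


if __name__ == "__main__":
    sample = [
        {"vehicle_id": " t-100 ", "ts": "2024-01-01T00:00:00Z", "fuel_level": 0.5},
        {"vehicle_id": "T-100", "ts": "2024-01-01T00:00:00Z", "fuel_level": 0.5},
        {"vehicle_id": None, "ts": "2024-01-01T00:05:00Z", "fuel_level": 0.4},
    ]
    for record in clean_records(sample):
        print(record)
```

Whichever option is chosen, rules like these have to run once per day over the new data; the options differ in how much infrastructure surrounds them and how long that infrastructure stays running.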