Google Cloud Associate Data Practitioner — Question 75
You are working on a project that requires analyzing daily social media data. You have 100 GB of JSON-formatted data stored in Cloud Storage, and the volume keeps growing. You need to transform and load this data into BigQuery for analysis. You want to follow the Google-recommended approach. What should you do?
Answer options
- A. Use Cloud Data Fusion to transfer the data into BigQuery raw tables, and use SQL to transform it.
- B. Use Dataflow to transform the data and write the transformed data to BigQuery.
- C. Manually download the data from Cloud Storage. Use a Python script to transform and upload the data into BigQuery.
- D. Use Cloud Run functions to transform and load the data into BigQuery.
Correct answer: B
Explanation
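The correct answer is B. Dataflow is a fully managed, autoscaling service for running Apache Beam batch and streaming pipelines, so it can read the JSON files from Cloud Storage, transform them in parallel, and write the results directly to BigQuery as the dataset grows. Option A is less suitable because it requires provisioning a Cloud Data Fusion instance and splits the work into a raw load followed by separate SQL transformations, which adds complexity for this workload. Option C is not recommended because manually downloading 100 GB and running a script on a single machine is error-prone, hard to repeat daily, and does not scale. Option D can work for small, event-driven tasks, but Cloud Run functions have bounded execution time and memory, which makes them a poor fit for transforming a large and growing dataset compared with a Dataflow pipeline.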
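To make option B concrete, here is a minimal sketch of an Apache Beam pipeline in Python that could run on Dataflow. It assumes the files are newline-delimited JSON, and the field names (`id`, `user.name`, `text`, `likes`), the output schema, and the `to_row` and `run` helpers are hypothetical and only for illustration; they are not given in the question.

```python
import json


def to_row(line: str) -> dict:
    """Parse one newline-delimited JSON record into a flat BigQuery row.

    The input field names are assumptions made for this example.
    """
    post = json.loads(line)
    return {
        "post_id": str(post["id"]),
        "author": (post.get("user") or {}).get("name"),
        "text": (post.get("text") or "").strip(),
        "like_count": int(post.get("likes") or 0),
    }


def run(input_pattern: str, output_table: str, pipeline_args=None) -> None:
    """Read JSON lines from Cloud Storage, transform them, append to BigQuery."""
    # Imported here so to_row can be exercised without Apache Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(pipeline_args)
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "ReadJson" >> beam.io.ReadFromText(input_pattern)
            | "ToRow" >> beam.Map(to_row)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                output_table,
                # Hypothetical schema matching the dict returned by to_row.
                schema="post_id:STRING,author:STRING,text:STRING,like_count:INTEGER",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            )
        )
```

Calling `run("gs://example-bucket/posts/*.json", "example-project:social.posts", ["--runner=DataflowRunner", "--project=example-project", "--region=us-central1", "--temp_location=gs://example-bucket/tmp"])` would submit the job to Dataflow; the bucket, project, and table names here are placeholders. Keeping the transformation in a plain function such as `to_row` means the logic can be unit tested locally before the pipeline is deployed.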