Google Cloud Professional Data Engineer — Question 249
A data analyst on your team needs to analyze data in BigQuery. The analyst wants to build a data pipeline that loads 200 CSV files, averaging 15 MB each, from a Cloud Storage bucket into BigQuery every day. The data must be ingested and transformed before it is accessed in BigQuery for analysis. You need to recommend a fully managed, no-code solution for the data analyst. What should you do?
Answer options
- A. Create a Cloud Run function and schedule it to run daily using Cloud Scheduler to load the data into BigQuery.
- B. Use the BigQuery Data Transfer Service to load the files from Cloud Storage to BigQuery, then create a BigQuery job that transforms the data using BigQuery SQL and schedule it to run daily.
- C. Build a custom Apache Beam pipeline and run it on Dataflow to load the files from Cloud Storage to BigQuery, and schedule it to run daily using Cloud Composer.
- D. Create a pipeline by using BigQuery pipelines and schedule it to load the data into BigQuery daily.
Correct answer: D
Explanation
The correct answer is D because BigQuery pipelines offer a fully managed, no-code way to both load and transform the data, with daily scheduling handled in the same place. Option A requires writing and maintaining custom code for the Cloud Run function, and as described it only loads the data without transforming it. Option B uses a managed service for the load, but the transformation still has to be written in BigQuery SQL and scheduled as a separate job, so it is neither no-code nor as streamlined as option D. Option C requires building a custom Apache Beam pipeline and operating Dataflow and Cloud Composer, which contradicts the no-code requirement.
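For a sense of scale, the daily workload described in the question can be estimated with simple arithmetic. The sketch below is only illustrative: the file count and average size come from the question, while the function name `daily_volume_mb` and the use of decimal gigabytes (1 GB = 1000 MB) are assumptions made for this example.

```python
# Figures taken from the question: 200 CSV files per day, 15 MB average size.
FILES_PER_DAY = 200
AVG_FILE_SIZE_MB = 15


def daily_volume_mb(files_per_day: int, avg_file_size_mb: float) -> float:
    """Return the approximate volume loaded per day, in megabytes."""
    return files_per_day * avg_file_size_mb


daily_mb = daily_volume_mb(FILES_PER_DAY, AVG_FILE_SIZE_MB)
# Assumption: decimal gigabytes (1 GB = 1000 MB).
daily_gb = daily_mb / 1000

print(f"{daily_mb:.0f} MB/day, about {daily_gb:.1f} GB/day")
```

At roughly 3 GB per day this is a small batch load, which fits the scenario's emphasis on ease of use for the analyst rather than on custom pipeline engineering.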