Google Cloud Associate Data Practitioner — Question 72
Your company wants to implement a data transformation (ETL) pipeline for its BigQuery data warehouse. You need to identify a managed transformation solution that lets users develop with SQL and JavaScript, supports version control, allows for modular code, and includes data quality checks. What should you do?
Answer options
- A. Use Dataform to define the transformations in SQLX.
- B. Use Dataproc to create an Apache Spark cluster and implement the transformations by using PySpark SQL.
- C. Create a Cloud Composer environment, and orchestrate the transformations by using the BigQueryInsertJob operator.
- D. Create BigQuery scheduled queries to define the transformations in SQL.
Correct answer: A
Explanation
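The correct answer is A because Dataform is a managed service for developing and running transformations in BigQuery, and it meets every requirement in the question. Transformations are written in SQLX, which extends SQL with embedded JavaScript. Dataform repositories are backed by Git, which provides version control. Code is modular because definitions reference one another through the ref() function and can share reusable JavaScript. Data quality checks are declared as assertions. Option B (Dataproc) requires you to provision and operate a Spark cluster and to write PySpark rather than SQL and JavaScript, and it has no built-in version control or data quality checks. Option C (Cloud Composer) is an orchestration service: it can schedule BigQuery jobs, but it is not a framework for developing SQL transformations and does not provide built-in data quality checks. Option D (BigQuery scheduled queries) runs SQL on a schedule but offers no version control, no modular code, and no data quality checks.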
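
To make option A concrete, here is a minimal sketch of a single Dataform SQLX file that touches each requirement from the question. The table name raw_orders, the column names, and the MIN_AMOUNT constant are hypothetical and chosen only for illustration. The sketch assumes that raw_orders is declared elsewhere in the same Dataform repository.

```sqlx
config {
  type: "table",
  description: "Cleaned orders (illustrative example)",
  // Data quality checks: Dataform generates assertions from these settings.
  assertions: {
    uniqueKey: ["order_id"],
    nonNull: ["order_id", "customer_id"],
    rowConditions: ["amount >= 0"]
  }
}

js {
  // Embedded JavaScript: a hypothetical constant reused in the query below.
  const MIN_AMOUNT = 0;
}

SELECT
  order_id,
  customer_id,
  amount
FROM
  -- ref() resolves the dependency on another definition in the repository,
  -- which is what makes the code modular.
  ${ref("raw_orders")}
WHERE
  amount >= ${MIN_AMOUNT}
```

Because the file lives in a Git-backed Dataform repository, changes to it are tracked with version control like any other source file.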