Google Cloud Professional Data Engineer — Question 273

You are loading CSV files from Cloud Storage to BigQuery. The files have known data quality issues, including mismatched data types, such as STRINGs and INT64s in the same column, and inconsistent formatting of values such as phone numbers or addresses. You need to create the data pipeline to maintain data quality and perform the required cleansing and transformation. What should you do?

Answer options

Correct answer: A

Explanation

The correct answer is A because Cloud Data Fusion lets you cleanse and transform the data before it is loaded into BigQuery, which is what the scenario requires: mismatched data types and inconsistently formatted values are corrected inside the pipeline instead of reaching the destination tables. Options B, C, and D do not cleanse and transform the data before loading, so the known quality issues could carry over into the final dataset.
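
To make the two kinds of cleansing in the scenario concrete, here is a minimal Python sketch of the logic a pre-load transformation step would apply. It is not Cloud Data Fusion code (Data Fusion pipelines are built visually), and the column names `customer_id` and `phone`, the sample rows, and the 10-digit phone rule are assumptions made purely for illustration.

```python
import csv
import io
import re
from typing import Iterator, Optional


def to_int64(value: str) -> Optional[int]:
    """Return the value as an int, or None if it is not a clean INT64."""
    text = value.strip()
    if re.fullmatch(r"[+-]?\d+", text):
        number = int(text)
        # Keep only values that fit in a signed 64-bit integer.
        if -(2**63) <= number < 2**63:
            return number
    return None


def normalize_phone(value: str) -> Optional[str]:
    """Reduce a phone number to 10 digits (assumed US-style format)."""
    digits = re.sub(r"\D", "", value)
    # Drop a leading country code of 1 if present.
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits if len(digits) == 10 else None


def clean_rows(csv_text: str) -> Iterator[dict]:
    """Yield cleaned rows; values that cannot be repaired become None."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield {
            "customer_id": to_int64(row["customer_id"]),
            "phone": normalize_phone(row["phone"]),
        }


# Hypothetical sample data mixing strings and integers in one column.
SAMPLE = (
    "customer_id,phone\n"
    "42,(555) 123-4567\n"
    "abc,555.123.4567\n"
    "7,+1 555 123 4567\n"
)

cleaned = list(clean_rows(SAMPLE))
for row in cleaned:
    print(row)
```

Values that cannot be repaired are set to None here so they can be reviewed or routed elsewhere instead of failing the load; a real pipeline might send such rows to a separate error output.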