Google Cloud Professional Data Engineer — Question 207
You have data in BigQuery that is used to generate reports for your company. You have noticed that some fields in the weekly executive report do not conform to company formatting standards. For example, report errors include inconsistent telephone formats and inconsistent country code identifiers. This is a frequent issue, so you need to create a recurring job to normalize the data. You want a quick solution that requires no coding. What should you do?
Answer options
- A. Use Cloud Data Fusion and Wrangler to normalize the data, and set up a recurring job.
- B. Use Dataflow SQL to create a job that normalizes the data and, after the first run of the job, schedule the pipeline to run on a recurring basis.
- C. Create a Spark job and submit it to Dataproc Serverless.
- D. Use BigQuery and GoogleSQL to normalize the data, and schedule recurring queries in BigQuery.
Correct answer: A
Explanation
The correct answer is A. Wrangler in Cloud Data Fusion provides a visual interface for cleaning and normalizing data without writing code, and the resulting pipeline can be scheduled to run on a recurring basis, so it meets both the "quick" and the "no coding" requirements. Option B requires writing SQL, which contradicts the no-code requirement. Option C requires writing and maintaining a Spark job. Option D is technically workable, since BigQuery supports scheduled queries, but it requires writing GoogleSQL and therefore also fails the no-code requirement.
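To make the "requires coding" distinction concrete, here is a minimal sketch of the kind of normalization logic that options B, C, and D would have someone write by hand. Those options would use SQL or Spark; Python is used here only for illustration. The function names, the lookup table, the default `+1` prefix, and the digits-only target format are all assumptions and not part of the question. In Wrangler, comparable transformations are applied as interactive steps in the UI instead.

```python
import re

# Hypothetical lookup table (an assumption for illustration); a real job
# would need to cover every variant that appears in the data.
COUNTRY_CODES = {
    "us": "US", "usa": "US", "united states": "US",
    "uk": "GB", "gb": "GB", "united kingdom": "GB",
}


def normalize_country(value: str) -> str:
    """Map a free-form country identifier to a two-letter code.

    Unknown values are trimmed and upper-cased rather than rejected.
    """
    key = value.strip().lower()
    return COUNTRY_CODES.get(key, value.strip().upper())


def normalize_phone(value: str, default_prefix: str = "+1") -> str:
    """Reduce a phone number to '+' followed by digits only.

    If the input already starts with '+', its country prefix is kept;
    otherwise default_prefix (an assumed default) is prepended.
    """
    digits = re.sub(r"\D", "", value)  # drop spaces, dashes, parentheses
    if value.strip().startswith("+"):
        return "+" + digits
    return default_prefix + digits


if __name__ == "__main__":
    print(normalize_phone("(415) 555-0100"))
    print(normalize_phone("+44 20 7946 0958"))
    print(normalize_country(" usa "))
```

Even this small example needs rules, edge-case decisions, and ongoing maintenance, which is the effort the question's "no coding" constraint is meant to avoid.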