Google Cloud Professional Data Engineer — Question 130

You want to create a machine learning model using BigQuery ML and create an endpoint for hosting the model using Vertex AI. This will enable the processing of continuous streaming data in near-real time from multiple vendors. The data may contain invalid values. What should you do?

Answer options

Correct answer: D

Explanation

The correct answer is D: Dataflow can process and sanitize the streaming data before it is written to BigQuery, so invalid values are handled before they reach the model. Option A lands the data in an ingestion dataset without processing it, so invalid values are not filtered out. Option B relies on streaming inserts alone and does not address data quality. Option C does process the data, but it lacks the sanitization capability that Dataflow provides, which makes it less suitable for handling invalid data.
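
To make the sanitization step concrete, here is a minimal sketch of the kind of per-record check a Dataflow pipeline might apply before writing to BigQuery. It is plain Python, not a full pipeline. The field names (vendor_id, event_time, value) and the validation rules are assumptions for illustration only; the question does not specify a schema.

```python
import json
import math
from typing import Optional

# Hypothetical schema: these field names are assumptions, not from the question.
REQUIRED_FIELDS = ("vendor_id", "event_time", "value")


def sanitize(message: bytes) -> Optional[dict]:
    """Return a cleaned row, or None if the record is invalid."""
    try:
        record = json.loads(message.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None  # not valid UTF-8 JSON
    if not isinstance(record, dict):
        return None  # expected a JSON object
    if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
        return None  # a required field is missing or empty
    try:
        value = float(record["value"])
    except (TypeError, ValueError):
        return None  # value is not numeric
    if math.isnan(value) or math.isinf(value):
        return None  # reject NaN and infinities
    return {
        "vendor_id": str(record["vendor_id"]).strip(),
        "event_time": str(record["event_time"]),
        "value": value,
    }
```

In an Apache Beam pipeline running on Dataflow, a function like this could be applied with a Map step followed by a filter that drops None results, placed between a ReadFromPubSub source and a WriteToBigQuery sink. Routing rejected records to a separate dead-letter output instead of dropping them is a common alternative, though nothing in the question requires it.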