Google Cloud Professional Machine Learning Engineer — Question 132
You are working on a system log anomaly detection model for a cybersecurity organization. You have developed the model using TensorFlow, and you plan to use it for real-time prediction. You need to create a Dataflow pipeline to ingest data via Pub/Sub and write the results to BigQuery. You want to minimize the serving latency as much as possible. What should you do?
Answer options
- A. Containerize the model prediction logic in Cloud Run, which is invoked by Dataflow.
- B. Load the model directly into the Dataflow job as a dependency, and use it for prediction.
- C. Deploy the model to a Vertex AI endpoint, and invoke this endpoint in the Dataflow job.
- D. Deploy the model in a TFServing container on Google Kubernetes Engine, and invoke it in the Dataflow job.
Correct answer: B
Explanation
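B is correct because loading the model into the Dataflow job as a dependency lets each worker run predictions in process, inside the same pipeline that reads from Pub/Sub and writes to BigQuery. No prediction has to leave the worker. Options A, C, and D all place the model behind a separate service (Cloud Run, a Vertex AI endpoint, or TensorFlow Serving on GKE), so every prediction the Dataflow job requests incurs a network round trip plus request and response serialization. That extra hop is what makes them less suitable when the goal is to minimize serving latency.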
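The in-process pattern behind option B can be sketched roughly as follows. This is a library-free illustration, not the exam's reference solution: the class only mirrors the `setup` and `process` lifecycle methods of an Apache Beam `DoFn`, and `load_stub_model`, `InProcessPredictFn`, `run_locally`, the message fields (`log_id`, `features`), and the threshold are all hypothetical names chosen for the example. In a real pipeline the class would subclass `apache_beam.DoFn`, and the loader would read the actual TensorFlow model, for instance with `tf.saved_model.load`.

```python
import json
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Model = Callable[[List[float]], float]


def load_stub_model() -> Model:
    """Hypothetical stand-in for an expensive model load (e.g. a SavedModel)."""
    def score(features: List[float]) -> float:
        # Toy anomaly score: mean absolute feature value.
        return sum(abs(x) for x in features) / max(len(features), 1)
    return score


class InProcessPredictFn:
    """Mirrors the DoFn lifecycle: load once in setup(), predict in process()."""

    def __init__(self, loader: Callable[[], Model], threshold: float = 1.0):
        self._loader = loader
        self._threshold = threshold
        self._model: Optional[Model] = None
        self.load_count = 0  # tracked only to show the model is loaded once

    def setup(self) -> None:
        # Called once per instance, before any element is processed.
        self._model = self._loader()
        self.load_count += 1

    def process(self, message: bytes) -> Iterator[Dict]:
        # Called per element; the prediction is a local function call,
        # with no request sent to an external service.
        record = json.loads(message.decode("utf-8"))
        score = self._model(record["features"])
        yield {
            "log_id": record["log_id"],
            "score": score,
            "is_anomaly": score > self._threshold,
        }


def run_locally(messages: List[bytes]) -> Tuple[List[Dict], int]:
    """Drives the lifecycle by hand, standing in for the pipeline runner."""
    fn = InProcessPredictFn(load_stub_model)
    fn.setup()
    rows = [row for message in messages for row in fn.process(message)]
    return rows, fn.load_count
```

The point of the design is that the costly load happens once per worker instance rather than once per message, and each prediction after that is an ordinary function call. The dictionaries yielded by `process` stand in for the rows that the pipeline would write to BigQuery.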