Google Cloud Professional Machine Learning Engineer — Question 147
You have developed an ML model to detect the sentiment of users’ posts on your company's social media page to identify outages or bugs. You are using Dataflow to provide real-time predictions on data ingested from Pub/Sub. You plan to have multiple training iterations for your model and keep the latest two versions live after every run. You want to split the traffic between the versions in an 80:20 ratio, with the newest model getting the majority of the traffic. You want to keep the pipeline as simple as possible, with minimal management required. What should you do?
Answer options
- A. Deploy the models to a Vertex AI endpoint using the --traffic-split=0=80, PREVIOUS_MODEL_ID=20 configuration.
- B. Wrap the models inside an App Engine application using the --splits PREVIOUS_VERSION=0.2, NEW_VERSION=0.8 configuration.
- C. Wrap the models inside a Cloud Run container using the REVISION1=20, REVISION2=80 revision configuration.
- D. Implement random splitting in Dataflow using beam.Partition() with a partition function calling a Vertex AI endpoint.
Correct answer: A
Explanation
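The correct answer is A. A Vertex AI endpoint can host several deployed models at once and splits prediction traffic between them natively, so no custom routing code is needed. In the traffic-split setting, the key 0 refers to the model being deployed in the current request and the other key is the ID of the previously deployed model, so 0=80 with PREVIOUS_MODEL_ID=20 sends 80% of requests to the newest model and 20% to the previous one. After each training run, you can deploy the new model with the same split and undeploy the oldest one, which keeps the latest two versions live. Options B and C can also split traffic, but they require you to wrap the models in your own serving application or container and maintain it, which adds management overhead compared with a managed endpoint. Option D moves the routing logic into the Dataflow pipeline: beam.Partition() would need a custom partition function that has to be kept in sync with the deployed versions after every training iteration, which conflicts with the requirement to keep the pipeline simple.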
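As a rough illustration of the 80:20 split in option A, the sketch below builds the traffic-split mapping in plain Python and renders it as a `--traffic-split` value of the kind accepted by `gcloud ai endpoints deploy-model`. The helper names `build_traffic_split` and `to_gcloud_flag` and the deployed model ID `1234567890` are hypothetical, chosen only for this example. The same mapping shape is what the `traffic_split` argument of the Vertex AI Python SDK's `Endpoint.deploy` expects, though no SDK call is made here.

```python
from typing import Dict


def build_traffic_split(previous_deployed_model_id: str, new_share: int = 80) -> Dict[str, int]:
    """Build a traffic split between a new model and the previous one.

    The key "0" stands for the model being deployed in the current request;
    the other key is the ID of the model already deployed on the endpoint.
    """
    if not 0 <= new_share <= 100:
        raise ValueError("new_share must be between 0 and 100")
    if previous_deployed_model_id == "0":
        raise ValueError('"0" is reserved for the model being deployed')
    return {"0": new_share, previous_deployed_model_id: 100 - new_share}


def to_gcloud_flag(split: Dict[str, int]) -> str:
    """Render the split as a --traffic-split flag value (percentages must total 100)."""
    if sum(split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")
    pairs = ",".join(f"{model_id}={share}" for model_id, share in split.items())
    return f"--traffic-split={pairs}"


# Hypothetical deployed model ID of the previous version.
split = build_traffic_split("1234567890")
print(to_gcloud_flag(split))  # prints: --traffic-split=0=80,1234567890=20
```

Because the split is declared on the endpoint, the Dataflow pipeline only needs to call a single endpoint and does not have to change between training iterations.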