Question

A Machine Learning Engineer needs a continuous deployment pipeline for their models hosted on Databricks Model Serving. The deployment automation should execute after a model is trained and registered using MLflow. The goal of the automation is to deploy the latest version of the model from the MLflow Model Registry only if the model can meet the company’s strict latency requirements (P95 < 300ms) while serving production traffic. How can the engineer validate that new models meet their latency requirements when served in production?

Accepted Answer

Correct answer: A. A/B test the latest model with Databricks Model Serving so that the latest model receives 5% of production traffic and the current model receives the rest. Use inference tables to calculate P95 latency and verify it is less than 300ms.

Option A is correct because an A/B test serves the new model real production traffic alongside the current model, so its P95 latency is measured under the conditions the requirement applies to, while only 5% of requests are exposed to the untested version. Option B, while useful, does not compare the new model with the current model in production. Options C and D retrieve metrics from MLflow without any production testing, so they cannot guarantee that the model meets the latency requirement in a live environment.
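The latency gate described in option A can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the Databricks API itself: the rows stand in for records read from an inference table, the field names execution_time_ms and status_code follow the inference table schema as commonly documented, and model_version is assumed to have been flattened out of the request metadata. The routes structure mirrors the shape of a serving endpoint traffic configuration and should be checked against the current API reference. All function names are hypothetical.

```python
from collections import defaultdict

P95_LIMIT_MS = 300.0  # latency requirement stated in the question


def traffic_config(current_name, candidate_name, candidate_pct=5):
    """Build a traffic split: candidate_pct to the new model, the rest to the current one.

    The dictionary layout is an assumption modelled on serving endpoint configs.
    """
    if not 0 < candidate_pct < 100:
        raise ValueError("candidate_pct must be strictly between 0 and 100")
    return {
        "routes": [
            {"served_model_name": current_name, "traffic_percentage": 100 - candidate_pct},
            {"served_model_name": candidate_name, "traffic_percentage": candidate_pct},
        ]
    }


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("no latency samples")
    ordered = sorted(values)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n) in integer arithmetic
    return ordered[rank - 1]


def p95_by_version(rows):
    """Group inference log rows by model version and compute P95 latency for each."""
    latencies = defaultdict(list)
    for row in rows:
        if row["status_code"] == 200:  # design choice: count successful requests only
            latencies[row["model_version"]].append(row["execution_time_ms"])
    return {version: p95(vals) for version, vals in latencies.items()}


def meets_latency_requirement(rows, candidate_version, limit_ms=P95_LIMIT_MS):
    """Return True only if the candidate has samples and its P95 is below the limit."""
    stats = p95_by_version(rows)
    return candidate_version in stats and stats[candidate_version] < limit_ms
```

A deployment job could call meets_latency_requirement once the candidate has served enough requests, and shift the remaining traffic to it only when the check passes. Requiring the candidate to appear in the statistics keeps the gate from passing when the new version has not received any traffic yet.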

Databricks Certified Machine Learning Professional — Question 83
