Databricks Certified Machine Learning Professional — Question 46
A machine learning engineer wants to log and deploy a model as an MLflow pyfunc model. They have custom preprocessing that needs to be completed on feature variables prior to fitting the model or computing predictions using that model. They decide to wrap this preprocessing in a custom model class ModelWithPreprocess, where the preprocessing is performed when calling fit and when calling predict. They then log the fitted model of the ModelWithPreprocess class as a pyfunc model.
Which of the following is a benefit of this approach when loading the logged pyfunc model for downstream deployment?
Answer options
- A. The pyfunc model can be used to deploy models in a parallelizable fashion
- B. The same preprocessing logic will automatically be applied when calling fit
- C. The same preprocessing logic will automatically be applied when calling predict
- D. This approach has no impact when loading the logged pyfunc model for downstream deployment
- E. There is no longer a need for pipeline-like machine learning objects
Correct answer: C
Explanation
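The correct answer is C. Because the preprocessing lives inside the predict method of ModelWithPreprocess, it is saved together with the logged model. Any downstream consumer that loads the pyfunc model and calls predict gets the same preprocessing applied to the raw features, without having to reimplement it, which keeps feature handling consistent between training and serving. Option A is wrong because parallelizable deployment is not something that wrapping the preprocessing adds; it does not depend on this approach. Option B is wrong because a loaded pyfunc model is used through predict, and downstream deployment does not call fit. Option D is wrong because the approach does have an impact: the preprocessing travels with the model. Option E is wrong because wrapping preprocessing in a custom class does not remove the need for pipeline-like objects; the class itself plays a pipeline-like role.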
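The idea behind option C can be sketched in plain Python. This is a minimal illustration, not the exam's reference implementation: the log1p preprocessing, the one-feature least-squares fit, and the sample data are all assumptions made up for the example. In a real project the class would typically subclass mlflow.pyfunc.PythonModel and be logged with mlflow.pyfunc.log_model; the sketch leaves MLflow out so it runs with the standard library only, but keeps the predict(context, model_input) signature shape that a pyfunc custom model uses.

```python
import math


class ModelWithPreprocess:
    """Hypothetical wrapper: the same preprocessing runs in fit and predict.

    Assumption: in real use this would subclass mlflow.pyfunc.PythonModel.
    """

    def __init__(self):
        self.slope = None
        self.intercept = None

    def preprocess(self, xs):
        # Illustrative preprocessing step (an assumption): log-transform.
        return [math.log1p(x) for x in xs]

    def fit(self, xs, ys):
        # Preprocess the raw features, then fit a one-feature least-squares line.
        zs = self.preprocess(xs)
        n = len(zs)
        mean_z = sum(zs) / n
        mean_y = sum(ys) / n
        var_z = sum((z - mean_z) ** 2 for z in zs)
        cov_zy = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
        self.slope = cov_zy / var_z
        self.intercept = mean_y - self.slope * mean_z
        return self

    def predict(self, context, model_input):
        # The caller passes raw features; preprocessing is applied here,
        # so a loaded model behaves the same way it did at training time.
        zs = self.preprocess(model_input)
        return [self.slope * z + self.intercept for z in zs]


# Raw training features chosen so the log1p values are 0, 1, 2.
raw_x = [0.0, math.e - 1, math.e ** 2 - 1]
y = [1.0, 3.0, 5.0]

model = ModelWithPreprocess().fit(raw_x, y)

# Downstream code supplies raw features only; context is unused in this sketch.
preds = model.predict(None, [math.e ** 3 - 1])
print(preds)
```

Because predict calls preprocess itself, the downstream caller never needs to know that a log transform is involved, which is the benefit described in option C.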