Databricks Certified Machine Learning Professional — Question 30
A data scientist has developed and logged a scikit-learn random forest model, stored in the variable model, and then ended their Spark session and terminated their cluster. After starting a new cluster, they want to review the feature_importances_ of the original model object.
Which of the following lines of code can be used to restore the model object so that feature_importances_ is available?
Answer options
- A. mlflow.load_model(model_uri)
- B. client.list_artifacts(run_id)["feature-importances.csv"]
- C. mlflow.sklearn.load_model(model_uri)
- D. This can only be viewed in the MLflow Experiments UI
- E. client.pyfunc.load_model(model_uri)
Correct answer: C
Explanation
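The correct answer is C. mlflow.sklearn.load_model(model_uri) loads the model in its native scikit-learn flavor and returns the original estimator object, so fitted attributes such as feature_importances_ are available again. Option A does not work because mlflow.load_model is not part of MLflow's documented top-level API; loading goes through a flavor module such as mlflow.sklearn or mlflow.pyfunc. Option E fails because pyfunc is a module of mlflow, not an attribute of the tracking client, and even mlflow.pyfunc.load_model(model_uri) returns a generic wrapper built around predict() rather than the estimator itself. Option B is invalid because list_artifacts returns a list of file entries for a run, which cannot be indexed by file name, and nothing in the question says a feature-importances.csv artifact was ever logged. Option D is wrong because the logged model can be reloaded programmatically, so the UI is not the only way to inspect it.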