Databricks Certified Machine Learning Associate — Question 34

A data scientist is calculating the importance of features as a part of an MLflow run. The feature importance values are being stored in the pandas DataFrame importance_df and being written as a CSV to the DBFS location importance_path. They would like to log these values with their active MLflow run.

Which of the following lines of code can the data scientist use to log the feature importance values with their active MLflow run?

Answer options

Correct answer: C

Explanation

The correct answer is C because mlflow.log_artifact is designed to log files from a specified path, which is suitable for logging the CSV file located at importance_path. Option A is incorrect as log_metric is meant for logging numerical values, not DataFrames. Option B incorrectly attempts to log a DataFrame directly, while D misuses log_artifact by trying to log a DataFrame instead of a file path.