Databricks Certified Machine Learning Professional — Question 67

A machine learning engineer has developed a model and registered it using the FeatureStoreClient fs. The model has model URI model_uri. The engineer now needs to perform batch inference on the training set logged with the model, but a few of the feature values in the column spend have since been updated and arc present in the customer-level Spark DataFrame spark_df. The customer_id column is the primary key of spark_df and the training set used when training and logging the model.

Which code block can be used to compute predictions for the training set while overwriting its old spend values with the new spend values from spark_df?

Answer options

Correct answer: D

Explanation

Option D is correct because it first retrieves the updated feature values from spark_df and then uses them to perform batch inference with fs.score_batch, ensuring that the model uses the latest data. Options A and B do not account for updating the spend values, leading to potentially outdated predictions. Option C incorrectly uses 'get_updated_feature' which is singular and may not retrieve all necessary updates.