You are creating a new experiment in Azure Machine Learning Studio. You have a small data…

Question

You are creating a new experiment in Azure Machine Learning Studio. You have a small dataset that has missing values in many columns. The data does not require the application of predictors for each column. You plan to use the Clean Missing Data module.
You need to select a data cleaning method.
Which method should you use?

Accepted Answer

Correct answer: A. Replace using Probabilistic PCA. Probabilistic PCA estimates missing values from the covariance structure of the whole dataset, so it does not need a predictor for each column and suits a dataset with missing values in many columns. MICE also fills in missing values, but it models each column from the other columns, which is the per-column predictor the scenario rules out. The remaining options do not handle missing values at all: Normalization rescales data, and SMOTE balances classes in imbalanced datasets.
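
To make the "no predictor per column" point concrete, here is a minimal sketch in Python with NumPy. It is not the Clean Missing Data module's implementation, and it is a simplified iterative PCA fill rather than full probabilistic PCA. The function name `pca_impute` and its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def pca_impute(X, n_components=1, n_iter=50):
    """Fill NaNs by repeatedly fitting one low-rank PCA model to the whole matrix.

    Simplified stand-in for PCA-based replacement (an assumption of this
    sketch, not the Studio module's actual algorithm).
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Start from the column means of the observed entries.
    filled = np.where(missing, np.nanmean(X, axis=0), X)
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        # One SVD over all columns at once: no per-column predictor is fitted.
        U, s, Vt = np.linalg.svd(filled - mean, full_matrices=False)
        low_rank = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mean
        # Only the missing cells are overwritten; observed values stay fixed.
        filled[missing] = low_rank[missing]
    return filled


# Toy data: column 2 is twice column 1, column 3 is three times column 1.
data = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, 6.0, np.nan],
    [4.0, np.nan, 12.0],
    [5.0, 10.0, 15.0],
])
completed = pca_impute(data, n_components=1)
print(completed)
```

A MICE-style approach would instead fit a separate regression for each column that has gaps, which costs more as the number of incomplete columns grows.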

Designing and Implementing a Data Science Solution on Azure — Question 13
