Designing and Implementing a Data Science Solution on Azure — Question 103
You have a dataset that contains records of patients tested for diabetes. The dataset includes the patient's age.
You plan to create an analysis that will report the mean age value from the differentially private data derived from the dataset.
You need to identify the epsilon value to use in the analysis that minimizes the risk of exposing the actual data.
Which epsilon value should you use?
Answer options
- A. -1.5
- B. -0.5
- C. 0.5
- D. 1.5
Correct answer: C
Explanation
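The correct epsilon value is 0.5 because it is the smallest valid value among the options. In differential privacy, epsilon is the privacy-loss budget: a lower epsilon adds more noise to the reported result, which lowers the risk of exposing the actual data at the cost of accuracy. Epsilon must be non-negative, so -1.5 and -0.5 are not valid choices. An epsilon of 1.5 is valid but adds less noise than 0.5, so it gives a more accurate mean with a weaker privacy guarantee.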
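The effect of epsilon on the added noise can be illustrated with a minimal sketch of the Laplace mechanism, one common way to release a differentially private mean. This is not the implementation used by any particular Azure service or library. The function names `laplace_scale` and `private_mean`, the age bounds of 0 to 100, and the sample ages are all hypothetical, and the sensitivity formula assumes a fixed, publicly known record count.

```python
import random


def laplace_scale(epsilon, lower, upper, n):
    """Return the Laplace noise scale b = sensitivity / epsilon.

    Assumption: values are clipped to [lower, upper] and the record
    count n is public, so changing one record moves the mean by at
    most (upper - lower) / n.
    """
    if epsilon <= 0:
        # Negative (or zero) epsilon cannot be used to calibrate noise.
        raise ValueError("epsilon must be positive")
    sensitivity = (upper - lower) / n
    return sensitivity / epsilon


def private_mean(values, epsilon, lower, upper, rng):
    """Return the clipped mean plus Laplace noise scaled for epsilon."""
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    b = laplace_scale(epsilon, lower, upper, len(clipped))
    # The difference of two unit exponentials is a standard Laplace sample.
    noise = b * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return true_mean + noise


if __name__ == "__main__":
    ages = [34, 51, 47, 62, 29, 58, 45, 39, 66, 53]  # made-up sample data
    rng = random.Random(0)
    for eps in (-1.5, -0.5, 0.5, 1.5):
        try:
            scale = laplace_scale(eps, 0, 100, len(ages))
            result = private_mean(ages, eps, 0, 100, rng)
            print(f"epsilon={eps}: noise scale={scale:.2f}, mean={result:.2f}")
        except ValueError as err:
            print(f"epsilon={eps}: rejected ({err})")
```

Under these assumptions, the noise scale for epsilon 0.5 is three times the scale for epsilon 1.5, which is why 0.5 offers the stronger protection of the two valid options, and the negative options are rejected before any noise is drawn.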