Databricks Certified Data Engineer Professional — Question 219
A data engineer is masking a column that contains email addresses. The goal is an output of the same length for every row, while keeping different outputs for different input values.
Which SQL function should be used to achieve this?
Answer options
- A. hash(email)
- B. mask(email, '?')
- C. sha1('email')
- D. sha2(email,0)
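Correct answer: D
Explanation
The correct answer is D. sha2(email, 0) computes a SHA-2 digest of the column value, and a bit length of 0 is treated as 256, so every row yields a 64-character hexadecimal string while different email values yield different digests (collisions are not practically achievable). Option A is a common distractor: hash(email) returns a 32-bit integer, so the number of digits varies from row to row once the value is displayed or stored as text, and the 32-bit range makes collisions likely on large tables. Option B, mask(email, '?'), replaces letters and digits one for one, so the output is as long as the input, and different emails with the same character pattern map to the same masked value. Option C is incorrect because it hashes the string literal 'email' instead of the email column, so every row receives the same output.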
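The two requirements, equal length for every row and distinct outputs for distinct inputs, can be checked with a small sketch outside Spark. It uses Python's hashlib as a stand-in for the SQL functions, on the assumption that sha2 with a bit length of 0 behaves as SHA-256 and returns a hex string. The sample emails and helper names are hypothetical and are not part of the original question.

```python
import hashlib

# Hypothetical sample values of different lengths (not from the question).
emails = ["a@b.co", "jane.doe@example.com", "x@example.org"]


def sha256_hex(value: str) -> str:
    """Hex-encoded SHA-256 digest; assumed to mirror sha2(col, 0) / sha2(col, 256)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def sha1_hex(value: str) -> str:
    """Hex-encoded SHA-1 digest; assumed to mirror sha1(col)."""
    return hashlib.sha1(value.encode("utf-8")).hexdigest()


# Option D: every masked value has the same length and distinct inputs stay distinct.
masked = [sha256_hex(e) for e in emails]
lengths = {len(m) for m in masked}
all_distinct = len(set(masked)) == len(emails)

# Option C: hashing the literal 'email' ignores the row, so all rows collapse to one value.
literal_masked = {sha1_hex("email") for _ in emails}

# Option A: a 32-bit signed integer rendered as text has no fixed width.
int_text_lengths = {len(str(n)) for n in (7, -42, 2147483647, -2147483648)}

print(lengths, all_distinct, len(literal_masked), sorted(int_text_lengths))
```

The integer values in the last step are arbitrary members of the 32-bit signed range, not actual results of Spark's hash function; they only illustrate that an integer result does not have a fixed text width.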