Databricks Certified Data Engineer Associate — Question 38
A data engineer has developed a data pipeline to ingest data from a JSON source using Auto Loader, but the engineer has not provided any type inference or schema hints in their pipeline. Upon reviewing the data, the data engineer has noticed that all of the columns in the target table are of the string type despite some of the fields only including float or boolean values.
Which of the following describes why Auto Loader inferred all of the columns to be of the string type?
Answer options
- A. There was a type mismatch between the specified schema and the inferred schema
- B. JSON data is a text-based format
- C. Auto Loader only works with string data
- D. All of the fields had at least one null value
- E. Auto Loader cannot infer the schema of ingested data
Correct answer: B
Explanation
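The correct answer is B. JSON is a text-based format that does not encode column data types, so by default Auto Loader infers every column as a string, which avoids schema evolution problems caused by type mismatches. Getting typed columns requires either setting the cloudFiles.inferColumnTypes option to true or supplying schema hints through cloudFiles.schemaHints, and the engineer did neither. The other options are incorrect: A does not apply because no schema was specified, so there was nothing to mismatch; C and E are false because Auto Loader supports typed columns and does infer schemas; D is wrong because null values are not what causes columns to be inferred as strings.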
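As a rough illustration of the point above, the sketch below builds the Auto Loader options for a JSON source with and without type inference. The option keys (cloudFiles.format, cloudFiles.schemaLocation, cloudFiles.inferColumnTypes, cloudFiles.schemaHints) are Auto Loader options; the helper names, the paths, and the column names in the hints are hypothetical and only serve the example. The read_json_stream function assumes a Databricks runtime where the cloudFiles source is available and is not called here.

```python
def autoloader_options(schema_location, infer_types=False, hints=None):
    """Build Auto Loader options for a JSON source (hypothetical helper)."""
    options = {
        "cloudFiles.format": "json",
        # Location where Auto Loader tracks the inferred schema.
        "cloudFiles.schemaLocation": schema_location,
    }
    if infer_types:
        # Without this option, JSON columns are inferred as strings.
        options["cloudFiles.inferColumnTypes"] = "true"
    if hints:
        # Override the inferred type of specific columns, e.g. "price FLOAT".
        options["cloudFiles.schemaHints"] = hints
    return options


def read_json_stream(spark, source_path, options):
    """Start an Auto Loader stream; requires a Databricks Spark session."""
    return (
        spark.readStream
        .format("cloudFiles")
        .options(**options)
        .load(source_path)
    )


# Default behaviour from the question: no inference option, no hints,
# so every column in the target table ends up as a string.
default_options = autoloader_options("/tmp/schemas/events")

# Two ways to get typed columns (paths and column names are assumptions).
inferred_options = autoloader_options("/tmp/schemas/events", infer_types=True)
hinted_options = autoloader_options(
    "/tmp/schemas/events", hints="price FLOAT, is_active BOOLEAN"
)
```

On a Databricks cluster, passing inferred_options or hinted_options to read_json_stream would be expected to yield float and boolean columns instead of strings, while default_options reproduces the all-string result described in the question.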