Data Engineering on Microsoft Azure — Question 38

You have an Azure Synapse Analytics Apache Spark pool named Pool1.
You plan to load JSON files from an Azure Data Lake Storage Gen2 container into the tables in Pool1. The structure and data types vary by file.
You need to load the files into the tables. The solution must maintain the source data types.
What should you do?

Answer options

Correct answer: D

Explanation

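The correct answer is D. When PySpark reads JSON in an Apache Spark pool, Spark infers a schema from each file's contents, so JSON numbers, booleans, strings, arrays, and nested objects arrive as the corresponding Spark types rather than as plain text. Because the schema is inferred at read time, files with different structures can each be loaded without defining their columns in advance. Options A and B do not load the files into the tables with their types preserved. Option C relies on a serverless SQL pool, which typically reads JSON documents as text and requires each value to be extracted with an explicitly declared type, so the source data types are not carried over automatically.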