Data Engineering on Microsoft Azure — Question 4

You are designing the folder structure for an Azure Data Lake Storage Gen2 container.
Users will query data by using a variety of services including Azure Databricks and Azure Synapse Analytics serverless SQL pools. The data will be secured by subject area. Most queries will include data from the current year or current month.
Which folder structure should you recommend to support fast queries and simplified folder security?

Answer options

Correct answer: D

Explanation

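D is correct because it places the subject area at the top of the folder hierarchy and orders the date folders from year to month beneath it. With subject area first, access can be granted on one top-level folder per subject area instead of on many folders scattered under each date. With the date ordered as year, then month, a query for the current year or current month reads only the matching folders rather than scanning the whole container. The other options either place the date above the subject area, which complicates security, or order the date parts in a way that does not match the most common query patterns, which makes those queries less efficient.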
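
The reasoning above can be illustrated with a minimal Python sketch. The answer options are not reproduced here, so the layout `/{subject_area}/{YYYY}/{MM}/{DD}/` is an assumption based on the explanation, not the literal text of option D. The function names and the "sales" subject area are hypothetical.

```python
from datetime import date
from typing import Optional


def folder_path(subject_area: str, day: date) -> str:
    """Build a subject-area-first folder path (assumed layout, not the literal option D)."""
    return f"/{subject_area}/{day:%Y}/{day:%m}/{day:%d}/"


def query_prefix(subject_area: str, year: int, month: Optional[int] = None) -> str:
    """Return the folder prefix that a current-year or current-month query would read."""
    prefix = f"/{subject_area}/{year:04d}/"
    if month is not None:
        prefix += f"{month:02d}/"
    return prefix


def security_root(subject_area: str) -> str:
    """Return the single top-level folder on which access for a subject area would be set."""
    return f"/{subject_area}/"


if __name__ == "__main__":
    day = date(2024, 3, 7)
    print(folder_path("sales", day))
    print(query_prefix("sales", 2024))
    print(query_prefix("sales", 2024, 3))
    print(security_root("sales"))
```

In this sketch, every file for a subject area sits under one root folder, so security is applied once per subject area, and a current-month query needs only the prefix returned by `query_prefix`. If the date folders came first, the same subject area would appear under every date folder and access would have to be managed in each of them.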