Designing an Azure Data Solution (legacy) — Question 19
You are designing a solution that will copy Parquet files stored in an Azure Blob storage account to an Azure Data Lake Storage Gen2 account.
The data will be loaded daily to the data lake and will use a folder structure of {Year}/{Month}/{Day}/.
You need to design a daily Azure Data Factory data load to minimize the data transfer between the two accounts.
Which two configurations should you include in the design? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
Answer options
- A. Delete the files in the destination before loading new data.
- B. Filter by the last modified date of the source files.
- C. Delete the source files after they are copied.
- D. Specify a file naming pattern for the destination.
Correct answer: B, C
Explanation
The correct answers are B and C. Filtering by the last modified date of the source files means each daily run copies only the files that are new or changed since the previous run, rather than re-copying everything in the source. Deleting the source files after they are copied keeps already-transferred files out of later runs, so they are never listed or copied again. Option A removes data that has already been loaded, so it would have to be transferred again if it is still needed. Option D only controls how destination files are named and does not change how much data is moved.
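The effect of B and C together can be illustrated with a minimal Python sketch. It is a simulation only and does not call any Azure Data Factory API; the names `SourceFile`, `select_for_daily_load`, `destination_folder`, and `run_daily_load` are hypothetical, and the one-day window starting at the run date is an assumption about how the daily trigger would be scheduled. In an actual pipeline, the equivalent filter and cleanup would be configured on the Copy activity source rather than written by hand.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SourceFile:
    """Hypothetical stand-in for a blob in the source account."""
    name: str
    size_bytes: int
    last_modified: datetime


def select_for_daily_load(files, window_start, window_end):
    """Option B: keep only files modified in [window_start, window_end)."""
    return [f for f in files if window_start <= f.last_modified < window_end]


def destination_folder(run_date):
    """Build the {Year}/{Month}/{Day}/ folder path for one run."""
    return f"{run_date:%Y}/{run_date:%m}/{run_date:%d}/"


def run_daily_load(source_files, run_date):
    """Simulate one daily run.

    Returns the destination folder, the number of bytes that would be
    transferred, and the files left in the source afterwards.
    """
    # Assumption: each run covers exactly one day, starting at run_date.
    window_start = run_date
    window_end = run_date + timedelta(days=1)

    to_copy = select_for_daily_load(source_files, window_start, window_end)
    copied_bytes = sum(f.size_bytes for f in to_copy)

    # Option C: copied files are removed from the source, so later runs
    # never list or transfer them again.
    remaining = [f for f in source_files if f not in to_copy]
    return destination_folder(run_date), copied_bytes, remaining
```

Calling `run_daily_load` with a list of files and a run date returns only the bytes for files modified on that day, and the returned `remaining` list no longer contains them, which is why the two options together keep the transfer volume to the daily increment.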