Databricks Certified Data Engineer Professional — Question 1

The data engineering team has configured a job to process customer requests to be forgotten (have their data deleted). All user data that needs to be deleted is stored in Delta Lake tables using default table settings.
The team has decided to process all deletions from the previous week as a batch job at 1am each Sunday. The total duration of this job is less than one hour. Every Monday at 3am, a batch job executes a series of VACUUM commands on all Delta Lake tables throughout the organization.
The compliance officer has recently learned about Delta Lake's time travel functionality. They are concerned that this might allow continued access to deleted data.
Assuming all delete logic is correctly implemented, which statement correctly addresses this concern?

Answer options

Correct answer: E

Explanation

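The correct answer is E. With default table settings, Delta Lake keeps data files that have been removed from a table for 7 days (the default value of delta.deletedFileRetentionDuration), and VACUUM only removes files older than that threshold. The delete job finishes by about 2am on Sunday, so when VACUUM runs at 3am on Monday the affected files are only about a day old and are retained. They are not removed until the following Monday's VACUUM, roughly 8 days after the delete, and until then the deleted records remain reachable through time travel. Options A and B are incorrect because they rely on a 24-hour window or threshold, which is not the default. Options C and D are incorrect because they misrepresent how time travel and delete operations work in Delta Lake: a delete marks the old data files as removed in the transaction log without physically deleting them, and time travel can only read earlier versions while those files still exist.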
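
The timing argument can be checked with a little date arithmetic. The following is a minimal sketch in plain Python, not Delta Lake code: the function name first_effective_vacuum and the calendar dates are hypothetical, chosen only so that the delete falls on a Sunday and the VACUUM on a Monday. It assumes the 7-day default retention and a weekly VACUUM schedule, as described in the question.

```python
from datetime import datetime, timedelta

# Assumption: default Delta Lake retention for removed data files (7 days).
RETENTION = timedelta(days=7)


def first_effective_vacuum(delete_time, first_vacuum,
                           interval=timedelta(days=7),
                           retention=RETENTION):
    """Return the first scheduled VACUUM run at which files removed at
    delete_time are at least as old as the retention threshold."""
    run = first_vacuum
    while run - delete_time < retention:
        run += interval  # too recent: files survive this run, try the next one
    return run


# Hypothetical dates: 2024-01-07 is a Sunday, 2024-01-08 is a Monday.
delete_done = datetime(2024, 1, 7, 2, 0)    # job starts at 1am, done within an hour
monday_vacuum = datetime(2024, 1, 8, 3, 0)  # weekly VACUUM at 3am

purge = first_effective_vacuum(delete_done, monday_vacuum)
print(purge)                # 2024-01-15 03:00:00
print(purge - delete_done)  # 8 days, 1:00:00
```

Under these assumptions the first Monday VACUUM leaves the files in place, and only the run about 8 days after the delete removes them, which matches the reasoning behind option E.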