Databricks Certified Data Engineer Professional — Question 4

A Structured Streaming job deployed to production has been experiencing delays during peak hours of the day. At present, during normal execution, each microbatch of data is processed in less than 3 seconds. During peak hours of the day, execution time for each microbatch becomes very inconsistent, sometimes exceeding 30 seconds. The streaming write is currently configured with a trigger interval of 10 seconds.
Holding all other variables constant and assuming records need to be processed in less than 10 seconds, which adjustment will meet the requirement?

Answer options

Correct answer: E

Explanation

The correct answer is E. Decreasing the trigger interval to 5 seconds starts microbatches more frequently, so each batch picks up fewer records and a backlog is less likely to build. Options A and E are similar, but E gives the rationale that matters here: preventing records from accumulating. Option B is incorrect because increasing the trigger interval lets more records accumulate in each microbatch, which would worsen the delays. Option C is incorrect because the trigger interval can be changed without changing the checkpoint directory. Option D does not effectively address the processing delays.
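
The reasoning behind E can be illustrated with a toy model. The sketch below is not Spark code and is not taken from the exam material. The function name `simulate`, the arrival rate, the per-record cost, and the "spill" threshold and penalty are all hypothetical numbers chosen for illustration. It assumes a batch starts at the next trigger tick, or immediately after the previous batch if that batch overran, and that a batch larger than the threshold becomes several times slower.

```python
def simulate(trigger_ms, rate_per_s, cost_ms, spill_at, spill_factor, n_batches):
    """Toy micro-batch model (hypothetical, not Spark): returns batch durations in ms."""
    start_ms = trigger_ms          # first batch fires after one trigger interval
    consumed_until_ms = 0          # records that arrived before this time are already processed
    durations_ms = []
    for _ in range(n_batches):
        # Every record that arrived since the previous batch started lands in this batch.
        records = rate_per_s * (start_ms - consumed_until_ms) // 1000
        consumed_until_ms = start_ms
        duration_ms = records * cost_ms
        if records > spill_at:
            # Assumed penalty: an oversized batch spills and runs much slower.
            duration_ms *= spill_factor
        durations_ms.append(duration_ms)
        # Next batch: at the next tick, or right after this batch if it overran.
        start_ms = max(start_ms + trigger_ms, start_ms + duration_ms)
    return durations_ms


if __name__ == "__main__":
    # Assumed peak load: 200 records/s, 3 ms per record, spill above 1500 records (5x slower).
    print(simulate(10_000, 200, 3, 1500, 5, 3))  # [30000, 90000, 270000]
    print(simulate(5_000, 200, 3, 1500, 5, 3))   # [3000, 3000, 3000]
```

Under these assumed numbers, the 10-second trigger collects enough records to cross the spill threshold, and each overrun makes the next batch larger still, while the 5-second trigger keeps every batch below the threshold. Real workloads will behave differently. In PySpark the interval itself is set on the streaming writer, for example with `.trigger(processingTime="5 seconds")`.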