Databricks Certified Associate Developer for Apache Spark — Question 198
A data engineer is building a Structured Streaming pipeline that streams data from a Kafka topic.
While the priority is to run the pipeline with minimal latency, the engineer also wants to maintain an exactly-once processing guarantee.
Which trigger mode should the engineer use for their writeStream?
Answer options
- A. .trigger(processingTime='1 second')
- B. .trigger(continuous=True)
- C. .trigger(continuous='1 second')
- D. .trigger(availableNow=True)
Correct answer: A
Explanation
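The correct choice is A. A processingTime trigger of '1 second' runs the query in micro-batch mode, which keeps latency low while still supporting exactly-once guarantees, provided the source is replayable (as Kafka is), checkpointing is enabled, and the sink is idempotent. Options B and C select continuous processing, which can reach lower latency but offers only at-least-once guarantees. Option B is also not valid syntax, because continuous expects a checkpoint interval string such as '1 second' rather than a boolean. Option D, availableNow, processes the data available when the query starts and then stops, so it does not meet the low-latency requirement for a pipeline that must keep running.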
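As a rough illustration of option A, the sketch below wires a Kafka source to a file sink with a one-second processing-time trigger. The function name start_low_latency_query and all of its arguments (broker address, topic, output path, checkpoint path) are hypothetical placeholders, and the Parquet sink is only an assumed choice; the question does not specify a sink.

```python
def start_low_latency_query(spark, bootstrap_servers, topic, output_path, checkpoint_path):
    """Start a micro-batch streaming query that reads from Kafka.

    Hypothetical helper: every argument is a placeholder supplied by the caller.
    """
    # Kafka is a replayable source, which exactly-once processing relies on.
    events = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", bootstrap_servers)
        .option("subscribe", topic)
        .load()
    )

    # Kafka delivers key and value as binary, so cast them to strings.
    decoded = events.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
    )

    return (
        decoded.writeStream
        .format("parquet")  # assumed sink; the file sink supports exactly-once
        .option("path", output_path)
        .option("checkpointLocation", checkpoint_path)  # offsets are tracked here
        .trigger(processingTime="1 second")  # option A: micro-batch every second
        .start()
    )
```

Calling it would look like start_low_latency_query(spark, "broker:9092", "events", "/tmp/out", "/tmp/checkpoints"), where spark is an existing SparkSession with the Kafka connector available. Swapping the trigger for continuous='1 second' would switch the query to continuous processing and give up the exactly-once guarantee.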