SnowPro Advanced: Architect — Question 189
A data ingestion pipeline loads data to a table randomly throughout the day. A requirement states that the data must be processed through a transformation pipeline within 10 minutes from the time that it lands in Snowflake.
What is the MOST cost-effective way to meet this requirement?
Answer options
- A. Use a third-party orchestration tool to execute the transformation pipeline at 5 minute intervals.
- B. Create a series of views that contain the transformation logic, so that the user can query the views and have access to new data as soon as it arrives.
- C. Create a task that runs at 5 minute intervals and executes a merge statement that queries the table. Then process any rows that have arrived since the last time the pipeline was executed.
- D. Create a stream on the landing table. Then create a task to run a transformation pipeline at 5 minute intervals using the SYSTEM$STREAM_HAS_DATA function.
Correct answer: D
Explanation
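Option D is correct because the stream tracks only the rows that have changed in the landing table since it was last consumed, and the task's 5-minute schedule keeps processing comfortably within the 10-minute requirement. Checking SYSTEM$STREAM_HAS_DATA in the task's WHEN condition lets the task skip any run where the stream is empty, so the warehouse is used only when there is new data to transform. Option A adds the cost of a third-party tool and runs the pipeline every 5 minutes whether or not data has arrived. Option B does not process the data at all: views re-run the transformation logic on every query rather than running a pipeline. Option C runs the merge every 5 minutes even when nothing has arrived, and it must query the landing table to work out which rows are new instead of reading them from a stream.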
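The pattern in option D can be sketched in Snowflake SQL roughly as follows. This is a minimal illustration, not a reference implementation: the object names (`landing_table`, `landing_stream`, `transformed_table`, `transform_wh`, `transform_task`) and the columns are hypothetical, and the single INSERT stands in for whatever transformation logic the real pipeline runs.

```sql
-- Hypothetical names throughout; adjust to the actual schema.

-- The stream records changes made to the landing table since it was last consumed.
CREATE OR REPLACE STREAM landing_stream ON TABLE landing_table;

-- The task is scheduled every 5 minutes, but the WHEN condition
-- skips the run when the stream has no new data.
CREATE OR REPLACE TASK transform_task
  WAREHOUSE = transform_wh
  SCHEDULE = '5 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('LANDING_STREAM')
AS
  -- Placeholder transformation: reading the stream in a DML statement
  -- advances its offset, so each row is processed once.
  INSERT INTO transformed_table (id, payload, loaded_at)
  SELECT id, payload, loaded_at
  FROM landing_stream
  WHERE METADATA$ACTION = 'INSERT';

-- Tasks are created in a suspended state and must be resumed to start running.
ALTER TASK transform_task RESUME;
```

Because the data arrives at random times, many scheduled runs would find nothing to do. Gating on the stream is what keeps those runs from resuming the warehouse.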