A company is using Snowpipe to bring in millions of rows every day of Change Data Capture…

Question

A company is using Snowpipe to bring in millions of rows every day of Change Data Capture (CDC) into a Snowflake staging table on a real-time basis. The CDC needs to get processed and combined with other data in Snowflake and land in a final table as part of the full data pipeline.
How can a Data Engineer MOST efficiently process the incoming CDC on an ongoing basis?

Accepted Answer

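Correct answer: A. Create a stream on the staging table and schedule a task that transforms data from the stream, only when the stream has data. Option A is the most efficient because the stream tracks only the rows that changed since it was last consumed, so each task run processes just the new CDC records rather than the whole staging table, and the task condition skips the run (and the warehouse usage) whenever the stream is empty. Note that a scheduled task gives near-real-time rather than strictly real-time processing; latency is bounded by the task schedule. Options B and D would require processing the entire dataset each time, which is less efficient. Option C, while it processes delta data, may not capture changes in real time and could lead to delays in data availability.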
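As a rough illustration of option A, the pattern could look something like the sketch below. All object and column names (`cdc_staging`, `cdc_staging_stream`, `final_table`, `transform_wh`, `id`, `payload`, `op`, `changed_at`) are hypothetical and not part of the question, as are the 5-minute schedule and the assumption that the CDC feed marks deletes with `op = 'D'`. Treat it as a minimal sketch, not a definitive pipeline.

```sql
-- Hypothetical names throughout; adjust to your own schema.

-- Track new rows landing in the staging table. Snowpipe only inserts,
-- so an append-only stream is assumed to be sufficient here.
CREATE OR REPLACE STREAM cdc_staging_stream
  ON TABLE cdc_staging
  APPEND_ONLY = TRUE;

-- Run on a schedule, but only when the stream actually has data.
CREATE OR REPLACE TASK process_cdc_task
  WAREHOUSE = transform_wh
  SCHEDULE = '5 MINUTE'
WHEN
  SYSTEM$STREAM_HAS_DATA('CDC_STAGING_STREAM')
AS
  MERGE INTO final_table AS tgt
  USING (
    -- Keep only the latest change per key so the MERGE is deterministic.
    SELECT id, payload, op
    FROM cdc_staging_stream
    QUALIFY ROW_NUMBER() OVER (PARTITION BY id ORDER BY changed_at DESC) = 1
  ) AS src
  ON tgt.id = src.id
  WHEN MATCHED AND src.op = 'D' THEN DELETE
  WHEN MATCHED THEN UPDATE SET tgt.payload = src.payload
  WHEN NOT MATCHED AND src.op <> 'D' THEN
    INSERT (id, payload) VALUES (src.id, src.payload);

-- Tasks are created suspended; resume to start the schedule.
ALTER TASK process_cdc_task RESUME;
```

Because the `MERGE` reads from the stream inside a DML statement, the stream offset advances when the statement commits, so the next run sees only rows that arrived afterwards. Joining to other tables ("combined with other data") would go inside the `USING` subquery.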

SnowPro Advanced: Data Engineer — Question 49
