Databricks Certified Data Engineer Associate — Question 111

A data engineering team has two tables. The first table, march_transactions, is a collection of all retail transactions in the month of March. The second table, april_transactions, is a collection of all retail transactions in the month of April. There are no duplicate records between the tables.

Which of the following commands should be run to create a new table all_transactions that contains all records from march_transactions and april_transactions without duplicate records?

Answer options

Correct answer: B

Explanation

The correct answer is B because UNION appends the rows of one table to the rows of the other and removes duplicate rows from the combined result, which is the requirement. Options A and C use JOIN operations, which match rows across tables and combine their columns side by side rather than stacking the rows of both tables, so they do not produce a single table of all transactions. Option D uses INTERSECT, which returns only the records present in both tables; because the two tables share no records, it would return no rows at all.
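
The behavior described above can be checked with a small sketch. It runs the SQL through Python's built-in sqlite3 module as a stand-in for Databricks SQL, on the assumption that UNION, INTERSECT, and INNER JOIN behave the same way for this case. The columns (id, amount) and the sample rows are hypothetical and are not part of the original question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for a Databricks SQL session (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema and sample rows; the question does not specify any columns.
cur.execute("CREATE TABLE march_transactions (id INTEGER, amount REAL)")
cur.execute("CREATE TABLE april_transactions (id INTEGER, amount REAL)")
cur.executemany("INSERT INTO march_transactions VALUES (?, ?)", [(1, 10.0), (2, 25.5)])
cur.executemany("INSERT INTO april_transactions VALUES (?, ?)", [(3, 7.25), (4, 99.0)])

# UNION stacks the rows of both tables and drops duplicate rows.
cur.execute("""
    CREATE TABLE all_transactions AS
    SELECT * FROM march_transactions
    UNION
    SELECT * FROM april_transactions
""")
all_rows = cur.execute(
    "SELECT id, amount FROM all_transactions ORDER BY id"
).fetchall()

# INTERSECT keeps only rows present in both tables; the tables share none.
intersect_rows = cur.execute("""
    SELECT * FROM march_transactions
    INTERSECT
    SELECT * FROM april_transactions
""").fetchall()

# An inner join on id pairs up matching rows instead of stacking them;
# with no shared ids it also returns nothing.
join_rows = cur.execute("""
    SELECT * FROM march_transactions m
    INNER JOIN april_transactions a ON m.id = a.id
""").fetchall()

print(all_rows)        # all four transactions
print(intersect_rows)  # empty list
print(join_rows)       # empty list
```

With this sample data, only the UNION query yields every transaction from both months. Since the tables contain no overlapping records, UNION ALL would return the same rows here; UNION is the form that guarantees no duplicates regardless of the data.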