Databricks Certified Data Engineer Associate — Question 126
An organization has data stored across multiple external systems, including MySQL, Amazon Redshift, and Google BigQuery. The data engineer wants to run analytics on this data without ingesting it into Databricks, while ensuring unified governance and minimizing data duplication.
Which feature of Databricks enables querying these external data sources while maintaining centralized governance?
Answer options
- A. Delta Lake
- B. Lakehouse Federation
- C. MLflow
- D. Databricks Connect
Correct answer: B
Explanation
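Lakehouse Federation lets Databricks query external sources such as MySQL, Amazon Redshift, and Google BigQuery in place, without first copying the data into Databricks. Each source is registered in Unity Catalog as a connection and exposed as a foreign catalog, so access to it is governed centrally alongside other Unity Catalog data. The other options do not meet the requirement: Delta Lake is a storage layer for tables held in the lakehouse itself, MLflow manages the machine learning lifecycle, and Databricks Connect links local development environments (IDEs and applications) to Databricks compute. None of them queries external systems under centralized governance.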
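To make the explanation above concrete, here is a minimal Python sketch that builds the SQL statements typically used to set up and query a federated MySQL source. All names (`mysql_conn`, `mysql_cat`, the host, the secret scope and key, the `sales.orders` table) are hypothetical placeholders, and the exact connection options should be checked against the Databricks documentation for each source type. Redshift and BigQuery take different options and are not shown.

```python
def create_connection_sql(name, host, port, user, secret_scope, secret_key):
    """Build a CREATE CONNECTION statement for a MySQL source.

    The password is read from a Databricks secret rather than written inline.
    """
    return (
        f"CREATE CONNECTION IF NOT EXISTS {name} TYPE mysql\n"
        f"OPTIONS (\n"
        f"  host '{host}',\n"
        f"  port '{port}',\n"
        f"  user '{user}',\n"
        f"  password secret('{secret_scope}', '{secret_key}')\n"
        f")"
    )


def create_foreign_catalog_sql(catalog, connection):
    """Build a statement that exposes the connection as a foreign catalog."""
    return f"CREATE FOREIGN CATALOG IF NOT EXISTS {catalog} USING CONNECTION {connection}"


def federated_query_sql(catalog, schema, table):
    """Build a query using the three-level catalog.schema.table name."""
    return f"SELECT * FROM {catalog}.{schema}.{table} LIMIT 10"


# Hypothetical values, for illustration only.
statements = [
    create_connection_sql(
        "mysql_conn", "mysql.example.com", 3306, "analyst", "demo_scope", "mysql_password"
    ),
    create_foreign_catalog_sql("mysql_cat", "mysql_conn"),
    federated_query_sql("mysql_cat", "sales", "orders"),
]

# In a Databricks notebook a `spark` session is predefined; elsewhere, just print.
spark_session = globals().get("spark")
for statement in statements:
    if spark_session is not None:
        spark_session.sql(statement)
    else:
        print(statement + ";\n")
```

Once the foreign catalog exists, its tables are addressed like any other Unity Catalog object, so access can be granted and revoked with the same mechanism used for native tables.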