You migrated your on-premises Apache Hadoop Distributed File System (HDFS) data lake to C…

Question

You migrated your on-premises Apache Hadoop Distributed File System (HDFS) data lake to Cloud Storage. The data scientist team needs to process the data by using Apache Spark and SQL. Security policies need to be enforced at the column level. You need a cost-effective solution that can scale into a data mesh. What should you do?

Accepted Answer

Correct answer: B. B. 1. Define a BigLake table.
2. Create a taxonomy of policy tags in Data Catalog.
3. Add policy tags to columns.
4. Process with the Spark-BigQuery connector or BigQuery SQL. — The correct answer is B because it utilizes BigLake, which allows for unified access to data across storage systems while supporting column-level security through policy tags in Data Catalog. Options A and D do not provide the flexibility and scalability needed for a data mesh, and option C, while similar, involves loading data into BigQuery, which may not be the most cost-effective or scalable approach.

Google Cloud Professional Data Engineer — Question 214

Answer options

Correct answer: B

Explanation