AWS Certified Data Analytics – Specialty — Question 80
A manufacturing company uses Amazon S3 to store its data. The company wants to use AWS Lake Formation to provide granular-level security on those data assets. The data is in Apache Parquet format. The company has set a deadline for a consultant to build a data lake.
How should the consultant create the MOST cost-effective solution that meets these requirements?
Answer options
- A. Run Lake Formation blueprints to move the data to Lake Formation. Once Lake Formation has the data, apply permissions on Lake Formation.
- B. To create the data catalog, run an AWS Glue crawler on the existing Parquet data. Register the Amazon S3 path and then apply permissions through Lake Formation to provide granular-level security.
- C. Install Apache Ranger on an Amazon EC2 instance and integrate with Amazon EMR. Using Ranger policies, create role-based access control for the existing data assets in Amazon S3.
- D. Create multiple IAM roles for different users and groups. Assign IAM roles to different data assets in Amazon S3 to create table-based and column-based access controls.
Correct answer: B
Explanation
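B is correct because the data is already in Amazon S3 in Parquet format, so nothing needs to be moved or converted. An AWS Glue crawler catalogs the existing files in place, registering the S3 path brings that location under Lake Formation control, and Lake Formation permissions then provide database-, table-, and column-level access. This is the least work and the lowest cost, which also suits the deadline. Option A is less cost-effective because blueprints are meant for ingesting data from sources such as relational databases and logs; they create AWS Glue workflows and jobs that add cost and time, and Lake Formation does not need the data to be "moved" into it since it governs data that stays in S3. Option C requires provisioning and operating Apache Ranger on Amazon EC2 together with an Amazon EMR cluster, which adds infrastructure cost and complexity and does not use Lake Formation as the company wants. Option D does not meet the requirement: IAM policies control access to S3 buckets, prefixes, and objects, not to tables or columns, and maintaining many roles by hand scales poorly.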
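The three steps in option B can be sketched with boto3 as follows. This is a minimal illustration, not a production setup: the helper function names and every argument value in the usage note are hypothetical, and the sketch assumes the crawler's IAM role and the Lake Formation administrator settings already exist. The request builders are plain functions, so they can be inspected without AWS credentials.

```python
import time
from typing import Any, Dict, Iterable, List


def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> Dict[str, Any]:
    """Parameters for glue.create_crawler: catalog the Parquet files in place."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def register_request(s3_arn: str) -> Dict[str, Any]:
    """Parameters for lakeformation.register_resource: put the S3 path under Lake Formation."""
    return {"ResourceArn": s3_arn, "UseServiceLinkedRole": True}


def column_grant_request(principal_arn: str, database: str, table: str,
                         columns: Iterable[str]) -> Dict[str, Any]:
    """Parameters for lakeformation.grant_permissions: SELECT on specific columns only."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def build_data_lake(glue: Any, lakeformation: Any, crawler: Dict[str, Any],
                    registration: Dict[str, Any], grants: List[Dict[str, Any]],
                    poll_seconds: float = 15.0) -> None:
    """Run the crawler, register the S3 location, then apply the grants."""
    glue.create_crawler(**crawler)
    glue.start_crawler(Name=crawler["Name"])
    # Grants refer to catalog tables, so wait until the crawler is idle again.
    while glue.get_crawler(Name=crawler["Name"])["Crawler"]["State"] != "READY":
        time.sleep(poll_seconds)
    lakeformation.register_resource(**registration)
    for grant in grants:
        lakeformation.grant_permissions(**grant)
```

To run it against an account, one would create the clients with `boto3.client("glue")` and `boto3.client("lakeformation")` and pass them to `build_data_lake` along with requests built from real values, for example an S3 path such as `s3://example-bucket/parquet/` and an analyst role ARN (both placeholders here). The data itself never leaves S3; only catalog metadata and permissions are created.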