AWS Certified Data Analytics – Specialty — Question 58
A marketing company is using Amazon EMR clusters for its workloads. The company manually installs third-party libraries on the clusters by logging in to the master nodes. A data analyst needs to create an automated solution to replace the manual process.
Which options can fulfill these requirements? (Choose two.)
Answer options
- A. Place the required installation scripts in Amazon S3 and execute them using custom bootstrap actions.
- B. Place the required installation scripts in Amazon S3 and execute them through Apache Spark in Amazon EMR.
- C. Install the required third-party libraries in the existing EMR master node. Create an AMI out of that master node and use that custom AMI to re-create the EMR cluster.
- D. Use an Amazon DynamoDB table to store the list of required applications. Trigger an AWS Lambda function with DynamoDB Streams to install the software.
- E. Launch an Amazon EC2 instance with Amazon Linux and install the required third-party libraries on the instance. Create an AMI and use that AMI to create the EMR cluster.
Correct answer: A, E
Explanation
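Option A is correct because bootstrap actions run custom scripts, which can be stored in Amazon S3, on every node while the cluster is being provisioned and before EMR installs its applications, so the libraries are installed automatically each time a cluster launches. Option E is also correct because EMR can launch clusters from a custom AMI based on Amazon Linux; installing the libraries on an EC2 instance and creating an AMI from it means every node starts with them preinstalled. Option B is incorrect because Apache Spark is a data processing engine that runs once the cluster is up; it is not a mechanism for installing software on the nodes. Option C is incorrect because AWS advises against building a custom AMI from an instance of an existing EMR cluster; the AMI should come from a clean EC2 instance, as in option E. Option D is incorrect because DynamoDB Streams and Lambda provide no built-in way to install software on EMR nodes, and the design adds complexity without replacing the manual step.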
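To make the two correct options concrete, here is a minimal sketch of how they might look with boto3's `run_job_flow`. It is an illustration under assumptions, not part of the original question: the cluster name, release label, instance types, S3 path, and AMI ID are hypothetical placeholders, and the helper names `build_cluster_request` and `launch_cluster` are invented for this example. The default role names assume the standard EMR roles already exist in the account.

```python
def build_cluster_request(script_s3_path, custom_ami_id=None):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    script_s3_path: S3 URI of the installation script (option A).
    custom_ami_id:  optional ID of a prebuilt Amazon Linux AMI (option E).
    """
    request = {
        "Name": "analytics-cluster",        # hypothetical cluster name
        "ReleaseLabel": "emr-6.15.0",       # assumed release; use your own
        "Instances": {
            "MasterInstanceType": "m5.xlarge",  # assumed instance types
            "SlaveInstanceType": "m5.xlarge",
            "InstanceCount": 3,
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Option A: EMR downloads this script from S3 and runs it on every
        # node during provisioning.
        "BootstrapActions": [
            {
                "Name": "install-third-party-libraries",
                "ScriptBootstrapAction": {"Path": script_s3_path, "Args": []},
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumes default roles exist
        "ServiceRole": "EMR_DefaultRole",
    }
    if custom_ami_id:
        # Option E: launch the nodes from an AMI with libraries preinstalled.
        request["CustomAmiId"] = custom_ami_id
    return request


def launch_cluster(request, region="us-east-1"):
    """Submit the request to EMR; needs boto3 and valid AWS credentials."""
    import boto3  # imported here so the builder works without boto3

    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**request)["JobFlowId"]


if __name__ == "__main__":
    # Hypothetical script location; only builds the request, no AWS call.
    req = build_cluster_request("s3://example-bucket/bootstrap/install_libs.sh")
    print(req["BootstrapActions"][0]["ScriptBootstrapAction"]["Path"])
```

Calling `launch_cluster(build_cluster_request(...))` would actually create a cluster and incur cost, so the sketch keeps request building separate from submission. The two approaches can also be combined: a custom AMI for large, stable dependencies and a bootstrap action for small, frequently changing ones.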