AWS Certified Solutions Architect – Professional — Question 718
A company is deploying a new cluster for big data analytics on AWS. The cluster will run across many Linux Amazon EC2 instances that are spread across multiple Availability Zones.
All of the nodes in the cluster must have read and write access to common underlying file storage. The file storage must be highly available, must be resilient, must be compatible with the Portable Operating System Interface (POSIX), and must accommodate high levels of throughput.
Which storage solution will meet these requirements?
Answer options
- A. Provision an AWS Storage Gateway file gateway NFS file share that is attached to an Amazon S3 bucket. Mount the NFS file share on each EC2 instance in the cluster.
- B. Provision a new Amazon Elastic File System (Amazon EFS) file system that uses General Purpose performance mode. Mount the EFS file system on each EC2 instance in the cluster.
- C. Provision a new Amazon Elastic Block Store (Amazon EBS) volume that uses the io2 volume type. Attach the EBS volume to all of the EC2 instances in the cluster.
- D. Provision a new Amazon Elastic File System (Amazon EFS) file system that uses Max I/O performance mode. Mount the EFS file system on each EC2 instance in the cluster.
Correct answer: D
Explanation
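Amazon EFS is a highly available, POSIX-compliant shared file system, accessed over NFS, that allows concurrent read and write access from many Linux EC2 instances across multiple Availability Zones. For a big data analytics workload that runs highly parallelized operations across many nodes, Max I/O performance mode (option D) fits better than General Purpose mode (option B): it scales to higher aggregate throughput and operations per second, at the cost of slightly higher latency per file operation. Option C does not work because an EBS volume lives in a single Availability Zone. io2 Multi-Attach is limited to instances in that same Availability Zone, and EBS is block storage that does not by itself provide a shared POSIX file system. Option A does not fit because a Storage Gateway file gateway is an NFS front end to objects in Amazon S3 served through a gateway appliance with a local cache. It is not designed as a high-throughput shared file system for a cluster, and the gateway itself becomes a bottleneck that all nodes depend on.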
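As a rough illustration of option D, the sketch below shows how such a file system might be provisioned with boto3. It is a minimal sketch under stated assumptions, not part of the original question: the function names, the bursting throughput mode, encryption at rest, and the one-subnet-per-Availability-Zone layout are all hypothetical choices made for the example.

```python
import time
from typing import Any, Dict, List


def build_file_system_request(creation_token: str, name: str) -> Dict[str, Any]:
    """Build parameters for the EFS CreateFileSystem call (option D)."""
    return {
        "CreationToken": creation_token,   # idempotency token chosen by the caller
        "PerformanceMode": "maxIO",        # option B would use "generalPurpose"
        "ThroughputMode": "bursting",      # assumption: not specified in the question
        "Encrypted": True,                 # assumption: encryption at rest is wanted
        "Tags": [{"Key": "Name", "Value": name}],
    }


def build_mount_target_requests(
    file_system_id: str, subnet_ids: List[str], security_group_id: str
) -> List[Dict[str, Any]]:
    """Build one CreateMountTarget request per subnet (one subnet per AZ assumed)."""
    return [
        {
            "FileSystemId": file_system_id,
            "SubnetId": subnet_id,
            "SecurityGroups": [security_group_id],
        }
        for subnet_id in subnet_ids
    ]


def provision(
    region: str,
    creation_token: str,
    name: str,
    subnet_ids: List[str],
    security_group_id: str,
) -> str:
    """Create the file system and its mount targets; returns the file system ID.

    Requires boto3 and AWS credentials. boto3 is imported here so the
    request builders above can be used without it installed.
    """
    import boto3

    efs = boto3.client("efs", region_name=region)
    fs_id = efs.create_file_system(
        **build_file_system_request(creation_token, name)
    )["FileSystemId"]

    # Mount targets can only be created once the file system is available.
    while True:
        state = efs.describe_file_systems(FileSystemId=fs_id)["FileSystems"][0][
            "LifeCycleState"
        ]
        if state == "available":
            break
        time.sleep(2)

    for request in build_mount_target_requests(fs_id, subnet_ids, security_group_id):
        efs.create_mount_target(**request)
    return fs_id
```

Each EC2 instance in the cluster would then mount the file system through the mount target in its own Availability Zone, for example with the mount helper from the amazon-efs-utils package. The performance mode is set when the file system is created and cannot be changed afterward, which is why the choice between options B and D has to be made up front.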