AWS Certified Solutions Architect – Professional (SAP-C02) — Question 394
A company is deploying a new cluster for big data analytics on AWS. The cluster will run across many Linux Amazon EC2 instances that are spread across multiple Availability Zones.
All of the nodes in the cluster must have read and write access to common underlying file storage. The file storage must be highly available, must be resilient, must be compatible with the Portable Operating System Interface (POSIX), and must accommodate high levels of throughput.
Which storage solution will meet these requirements?
Answer options
- A. Provision an AWS Storage Gateway file gateway NFS file share that is attached to an Amazon S3 bucket. Mount the NFS file share on each EC2 instance in the cluster.
- B. Provision a new Amazon Elastic File System (Amazon EFS) file system that uses General Purpose performance mode. Mount the EFS file system on each EC2 instance in the cluster.
- C. Provision a new Amazon Elastic Block Store (Amazon EBS) volume that uses the io2 volume type. Attach the EBS volume to all of the EC2 instances in the cluster.
- D. Provision a new Amazon Elastic File System (Amazon EFS) file system that uses Max I/O performance mode. Mount the EFS file system on each EC2 instance in the cluster.
Correct answer: D
Explanation
Amazon EFS is a POSIX-compliant, highly available, and resilient shared file system that EC2 instances in multiple Availability Zones can mount concurrently. Max I/O performance mode scales to higher aggregate throughput and operations per second than General Purpose mode, at the cost of somewhat higher per-operation latency, so it suits a highly parallelized big data analytics cluster. That is why option D is preferred over option B. Option C fails because an Amazon EBS io2 volume with Multi-Attach can be attached only to instances in the same Availability Zone as the volume. Option A fails because an AWS Storage Gateway file gateway is not designed to serve as high-throughput shared storage for a big data analytics cluster.
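As an illustration of option D, here is a minimal Python sketch of how such a file system might be provisioned with boto3. It is not part of the original question. The helper names (`efs_create_params`, `provision_shared_storage`), the file system name, the bursting throughput mode, and the subnet and security group arguments are all assumptions made for the example.

```python
import time
import uuid


def efs_create_params(name: str) -> dict:
    """Build keyword arguments for the EFS CreateFileSystem call (hypothetical helper)."""
    return {
        # Idempotency token required by the API; the random suffix is an assumption.
        "CreationToken": f"{name}-{uuid.uuid4()}",
        # Option D: Max I/O. Option B would pass "generalPurpose" instead.
        "PerformanceMode": "maxIO",
        # Assumption: bursting throughput; choose the mode that fits the workload.
        "ThroughputMode": "bursting",
        "Tags": [{"Key": "Name", "Value": name}],
    }


def provision_shared_storage(name, subnet_ids, security_group_id, region):
    """Create the file system and one mount target per subnet (one subnet per AZ)."""
    import boto3  # imported here so efs_create_params works without the AWS SDK

    efs = boto3.client("efs", region_name=region)
    fs_id = efs.create_file_system(**efs_create_params(name))["FileSystemId"]

    # Mount targets can only be created once the file system is available.
    while True:
        state = efs.describe_file_systems(FileSystemId=fs_id)["FileSystems"][0]["LifeCycleState"]
        if state == "available":
            break
        time.sleep(5)

    # Each subnet is assumed to be in a different Availability Zone.
    for subnet_id in subnet_ids:
        efs.create_mount_target(
            FileSystemId=fs_id,
            SubnetId=subnet_id,
            SecurityGroups=[security_group_id],
        )
    return fs_id
```

Each EC2 instance would then mount the file system through the mount target in its own Availability Zone, so every node reads and writes the same POSIX file tree.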