AWS Certified Big Data – Specialty — Question 20

Company A operates in Country X. Company A maintains a large dataset of historical purchase orders that contains personal data of its customers in the form of full names and telephone numbers. The dataset consists of 5 text files, 1 TB each. Currently the dataset resides on-premises due to legal requirements to store personal data in-country. The research and development department needs to run a clustering algorithm on the dataset and wants to use the Elastic MapReduce (EMR) service in the closest AWS region. Due to geographic distance, the minimum latency between the on-premises system and the closest AWS region is 200 ms.
Which option allows Company A to do clustering in the AWS Cloud and meet the legal requirement of maintaining personal data in-country?

Answer options

Correct answer: B

Explanation

Option B is correct because an AWS Direct Connect link provides a dedicated, private connection between the on-premises system and AWS, so the EMR cluster can read the data while the dataset itself stays on-premises and in-country. Options A and C involve transferring or modifying personal data, which may not comply with local data residency laws. Option D provides a secure transfer method, but transferring the dataset into AWS would likewise move the personal data out of the country.
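
As a rough feasibility check on reading the dataset over a dedicated link, the sketch below estimates how long one full pass over the 5 TB dataset would take. The link speeds (1 Gbps and 10 Gbps) and the `efficiency` parameter are assumptions for illustration and do not come from the question. The estimate also ignores the 200 ms latency, which affects per-request round trips rather than raw bandwidth.

```python
# Back-of-envelope estimate only; link speeds below are assumed, not given in the question.
DATASET_BYTES = 5 * 10**12  # 5 files x 1 TB each (decimal terabytes)


def transfer_hours(total_bytes: int, link_gbps: float, efficiency: float = 1.0) -> float:
    """Return the hours needed to move total_bytes over a link of link_gbps gigabits/s.

    efficiency is a hypothetical factor (0..1] for protocol overhead; 1.0 means ideal.
    """
    bits = total_bytes * 8
    seconds = bits / (link_gbps * 10**9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for gbps in (1, 10):  # assumed link speeds
        print(f"{gbps} Gbps: {transfer_hours(DATASET_BYTES, gbps):.1f} h")
```

Under these ideal assumptions, a single pass takes roughly 11 hours at 1 Gbps and roughly 1 hour at 10 Gbps. Real throughput would be lower once latency and protocol overhead are taken into account.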