AWS Certified Data Analytics – Specialty — Question 107
A hospital is building a research data lake to ingest data from electronic health records (EHR) systems from multiple hospitals and clinics. The EHR systems are independent of each other and do not have a common patient identifier. The data engineering team is not experienced in machine learning (ML) and has been asked to generate a unique patient identifier for the ingested records.
Which solution will accomplish this task?
Answer options
- A. An AWS Glue ETL job with the FindMatches transform
- B. Amazon Kendra
- C. Amazon SageMaker Ground Truth
- D. An AWS Glue ETL job with the ResolveChoice transform
Correct answer: A
Explanation
The correct answer is A, as the FindMatches transform in AWS Glue ETL jobs is specifically designed to identify and create unique identifiers for records based on matching criteria. Options B and C do not focus on generating unique identifiers from EHR data, while option D is intended for resolving conflicting data rather than identifying unique records.