AWS Certified Data Engineer – Associate (DEA-C01) — Question 201
A company has an on-premises PostgreSQL database that contains customer data. The company wants to migrate the customer data to an Amazon Redshift data warehouse. The company has established a VPN connection between the on-premises database and AWS.
The on-premises database is continuously updated. The company must ensure that the data in Amazon Redshift is updated as quickly as possible.
Which solution will meet these requirements?
Answer options
- A. Use the pg_dump utility to generate a backup of the PostgreSQL database. Use the AWS Schema Conversion Tool (AWS SCT) to upload the backup to Amazon Redshift. Set up a cron job to perform a backup. Upload the backup to Amazon Redshift every night.
- B. Create an AWS Database Migration Service (AWS DMS) full-load task. Set Amazon Redshift as the target. Configure the task to use the change data capture (CDC) feature.
- C. Use the pg_dump utility to generate a backup of the PostgreSQL database. Upload the backup to an Amazon S3 bucket. Use the COPY command to import the data into Amazon Redshift.
- D. Create an AWS Database Migration Service (AWS DMS) full-load task. Set Amazon Redshift as the target. Configure the task to perform a full load of the database to Amazon Redshift every night.
Correct answer: B
Explanation
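Option B is correct because AWS DMS with change data capture (CDC) performs an initial full load and then continuously replicates ongoing changes from the on-premises PostgreSQL database to Amazon Redshift, keeping the warehouse current in near real time. Option A refreshes Amazon Redshift only once a night, so the data can be up to a day stale; in addition, AWS SCT is designed for schema conversion rather than for loading pg_dump backups. Option C is a one-time export and COPY load with no mechanism for picking up later changes. Option D reloads the entire database every night, which does not capture changes as they occur and also leaves the data up to a day old.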
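As a rough illustration of option B, the sketch below builds the request parameters for a DMS replication task that combines a full load with ongoing CDC. The parameter names follow the boto3 DMS `create_replication_task` call, and `full-load-and-cdc` is the DMS migration type for this pattern. The helper name `build_dms_task_params`, the task identifier, the `public` schema, and the placeholder ARNs are assumptions made for this example and do not come from the question. The source endpoint, target endpoint, and replication instance are assumed to exist already.

```python
import json


def build_dms_task_params(source_arn, target_arn, instance_arn, schema="public"):
    """Build keyword arguments for a DMS full-load-plus-CDC replication task.

    Hypothetical helper for illustration; the ARNs must point at existing
    DMS endpoints and a replication instance.
    """
    # Selection rule: include every table in the chosen schema.
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-all-tables",
                "object-locator": {"schema-name": schema, "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": "postgres-to-redshift-cdc",  # assumed name
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Full load first, then continuous replication of ongoing changes.
        "MigrationType": "full-load-and-cdc",
        # DMS expects the table mappings as a JSON string.
        "TableMappings": json.dumps(table_mappings),
    }


# Placeholder ARNs (assumptions); replace with real values before use.
params = build_dms_task_params(
    source_arn="arn:aws:dms:us-east-1:111122223333:endpoint:SOURCE",
    target_arn="arn:aws:dms:us-east-1:111122223333:endpoint:TARGET",
    instance_arn="arn:aws:dms:us-east-1:111122223333:rep:INSTANCE",
)
print(params["MigrationType"])

# With boto3 installed and AWS credentials configured, the task could be
# created with:
#   import boto3
#   boto3.client("dms").create_replication_task(**params)
```

By contrast, option D corresponds to a `MigrationType` of `full-load` run on a nightly schedule, which copies the tables again each night and replicates nothing in between.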