AWS Certified Machine Learning – Specialty — Question 123

A company ingests machine learning (ML) data from web advertising clicks into an Amazon S3 data lake. Click data is added to an Amazon Kinesis data stream by using the Kinesis Producer Library (KPL). The data is loaded from the data stream into the S3 data lake by an Amazon Kinesis Data Firehose delivery stream. As the data volume increases, an ML specialist notices that the rate of data ingested into Amazon S3 stays relatively constant. There is also a growing backlog of data waiting for Kinesis Data Streams and Kinesis Data Firehose to ingest.
Which next step is MOST likely to improve the data ingestion rate into Amazon S3?

Answer options

Correct answer: C

Explanation

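A constant ingestion rate combined with a growing backlog indicates that the data stream has reached its throughput ceiling. In Kinesis Data Streams, capacity is provisioned per shard: each shard accepts writes of up to 1 MB per second or 1,000 records per second and supports reads of up to 2 MB per second. Because the Firehose delivery stream reads from the data stream, the rate at which data reaches Amazon S3 is bounded by the total shard capacity. Increasing the number of shards (option C) raises that ceiling and adds parallelism, so it is the step most likely to improve the ingestion rate. The other options do not address the capacity limit: decreasing the retention period (option B) changes only how long records are kept, not how fast they are ingested, and adding more consumers (option D) does not help when the stream itself is the bottleneck.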
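
The shard arithmetic behind option C can be sketched as follows. This is a minimal illustration, not part of the original question: the function names `required_shards` and `resize_stream`, the `headroom` parameter, and the sample traffic figures are all assumptions, and the per-shard write limits are the commonly documented quotas for provisioned streams. The resize helper uses the boto3 `update_shard_count` call and is shown for context only; it needs AWS credentials and is subject to service limits on how far a stream can be resized at once.

```python
import math

# Commonly documented per-shard write quotas for Kinesis Data Streams
# (provisioned mode). Treat these as approximate planning figures.
SHARD_WRITE_BYTES_PER_SEC = 1_000_000   # about 1 MB/s per shard
SHARD_WRITE_RECORDS_PER_SEC = 1_000     # 1,000 records/s per shard


def required_shards(bytes_per_sec, records_per_sec, headroom=0.25):
    """Estimate the shard count needed for a given write load.

    `headroom` is a hypothetical safety margin (0.25 = 25% spare capacity).
    The stricter of the byte limit and the record limit decides the result.
    """
    if bytes_per_sec < 0 or records_per_sec < 0 or headroom < 0:
        raise ValueError("rates and headroom must be non-negative")
    by_bytes = bytes_per_sec * (1 + headroom) / SHARD_WRITE_BYTES_PER_SEC
    by_records = records_per_sec * (1 + headroom) / SHARD_WRITE_RECORDS_PER_SEC
    return max(1, math.ceil(max(by_bytes, by_records)))


def resize_stream(stream_name, target_shards):
    """Ask Kinesis to resize the stream (not called in this sketch)."""
    import boto3  # imported lazily so the arithmetic above runs without it

    client = boto3.client("kinesis")
    return client.update_shard_count(
        StreamName=stream_name,
        TargetShardCount=target_shards,
        ScalingType="UNIFORM_SCALING",
    )


if __name__ == "__main__":
    # Hypothetical load: 4 MB/s and 2,500 records/s of click data.
    print(required_shards(4_000_000, 2_500))
```

With KPL aggregation enabled, several user records are packed into one Kinesis record, so the byte limit rather than the record limit is usually the one that determines the shard count.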