AWS Certified Data Analytics – Specialty — Question 21

A company is streaming its high-volume billing data (100 MBps) to Amazon Kinesis Data Streams. A data analyst partitioned the data on account_id to ensure that all records belonging to an account go to the same Kinesis shard and order is maintained. While building a custom consumer using the Kinesis Java SDK, the data analyst notices that, sometimes, the messages arrive out of order for account_id. Upon further investigation, the data analyst discovers the messages that are out of order seem to be arriving from different shards for the same account_id and are seen when a stream resize runs.
What is an explanation for this behavior and what is the solution?

Answer options

Correct answer: D

Explanation

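The correct answer is D. A stream resize (resharding) splits or merges shards: each parent shard is closed to new writes, and its child shards take over its hash key range. Records for an account_id that were written before the resize stay in the parent shard, while records written afterward land in a child shard. If the consumer reads a child shard before it has finished the parent shard, it sees newer records ahead of older ones. The solution is to process each parent shard completely before processing its child shards.

Option A is incorrect because a single shard cannot absorb this volume: one shard accepts at most 1 MB/s of writes, so a 100 MBps stream needs on the order of 100 shards. Option B is incorrect because a faulty hash key would misroute records all the time, not only while a resize runs. Option C is incorrect because it misidentifies the problem: switching to PutRecords does not change the order in which the consumer reads parent and child shards.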
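
The parent-before-child rule can be illustrated with a small, SDK-free sketch. It assumes each shard description exposes a shard ID, a parent shard ID, and (after a merge) an adjacent parent shard ID, as the Kinesis shard listing does. The class and method names here (ShardOrder, ShardInfo, processingOrder) are hypothetical and only show the ordering logic, not a full consumer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ShardOrder {

    /** Hypothetical minimal view of a shard: its ID and up to two parents (null when absent). */
    public static final class ShardInfo {
        final String shardId;
        final String parentShardId;         // set on children of a split or merge
        final String adjacentParentShardId; // set only on the child of a merge

        public ShardInfo(String shardId, String parentShardId, String adjacentParentShardId) {
            this.shardId = shardId;
            this.parentShardId = parentShardId;
            this.adjacentParentShardId = adjacentParentShardId;
        }
    }

    /** Returns shard IDs ordered so that every parent appears before its children. */
    public static List<String> processingOrder(List<ShardInfo> shards) {
        Map<String, ShardInfo> byId = new HashMap<>();
        for (ShardInfo s : shards) {
            byId.put(s.shardId, s);
        }
        List<String> order = new ArrayList<>();
        Set<String> visited = new HashSet<>();
        for (ShardInfo s : shards) {
            visit(s, byId, visited, order);
        }
        return order;
    }

    // Depth-first walk: emit both parents (if still listed) before the shard itself.
    private static void visit(ShardInfo s, Map<String, ShardInfo> byId,
                              Set<String> visited, List<String> order) {
        if (s == null || !visited.add(s.shardId)) {
            return; // parent no longer listed, or shard already handled
        }
        visit(byId.get(s.parentShardId), byId, visited, order);
        visit(byId.get(s.adjacentParentShardId), byId, visited, order);
        order.add(s.shardId);
    }

    public static void main(String[] args) {
        // Shard 0 was split into shards 1 and 2; the listing arrives children-first.
        List<ShardInfo> shards = Arrays.asList(
            new ShardInfo("shardId-000000000002", "shardId-000000000000", null),
            new ShardInfo("shardId-000000000001", "shardId-000000000000", null),
            new ShardInfo("shardId-000000000000", null, null));
        // The parent (shard 0) is printed ahead of both children.
        System.out.println(processingOrder(shards));
    }
}
```

In a real custom consumer, a parent shard counts as fully processed once GetRecords returns a null NextShardIterator for it; only then should reading of its children begin. The Kinesis Client Library applies this parent-before-child ordering automatically, which is one reason to prefer it over a hand-written consumer.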