Google Cloud Professional Data Engineer — Question 335

You are designing a cloud-native historical data processing system to meet the following conditions:
✑ The data being analyzed is in CSV, Avro, and PDF formats and will be accessed by multiple analysis tools including Dataproc, BigQuery, and Compute Engine.
✑ A batch pipeline moves daily data.
✑ Performance is not a factor in the solution.
✑ The solution design should maximize availability.
How should you design data storage for this solution?

Answer options

Correct answer: D

Explanation

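The correct answer is D. A multi-regional Cloud Storage bucket stores data redundantly across geographically separated locations, so the data stays available even if a single region has an outage, which matches the requirement to maximize availability. Cloud Storage also holds all three formats (CSV, Avro, and PDF) as objects, and Dataproc, BigQuery, and Compute Engine can all read from it directly. Because performance is not a factor, any added latency from multi-regional placement is acceptable. Options A and B do not maximize availability to the same extent: HDFS is tied to the lifetime and location of its cluster, and a single-region bucket is exposed to a regional outage. Option C, while functional, does not provide the same level of availability as a multi-regional bucket.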
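
As a rough illustration of the storage choice in option D, the sketch below creates a bucket in a multi-region location using the google-cloud-storage Python client. The helper names (`is_multi_region`, `create_multi_region_bucket`) and the bucket name in the usage note are hypothetical, and the code assumes the library is installed and application default credentials are configured. It is one possible way to provision such a bucket, not part of the exam question.

```python
# Multi-region location codes offered by Cloud Storage.
MULTI_REGIONS = {"US", "EU", "ASIA"}


def is_multi_region(location: str) -> bool:
    """Return True if the location code names a Cloud Storage multi-region."""
    return location.upper() in MULTI_REGIONS


def create_multi_region_bucket(bucket_name: str, location: str = "US"):
    """Create a bucket in a multi-region location (hypothetical helper).

    Rejects single-region codes such as "us-central1" before any API call.
    """
    if not is_multi_region(location):
        raise ValueError(f"{location!r} is not a multi-region location")

    # Imported here so the validation above works without the library installed.
    from google.cloud import storage

    client = storage.Client()  # assumes application default credentials
    return client.create_bucket(bucket_name, location=location.upper())
```

Calling `create_multi_region_bucket("my-historical-data", "US")` (a placeholder bucket name) would request a bucket whose objects are stored redundantly within the US multi-region. The daily batch pipeline could then write CSV, Avro, and PDF files to it, and each analysis tool would read from the same `gs://` path.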