Google Cloud Professional Data Engineer — Question 225
Your team is building a data lake platform on Google Cloud. As a part of the data foundation design, you are planning to store all the raw data in Cloud Storage. You are expecting to ingest approximately 25 GB of data a day and your billing department is worried about the increasing cost of storing old data. The current business requirements are:
• The old data can be deleted anytime.
• There is no predefined access pattern of the old data.
• The old data should be available instantly when accessed.
• There should not be any charges for data retrieval.
What should you do to optimize for cost?
Answer options
- A. Create the bucket with the Autoclass storage class feature.
- B. Create an Object Lifecycle Management policy to modify the storage class for data older than 30 days to Nearline, 90 days to Coldline, and 365 days to Archive storage class. Delete old data as needed.
- C. Create an Object Lifecycle Management policy to modify the storage class for data older than 30 days to Coldline, 90 days to Nearline, and 365 days to Archive storage class. Delete old data as needed.
- D. Create an Object Lifecycle Management policy to modify the storage class for data older than 30 days to Nearline, 45 days to Coldline, and 60 days to Archive storage class. Delete old data as needed.
Correct answer: A
Explanation
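The correct answer is A. With Autoclass enabled, Cloud Storage automatically moves each object between storage classes based on how it is actually accessed, which suits data with no predefined access pattern: objects that go unread move to colder, cheaper classes, and an object that is read returns to Standard storage. Autoclass buckets are not charged retrieval fees or early deletion fees, so old data can be read or deleted at any time without a penalty. Options B, C, and D rely on fixed Object Lifecycle Management schedules that place data in Nearline, Coldline, and Archive storage. All of these classes return data with low latency, so instant availability is not what rules them out. The problem is that they charge retrieval fees on every read and early deletion fees when an object is deleted before its minimum storage duration (30, 90, and 365 days respectively). Option C is also invalid as written, because a lifecycle rule cannot move an object from Coldline back to the warmer Nearline class.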
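
The reasoning above can be sketched as a small check in Python. This is an illustrative model only, not a pricing calculator: the minimum storage durations and the presence of retrieval fees reflect the documented behavior of each class, while the names `StorageClass`, `violations`, and `autoclass_bucket_body` are hypothetical helpers invented for this example. The dictionary returned by `autoclass_bucket_body` follows the shape of the `autoclass` field on the Cloud Storage JSON API bucket resource, and the choice of `ARCHIVE` as the terminal class is an assumption, not something the question specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageClass:
    name: str
    min_storage_days: int  # deleting earlier incurs an early deletion fee
    retrieval_fee: bool    # True if reads are billed per GB retrieved


# Per-class properties outside of Autoclass.
CLASSES = {
    "STANDARD": StorageClass("STANDARD", 0, False),
    "NEARLINE": StorageClass("NEARLINE", 30, True),
    "COLDLINE": StorageClass("COLDLINE", 90, True),
    "ARCHIVE": StorageClass("ARCHIVE", 365, True),
}


def violations(classes_used, autoclass=False):
    """List the question's requirements that a design would break.

    Modeling assumption: an Autoclass bucket is treated as having no
    retrieval or early deletion fees, whatever class an object sits in.
    """
    if autoclass:
        return []
    problems = []
    for name in classes_used:
        sc = CLASSES[name]
        if sc.retrieval_fee:
            problems.append(f"{name}: retrieval fee")
        if sc.min_storage_days > 0:
            problems.append(
                f"{name}: early deletion fee before {sc.min_storage_days} days"
            )
    return problems


def autoclass_bucket_body(bucket_name, terminal_class="ARCHIVE"):
    """Build a bucket resource body with Autoclass enabled (hypothetical helper).

    terminal_class is the coldest class Autoclass may move objects to;
    ARCHIVE is assumed here for illustration.
    """
    return {
        "name": bucket_name,
        "autoclass": {
            "enabled": True,
            "terminalStorageClass": terminal_class,
        },
    }
```

Calling `violations(["NEARLINE", "COLDLINE", "ARCHIVE"])` lists a retrieval fee and an early deletion fee for each class used by options B, C, and D, while `violations([], autoclass=True)` returns an empty list, which mirrors why option A meets every requirement.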