Google Cloud Professional Data Engineer — Question 87
You are developing an application that uses a recommendation engine on Google Cloud. Your solution should display new videos to customers based on past views. Your solution needs to generate labels for the entities in videos that the customer has viewed. Your design must be able to provide very fast filtering suggestions based on data from other customer preferences on several TB of data. What should you do?
Answer options
- A. Build and train a complex classification model with Spark MLlib to generate labels and filter the results. Deploy the models using Cloud Dataproc. Call the model from your application.
- B. Build and train a classification model with Spark MLlib to generate labels. Build and train a second classification model with Spark MLlib to filter results to match customer preferences. Deploy the models using Cloud Dataproc. Call the models from your application.
- C. Build an application that calls the Cloud Video Intelligence API to generate labels. Store data in Cloud Bigtable, and filter the predicted labels to match the user's viewing history to generate preferences.
- D. Build an application that calls the Cloud Video Intelligence API to generate labels. Store data in Cloud SQL, and join and filter the predicted labels to match the user's viewing history to generate preferences.
Correct answer: C
Explanation
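The correct answer is C. The Cloud Video Intelligence API is a pre-trained service that detects and labels entities in video, so no custom model has to be built, trained, or hosted. Cloud Bigtable is a NoSQL wide-column database designed for low-latency reads and writes over terabytes to petabytes of data, which fits the requirement for very fast filtering across several TB of customer preference data. Options A and B require building and training custom Spark MLlib models and running them on Cloud Dataproc, which adds development and operational effort for a labeling task that a managed API already covers. Option D uses the same API but stores the data in Cloud SQL, a relational database that is not designed for low-latency joins and filters at multi-terabyte scale.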
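To make the storage argument concrete, here is a minimal sketch of why a key-sorted store suits this kind of filtering. It does not call Cloud Bigtable or the Video Intelligence API. `SortedKeyStore` is a hypothetical in-memory stand-in for a table whose rows are sorted by key, and the row-key layouts (`user#<user>#<label>#<video>` and `label#<label>#<video>`) are assumptions chosen for illustration, not a schema taken from the question.

```python
from bisect import bisect_left
from collections import Counter


class SortedKeyStore:
    """Hypothetical in-memory stand-in for a table with rows sorted by key."""

    def __init__(self):
        self._keys = []   # row keys, kept in lexicographic order
        self._rows = {}   # row key -> value

    def put(self, key, value):
        if key not in self._rows:
            self._keys.insert(bisect_left(self._keys, key), key)
        self._rows[key] = value

    def prefix_scan(self, prefix):
        # Rows sharing a prefix are contiguous, so seek to the first
        # candidate and read forward until the prefix no longer matches.
        idx = bisect_left(self._keys, prefix)
        while idx < len(self._keys) and self._keys[idx].startswith(prefix):
            key = self._keys[idx]
            yield key, self._rows[key]
            idx += 1


def record_view(store, user_id, video_id, labels):
    """Store the labels of a viewed video (assumed to come from a labeling API).

    Assumes IDs and labels do not contain the '#' separator.
    """
    for label in labels:
        store.put(f"user#{user_id}#{label}#{video_id}", 1)  # per-user history
        store.put(f"label#{label}#{video_id}", 1)           # label -> videos index


def top_labels(store, user_id, n=2):
    """Return the n labels that appear most often in a user's viewing history."""
    counts = Counter()
    for key, _ in store.prefix_scan(f"user#{user_id}#"):
        counts[key.split("#")[2]] += 1
    return [label for label, _ in counts.most_common(n)]


def recommend(store, user_id, n=2):
    """Suggest unseen videos that carry the user's most frequent labels."""
    seen = {key.split("#")[3] for key, _ in store.prefix_scan(f"user#{user_id}#")}
    suggestions = []
    for label in top_labels(store, user_id, n):
        for key, _ in store.prefix_scan(f"label#{label}#"):
            video_id = key.split("#")[2]
            if video_id not in seen and video_id not in suggestions:
                suggestions.append(video_id)
    return suggestions
```

Each lookup reads one contiguous key range instead of joining tables, which is the access pattern the explanation credits to Cloud Bigtable. A real design would also need to consider hot keys and row sizes, which this sketch ignores.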