Google Cloud Professional Data Engineer — Question 61
You need to deploy additional dependencies to all nodes of a Cloud Dataproc cluster at startup using an existing initialization action. Company security policies require that Cloud Dataproc nodes do not have access to the Internet, so public initialization actions cannot fetch resources. What should you do?
Answer options
- A. Deploy the Cloud SQL Proxy on the Cloud Dataproc master
- B. Use an SSH tunnel to give the Cloud Dataproc cluster access to the Internet
- C. Copy all dependencies to a Cloud Storage bucket within your VPC security perimeter
- D. Use Resource Manager to add the service account used by the Cloud Dataproc cluster to the Network User role
Correct answer: C
Explanation
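The correct answer is C. An initialization action runs on every node at startup, so anything it installs must be reachable from nodes that have no Internet access. Copying the dependencies to a Cloud Storage bucket within your VPC security perimeter lets the nodes fetch them without Internet connectivity. Option A is incorrect because the Cloud SQL Proxy only provides connections to Cloud SQL instances; it does not make dependencies available to the nodes. Option B is incorrect because an SSH tunnel that gives the cluster Internet access violates the security policy. Option D is incorrect because granting the Network User role to the cluster's service account changes network permissions but does not make the dependencies available at startup.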
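
As a rough illustration of option C, the Python sketch below builds a cluster specification whose initialization action is read from a private Cloud Storage bucket, and checks that no action points anywhere else. The field names follow the Dataproc v1 cluster configuration (`initialization_actions`, `executable_file`, `gce_cluster_config`, `internal_ip_only`). The project, cluster, bucket, script, and subnetwork names, the `DEPENDENCY_BUCKET` metadata key, and both helper functions are hypothetical and chosen only for this example. It also assumes the subnetwork has Private Google Access enabled so that internal-only nodes can reach Cloud Storage.

```python
# Sketch only: every resource name below is a hypothetical placeholder.


def build_cluster(project_id, cluster_name, bucket, init_script, subnetwork_uri):
    """Return a cluster spec whose init action lives in a private bucket."""
    init_uri = f"gs://{bucket}/{init_script}"
    return {
        "project_id": project_id,
        "cluster_name": cluster_name,
        "config": {
            "gce_cluster_config": {
                "subnetwork_uri": subnetwork_uri,
                # Nodes get internal IP addresses only (no Internet access).
                "internal_ip_only": True,
                # Hypothetical metadata key the init script could read to
                # locate the staged dependencies.
                "metadata": {"DEPENDENCY_BUCKET": bucket},
            },
            # The script itself is fetched from Cloud Storage, not the Internet.
            "initialization_actions": [{"executable_file": init_uri}],
        },
    }


def uses_only_bucket(cluster, bucket):
    """Return True if every initialization action is served from `bucket`."""
    prefix = f"gs://{bucket}/"
    actions = cluster["config"]["initialization_actions"]
    return all(a["executable_file"].startswith(prefix) for a in actions)


cluster = build_cluster(
    project_id="example-project",
    cluster_name="example-cluster",
    bucket="example-private-deps",
    init_script="init/install-deps.sh",
    subnetwork_uri="projects/example-project/regions/us-central1/subnetworks/example-subnet",
)
print(uses_only_bucket(cluster, "example-private-deps"))
```

A dictionary shaped like this could then be supplied as the `cluster` field of a create-cluster request, for example through the `google-cloud-dataproc` client library. The initialization script would copy the staged dependencies from the same bucket instead of downloading them from public sources.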