Google Professional Cloud DevOps Engineer Practice Test 5
Question 1 of 65
1. Question
You are part of the SRE team working for a data-processing company. Your team manages an application that was manually deployed to App Engine. The application source code is stored in Cloud Source Repositories. A new version of the application has been developed and tested. Approval has been given to deploy to production. You pushed the code update to the Cloud Source Repository, and after some time you notice the old version has not been updated. What should you do?
Options A and D are incorrect: the application was deployed manually, so Cloud Build is not involved. Option B is incorrect: gcloud app browse is used to verify the app is running. gcloud app deploy app.yaml will deploy the new version of the application to App Engine. Reference: https://cloud.google.com/source-repositories/docs/integrating-with-app-engine
Question 2 of 65
2. Question
During load testing of the application there were application failures, infrastructure issues, and some capacity issues, which were resolved and documented for reference in future incidents. Which of the following is not a recommended practice for incident management?
The correct answer is "Prioritize root-cause analysis during incidents." Google SRE's best practices for incident management prioritize restoring service during an incident and recommend developing and documenting incident management procedures in advance; root-cause analysis is deferred until after service is restored. "Focus on restoring service during incidents", "Encourage team members to be familiar with each role in the incident management process" and "Develop and document your incident management procedures in advance" are all recommended practices, so only "Prioritize root-cause analysis during incidents" goes against these best practices. Reference: https://sre.google/sre-book/managing-incidents/
Question 3 of 65
3. Question
Your team has developed and tested a video processing service for your company. The video service accepts videos in one format and converts them to another specified format. Your team has agreed on the indicator metrics to track the performance of the system. All stakeholders of the application have agreed on a minimum target value, within a rolling 4-week window, for the indicator metric used to measure the service. What is needed to guarantee a level of service to the customer with consequences for missing it?
Options A, B and C are incorrect: only a Service Level Agreement carries consequences for not meeting Service Level Objectives. Option D is CORRECT: an SLA is needed. Reference: https://sre.google/sre-book/service-level-objectives/ (Agreements)
Question 4 of 65
4. Question
Your team has been tasked with the monitoring of a new application to be deployed on Managed Instance Groups. You are responsible for setting up the monitoring agent and the custom metrics for the application. You have chosen to create the metric descriptor manually. You need to monitor the memory utilization metric of the application and create an alerting policy. What value should the Metric Kind be set to in the descriptor?
Question 5 of 65
5. Question
Your team is building an automated CI/CD pipeline in the development Project. The Cloud Source Repository will be used for code versioning, and Cloud Build will be used to build and deploy the application to Google Kubernetes Engine. The Cloud Build Service Account has been given the Kubernetes Engine Developer permissions. After a developer pushes code to the Cloud Source Repository, you notice the application is not getting deployed to GKE. Which of the following could be the reason?
Option A is incorrect: the Kubernetes Engine Developer permission is sufficient for Cloud Build to deploy to GKE. Option B is CORRECT: for Cloud Build to automatically build code pushed to Cloud Source Repositories, a trigger must be created. Options C and D are incorrect: Cloud Source Repositories does not need permissions, and the GKE API will be enabled by the Cloud Build service before deployment. Reference: https://cloud.google.com/build/docs/automating-builds/create-manage-triggers
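A build trigger of the kind the explanation refers to can be created with gcloud; the repository name, branch pattern and build config file below are hypothetical:

```shell
# Create a trigger that runs cloudbuild.yaml whenever a commit is
# pushed to the main branch of the Cloud Source Repository "my-app".
gcloud builds triggers create cloud-source-repositories \
  --repo=my-app \
  --branch-pattern="^main$" \
  --build-config=cloudbuild.yaml
```

Without a trigger, pushes to the repository sit unbuilt; the trigger is what connects a push event to a Cloud Build execution.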
Question 6 of 65
6. Question
Your organization has created three monitoring workspaces called dev-workspace, test-workspace and prod-workspace. The workspaces monitor the Projects outlined below:
dev-workspace: dev-1, dev-2, dev-3
test-workspace: test-1, test-2
prod-workspace: prod-1, prod-b and prod-c
You have been asked to monitor the project prod-1 alongside test-1 and test-2 in the same workspace. How will you achieve this?
A project can only be monitored by one workspace at any time. Options B, C and D are incorrect: the monitoring workspace of a Project can be updated after creation, and merging both workspaces would mean five projects being monitored in test-workspace instead of three. Reference: https://cloud.google.com/monitoring/workspaces/manage
Question 7 of 65
7. Question
You are designing an online gaming application. The web application allows users to select games and view leaderboards. Game scores are stored in a database. You want to identify the minimum Service Level Indicators (SLIs) for the application to ensure the leaderboard has the latest scores. What SLIs should you select?
Latency and availability for the request-driven web application, together with latency, availability and durability for the database (a storage system), are the best SLIs for this application. Options A, C and D are incorrect: durability is not a suitable SLI for the request-driven web application, and coverage is a suitable SLI for batch processing systems, not for the database of the gaming application. Reference: https://sre.google/sre-book/service-level-objectives/ (Indicators in Practice)
Question 8 of 65
8. Question
Your company has multiple Projects in its Google Cloud Organization hierarchy. Resources in the different Projects have been configured to send metrics to a centralised monitoring workspace. You recently deployed Apache on a Compute Engine instance with a custom Service Account in one of the Projects. You installed and configured the monitoring agent to get metrics from the Apache application. You notice there are no Apache metrics in the centralised monitoring workspace. Which of the following is a possible reason?
FluentD is used for logging, not monitoring; the agent is running because the question says it was installed and configured; and the region of a Compute Engine instance has no effect on the monitoring workspace. A custom Service Account lacking the credentials or permissions the monitoring agent needs is the possible reason metrics are not showing up in the workspace. Reference: https://cloud.google.com/monitoring/agent/monitoring/troubleshooting#verify-creds
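If the custom Service Account is missing the permissions the agent needs, granting it the metric-writer role is one remedy; the project and account names below are hypothetical:

```shell
# Allow the VM's custom service account to write metrics
# to Cloud Monitoring on behalf of the monitoring agent.
gcloud projects add-iam-policy-binding my-project \
  --member="serviceAccount:apache-vm@my-project.iam.gserviceaccount.com" \
  --role="roles/monitoring.metricWriter"
```

The instance's access scopes must also permit the Monitoring API, which is worth checking alongside the IAM binding.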
Question 9 of 65
9. Question
You are responsible for setting up an automated CI/CD pipeline. The pipeline will be used to build Docker images for application deployment to GKE. Recently the performance (build speed) of Cloud Build in your pipeline has been dropping. What steps can you take to improve the speed of builds? (select 2)
Using a .gcloudignore file to exclude unneeded files and selecting a higher-spec machine type will speed up builds. Options B, C and E are incorrect: larger base images will slow down builds, and Service Account permissions do not affect build speed. Reference: https://cloud.google.com/build/docs/speeding-up-builds
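Both speed-ups can be sketched as follows; the ignore entries, project name and machine type are illustrative assumptions:

```shell
# Exclude files the build does not need from the upload context,
# shrinking the source archive sent to Cloud Build.
cat > .gcloudignore <<'EOF'
.git
node_modules/
*.log
EOF

# Submit the build on a higher-spec machine type to shorten build time.
gcloud builds submit --machine-type=e2-highcpu-8 \
  --tag=gcr.io/my-project/my-app .
```

Smaller upload contexts cut network and unpack time, while the larger machine type speeds up the build steps themselves.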
Question 10 of 65
10. Question
Your team is developing a containerized Python application for a government project. The application uses a microservices architecture and will be deployed using Cloud Run. You have been asked to capture the application's top or new errors in a clear dashboard in real time. How would you achieve this?
The correct answer is: "No additional setup or configuration is required. Error Reporting is automatically enabled for Cloud Run." Cloud Run is natively integrated with Error Reporting, so no additional setup is needed: all errors and exceptions in the application are automatically logged to Error Reporting, where they can be viewed and analyzed in a dashboard in real time. Reference: "Cloud Run is automatically integrated with Error Reporting with no setup or configuration required." https://cloud.google.com/run/docs/error-reporting
The incorrect answers: installing the Logging agent and logging exceptions and stack traces to Cloud Logging is not the recommended approach, because Cloud Logging is a separate service that is not specifically designed for capturing errors and exceptions. Installing the Monitoring agent is incorrect because that agent collects metrics rather than reporting errors. Reporting errors to the API using the REST API or a client library is possible, but it would not by itself provide a clear, real-time dashboard of the application's top or new errors.
Question 11 of 65
11. Question
Your organization has recently decided to move its applications to the Cloud. The current CI/CD pipeline uses GitHub repositories for source code version control. You have been directed to build a proof-of-concept deployment linking GitHub to Cloud Build for image creation and deployment. What steps can you take to achieve this with minimal overhead? (select 2)
Question 12 of 65
12. Question
Your SRE team is responsible for monitoring and logging of the applications in different Production Projects. The applications are deployed on different resources like Compute Engine and GKE. Your team has created a centralised monitoring dashboard in the monitoring Project for the metrics from all the production Projects. A new member needs to be given access to one of the charts in the centralised dashboard for training purposes. Which steps will help you meet the requirements? (select 2)
Question 13 of 65
13. Question
You are part of the DevOps team in a growing analytics company. The company currently deploys its Docker applications on Virtual Machines on-premises. The company has three different environments: dev, staging and production. The company is planning to move its applications to GKE. The key requirement is to keep the environments separate in a way that allows for restricting access using IAM policy. Which of the following helps you meet the requirement following GCP's best practice?
Options A, B and C are incorrect. There is no way to manage IAM permissions at a VPC or subnet level, and while it is possible to apply RBAC using namespaces in a GKE cluster to separate environments, this is not the best practice for separating environments. Option D is CORRECT: the best practice is to manage environments and IAM policy at the Project level. Reference: https://cloud.google.com/docs/enterprise/best-practices-for-enterprise-organizations#project-structure
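The project-per-environment pattern can be sketched with gcloud; the project IDs, organization ID, group and role below are hypothetical:

```shell
# One project per environment, all under the same organization.
gcloud projects create my-app-dev --organization=123456789
gcloud projects create my-app-staging --organization=123456789
gcloud projects create my-app-prod --organization=123456789

# Restrict access per environment with project-level IAM bindings,
# e.g. only the operators group may touch the production GKE cluster.
gcloud projects add-iam-policy-binding my-app-prod \
  --member="group:prod-operators@example.com" \
  --role="roles/container.developer"
```

Because each environment is its own project, access boundaries follow IAM's natural unit of policy rather than network or namespace workarounds.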
Question 14 of 65
14. Question
You are part of a team designing a containerized application to be deployed to GKE. The application will be deployed to a five-node cluster in a single region. The application will be used to process sensitive user data, and there is a requirement to remove any sensitive data from the logs before they are sent to Cloud Logging. Which of the following helps you meet the requirement? (select 2)
Options A, B and D are incorrect. System and workload logging does not allow you to customise the logging, legacy logging is deprecated and not recommended for newer clusters, and a Deployment in Kubernetes will not guarantee that fluentd pods run on every node in the cluster. Options C and E are CORRECT: logging needs to be disabled so the agent can be installed manually and customised, and a DaemonSet is the correct object for logging because it ensures a fluentd pod is deployed on every node in the cluster to collect logs. Reference: https://cloud.google.com/architecture/customizing-stackdriver-logs-fluentd
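A minimal DaemonSet sketch of the one-fluentd-pod-per-node idea; the image, namespace and mounts are placeholders and omit the filter configuration that would actually scrub sensitive data:

```shell
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.16
        # Mount node logs so fluentd can read, filter, and forward them.
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
EOF
```

Unlike a Deployment, the DaemonSet scheduler places exactly one pod on each node, so no node's logs escape the filtering layer.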
Question 15 of 65
15. Question
You are part of an on-call SRE team managing a frontend web service application in production. The application offers an HTTP-based API that consumers can use to manipulate various data. A new version has been developed and needs to be tested with live traffic. There is a requirement to minimize the number of users that will be affected if the new version fails. Which of the following helps you meet the requirement?
In a canary deployment, you partially roll out a change to a subset of users and then evaluate its performance against a baseline deployment. Options B, C and D are incorrect. B and D represent the same technique: blue/red represents the current application version, green/black represents the new application version, and only one version is live at a time, so these methods will affect every user if there is a failure. Option C updates the live application gradually until it is deployed to all instances; if there is a failure, the number of affected users increases as the deployment rolls out to all instances. References: https://sre.google/workbook/canarying-releases/ https://cloud.google.com/architecture/application-deployment-and-testing-strategies
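One common way to realise a canary on GKE is to run a small canary Deployment behind the same Service selector as the stable one; the names, images and replica counts below are illustrative:

```shell
# Two Deployments share the label app=web, so the Service spreads
# traffic across all ten pods: ~90% to v1, ~10% to the v2 canary.
kubectl apply -f - <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata: {name: web-stable}
spec:
  replicas: 9
  selector: {matchLabels: {app: web, track: stable}}
  template:
    metadata: {labels: {app: web, track: stable}}
    spec:
      containers:
      - {name: web, image: gcr.io/my-project/web:v1}
---
apiVersion: apps/v1
kind: Deployment
metadata: {name: web-canary}
spec:
  replicas: 1
  selector: {matchLabels: {app: web, track: canary}}
  template:
    metadata: {labels: {app: web, track: canary}}
    spec:
      containers:
      - {name: web, image: gcr.io/my-project/web:v2}
---
apiVersion: v1
kind: Service
metadata: {name: web}
spec:
  selector: {app: web}   # matches both tracks
  ports:
  - {port: 80, targetPort: 8080}
EOF
```

If the canary misbehaves, only roughly one user in ten is affected, and deleting web-canary rolls everyone back to v1.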
Question 16 of 65
16. Question
Your team is developing an application using Java. Cloud Build is used to build images for applications. There is a requirement to store the Java image and Maven packages in GCP for use in deployment. What is the recommended solution to achieve this?
Correct
Container Registry can hold the container images, and the Maven packages can be stored in Cloud Storage. This is Google's recommended approach. Options A, B and C are incorrect. Images cannot be stored in Cloud Source Repositories, and packages cannot be stored in Container Registry. Reference https://cloud.google.com/build/docs/building/store-build-artifacts
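This setup can be sketched in a cloudbuild.yaml: built images are listed under images (pushed to Container Registry) and the Maven packages under artifacts.objects (uploaded to Cloud Storage). The project, bucket, and image names below are placeholders:

```yaml
steps:
# Build the application JAR with Maven.
- name: 'maven:3-jdk-11'
  entrypoint: 'mvn'
  args: ['package', '-DskipTests']
# Build the container image from the Dockerfile in the repo.
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/my-project/my-app:$SHORT_SHA', '.']

# Push the built image to Container Registry.
images: ['gcr.io/my-project/my-app:$SHORT_SHA']

# Store the non-image build artifacts (Maven packages) in Cloud Storage.
artifacts:
  objects:
    location: 'gs://my-artifact-bucket/'
    paths: ['target/*.jar']
```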
Question 17 of 65
17. Question
You are part of an on-call SRE team managing a production application. The application receives requests, processes them, and returns responses to users. A new update was deployed yesterday to introduce new features into the application. Users are now complaining about errors and failed requests. Your team declares an incident. Which of the following is the recommended first action after an incident is declared?
Correct
Options A, B and C are incorrect. Mitigating the impact is not the first action because you do not yet know the extent of the impact. Performing root-cause analysis and writing a post-mortem are done after service is fully restored. Option D is CORRECT: the first recommended step is assessing the impact, or extent, of the incident. Here's why assessing the impact is the recommended first action after declaring an incident: 1. It helps the incident response team understand the scope and severity of the incident by gathering information about the affected systems, users, and any other impacted components. 2. It lets the team prioritize their efforts based on severity, so they can allocate resources effectively and address the most critical issues first. 3. It provides valuable insight for decision-making, helping to determine the appropriate course of action and to set realistic expectations for restoration and recovery. While mitigating the impact (Option A) is crucial, assessing the impact is the initial step taken to gain a comprehensive understanding of the incident's consequences. Performing a root-cause analysis (Option B) and fixing the cause (Option C) are important steps to prevent future incidents, but they come after assessing the impact and stabilizing the situation. Writing a post-mortem is typically done after the incident is resolved and a thorough analysis of the cause and impact has been conducted. Therefore, Option D, "Assess the impact of the incident," is the recommended first action after declaring an incident.
References: Google SRE Handbook: Incident Response: https://sre.google/sre-book/incident-response/ Incident Management: Best Practices for Incident Response: https://cloud.google.com/architecture/best-practices-for-incident-response
Question 18 of 65
18. Question
You are developing a new application for a global media company. The application will serve content to users in several countries and needs high availability and reliability. Your team has agreed on relevant SLOs and an error budget policy with stakeholders. Which of the following is not a recommended action when the service has consumed its entire error budget?
Correct
Lowering the SLOs is not a recommended action when the error budget is exhausted: lowering the SLO means lowering the target reliability of the system instead of improving it. Options B, C and D are incorrect because they are recommended actions when the error budget is exhausted, such as halting releases to production and focusing on the bugs that affect reliability. Reference https://sre.google/workbook/implementing-slos/ (Establishing an Error Budget)
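The stakes of changing an SLO can be made concrete with a small error-budget calculation (the SLO targets and the 30-day window below are illustrative):

```python
def error_budget_minutes(slo: float, window_days: int) -> float:
    """Minutes of allowed unavailability for a given SLO over a window."""
    total_minutes = window_days * 24 * 60
    return (1.0 - slo) * total_minutes

# A 99.9% availability SLO over 30 days leaves roughly 43.2 minutes
# of error budget. "Lowering the SLO" to 99% would inflate the budget
# tenfold (about 432 minutes) without making the service any more reliable.
print(error_budget_minutes(0.999, 30))
print(error_budget_minutes(0.99, 30))
```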
Question 19 of 65
19. Question
You have created a new image of an application without signing the image. When you try to deploy this image, instead of a successful deployment you receive the error "Denied by Attestor". What is the solution to resolve this problem?
Correct
The correct answer is "Create an attestation and submit it to Binary Authorization". Binary Authorization is a deploy-time security service provided by Google that ensures only trusted containers are deployed to a GKE cluster. It uses a policy-driven model that allows you to configure security policies, and behind the scenes it talks to the Container Analysis service. An attestation is a statement from the attestor that an image is ready to be deployed. This attestation must be submitted properly, or the error will occur. There is a setup process required in the project in which the cluster is hosted: enable the required APIs, create a Kubernetes cluster with Binary Authorization enabled, set up a Container Analysis note, generate the PGP keys, and create an attestor. Therefore, to resolve the error "Denied by Attestor", the best course of action is to create an attestation and submit it to Binary Authorization. "Extract the signature of PGP with PUTTY" – Extracting a PGP signature with PuTTY is not related to the error; it is caused by the lack of a submitted attestation, not an issue with the PGP signature. "Contact Support since this is a Google issue" – Contacting Support is unnecessary; the error can be resolved by creating an attestation and submitting it to Binary Authorization. "Enable cloud build to use the proper permissions in IAM" – Cloud Build IAM permissions are not related to the error; it is caused by the lack of a submitted attestation.
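As a sketch of the fix, the gcloud binauthz surface can sign and create the attestation in one step using a Cloud KMS key (shown here instead of the manual PGP flow; every resource name and the image digest below are placeholders, and the command requires an authenticated gcloud session):

```shell
# Sign the image digest and create the attestation in one step, so
# Binary Authorization will admit the image at deploy time.
gcloud container binauthz attestations sign-and-create \
  --artifact-url="gcr.io/my-project/my-app@sha256:abc123" \
  --attestor="my-attestor" \
  --attestor-project="my-project" \
  --keyversion-project="my-project" \
  --keyversion-location="global" \
  --keyversion-keyring="my-keyring" \
  --keyversion-key="my-key" \
  --keyversion="1"
```

Note that the attestation is made against the image digest, not a tag, so the image must be referenced by its sha256 digest.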
Question 20 of 65
20. Question
You are managing an application that exposes an HTTP endpoint without using a load balancer. The latency of the HTTP responses is important for the user experience. You want to understand what HTTP latencies all of your users are experiencing. You use Stackdriver Monitoring. What should you do?
Correct
MetricKind and ValueType are two important parameters when creating a metric in Stackdriver Monitoring: MetricKind defines the kind of metric being created, and ValueType defines the type of value the metric reports. The correct answer is: In your application, create a metric with a metricKind set to GAUGE and a valueType set to DISTRIBUTION. In Stackdriver's Metrics Explorer, use a Heatmap graph to visualize the metric. GAUGE is the most appropriate MetricKind for this use case, as it reports instantaneous values, which is what you need to capture the latency of HTTP responses. DISTRIBUTION is the most appropriate ValueType, as it records latency values in a range of buckets, which is what you need to understand the latency experienced by all users. Heatmap is the most appropriate graph type, as it visualizes latency values across a range of buckets. "In your application, create a metric with a metricKind set to DELTA and a valueType set to DOUBLE. In Stackdriver's Metrics Explorer, use a Stacked Bar graph to visualize the metric." – DELTA is not appropriate, as it reports the change in a metric over an interval rather than the latencies themselves, and a Stacked Bar graph does not visualize latency values across a range of buckets. "In your application, create a metric with a metricKind set to CUMULATIVE and a valueType set to DOUBLE. In Stackdriver's Metrics Explorer, use a Line graph to visualize the metric." – CUMULATIVE is not appropriate, as it reports a running total of a metric, which does not describe response latency, and a Line graph does not visualize latency values across a range of buckets. "In your application, create a metric with a metricKind set to METRIC_KIND_UNSPECIFIED and a valueType set to INT64. In Stackdriver's Metrics Explorer, use a Stacked Area graph to visualize the metric." – METRIC_KIND_UNSPECIFIED leaves the metric kind undefined, which cannot be used for a user-defined metric, and a Stacked Area graph does not visualize latency values across a range of buckets.
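To see why DISTRIBUTION fits latency data, here is a small pure-Python sketch (not the Cloud Monitoring client library) of how latency samples fall into explicit histogram buckets; the bucket boundaries and sample values are invented for illustration:

```python
from bisect import bisect_right

bounds = [25, 50, 100, 200]           # bucket boundaries, in milliseconds
samples = [12, 30, 30, 75, 150, 320]  # observed request latencies

# A DISTRIBUTION value is essentially these per-bucket counts:
# bucket i holds samples with bounds[i-1] <= latency < bounds[i],
# plus an underflow bucket below bounds[0] and an overflow bucket above.
counts = [0] * (len(bounds) + 1)
for latency in samples:
    counts[bisect_right(bounds, latency)] += 1

print(counts)  # [1, 2, 1, 1, 1]
```

A heatmap then plots one such column of bucket counts per time interval, which is why it (rather than a line of averages) shows what every user is experiencing.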
Question 21 of 65
21. Question
You must run a business-critical workload on a fixed set of Compute Engine instances for several months. The workload is stable, with the exact amount of resources allocated to it. You want to lower the costs for this workload without any performance implications. What should you do?
Correct
"Purchase Committed Use Discounts" – This is the correct answer. Committed Use Discounts let you commit to using Compute Engine resources for a one- or three-year term in exchange for a discounted price, which lowers the cost of the workload without any performance implications. "Migrate the instances to a Managed Instance Group" – This is incorrect, as Managed Instance Groups manage a group of identical instances that scale up and down in response to changes in demand; that does not by itself reduce cost for a workload that requires a fixed set of resources. "Convert the instances to preemptible virtual machines" – This is incorrect, as preemptible virtual machines cost much less than regular instances but can be preempted (stopped) at any time, which is unsuitable for a business-critical workload that must run for several months. "Create an Unmanaged Instance Group for the instances used to run the workload" – This is incorrect, as Unmanaged Instance Groups simply group existing instances without autoscaling and do not reduce cost.
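A rough cost sketch shows why a commitment helps a stable workload. The on-demand cost and discount rates below are assumptions for illustration only (actual CUD rates vary by machine family, term, and region):

```python
# All figures are hypothetical, for illustration of the trade-off only.
ON_DEMAND_MONTHLY = 1000.0   # assumed monthly on-demand cost (USD)
CUD_1YR_DISCOUNT = 0.37      # assumed discount for a 1-year commitment
CUD_3YR_DISCOUNT = 0.55      # assumed discount for a 3-year commitment

cost_1yr = ON_DEMAND_MONTHLY * (1 - CUD_1YR_DISCOUNT)
cost_3yr = ON_DEMAND_MONTHLY * (1 - CUD_3YR_DISCOUNT)

# Same machines, same performance; only the billing changes.
print(cost_1yr, cost_3yr)
```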
Question 22 of 65
22. Question
You encounter a large number of outages in the production systems you support. You receive alerts for all the outages that wake you up at night. The alerts are due to unhealthy systems that are automatically restarted within a minute. You want to set up a process that would prevent staff burnout while following Site Reliability Engineering practices. What should you do?
Correct
To prevent staff burnout, you should reduce the number of alerts sent to engineers. Anyone who has been paged multiple times per night will understand this becomes very tiring, very quickly. The alerts are for systems that are automatically restarted within a minute and do not require engineer intervention to resolve; alerts should not exist for these automatically fixed issues. References: 'Eliminate bad monitoring' – https://cloud.google.com/blog/products/management-tools/meeting-reliability-challenges-with-sre-principles "Create an incident report for each of the alerts." – This is valid: any time there is an outage to a production system, a postmortem and incident report should be conducted with a focus on how to avoid it happening again. However, the question asks how to reduce staff burnout, and creating an incident report will not do that in the short to medium term. "Distribute the alerts to engineers in different time zones." – While this would reduce burnout, it does not address the core issue: an engineer does not need to be alerted about a self-resolving incident. "Redefine the related Service Level Objective so that the error budget is not exhausted." – The question does not tell us the error budget, so no judgement can be made based on it.
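One concrete way to stop these pages is to require the alert condition to hold for longer than the one-minute self-healing window. A sketch of a Cloud Monitoring alerting-policy fragment, where the metric type shown is a hypothetical custom health metric and the threshold is illustrative:

```yaml
combiner: OR
conditions:
- displayName: "Instance unhealthy for more than 5 minutes"
  conditionThreshold:
    # Hypothetical custom metric reporting instance health failures.
    filter: 'metric.type = "custom.googleapis.com/instance/unhealthy"'
    comparison: COMPARISON_GT
    thresholdValue: 0
    # The condition must hold for 300s before the alert fires, so
    # instances that auto-restart within a minute never page anyone.
    duration: 300s
```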
Question 23 of 65
23. Question
You use Cloud Build to build and deploy your application. You want to securely incorporate database credentials and other application secrets into the build pipeline. You also want to minimize the development effort. What should you do?
Correct
You wish to use the secrets securely and also want to minimise the development effort. This suggests using a managed service for these secrets (to Google, a managed service "minimises the development effort"). The Cloud Key Management Service centrally manages encryption keys and allows them to be used in Cloud Build pipelines. Reference: https://cloud.google.com/build/docs/securing-builds/use-encrypted-credentials#configuring_builds_to_use_encrypted_data

"Create a Cloud Storage bucket and use the built-in encryption at rest. Store the secrets in the bucket and grant Cloud Build access to the bucket." – This is not the simplest approach to take when Cloud KMS exists.

"Encrypt the secrets and store them in the application repository. Store a decryption key in a separate repository and grant Cloud Build access to the repository." – Custom tooling would more than likely be required for this approach. The question asks us to minimise the development effort, so Cloud KMS is the better and faster option.

"Use client-side encryption to encrypt the secrets and store them in a Cloud Storage bucket. Store a decryption key in the bucket and grant Cloud Build access to the bucket." – Again, custom tooling would more than likely be required; Cloud KMS is the better and faster option.
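The KMS-based approach referenced above is wired into a build through the `secrets` and `secretEnv` fields of the Cloud Build configuration. A minimal sketch, in which the project, key ring, key name, step command, and base64 ciphertext are all placeholders:

```yaml
steps:
  - name: gcr.io/cloud-builders/gcloud
    entrypoint: bash
    # The decrypted value is exposed to this step as $$DB_PASSWORD.
    args: ['-c', './deploy.sh --db-password=$$DB_PASSWORD']
    secretEnv: ['DB_PASSWORD']
secrets:
  - kmsKeyName: projects/MY_PROJECT/locations/global/keyRings/MY_RING/cryptoKeys/MY_KEY
    secretEnv:
      # Base64-encoded ciphertext produced by `gcloud kms encrypt`.
      DB_PASSWORD: CiQA...BASE64_CIPHERTEXT...
```

Cloud Build decrypts the ciphertext with the named KMS key at build time and injects it only into steps that list the variable in `secretEnv`, so no plaintext secret ever lives in the repository or the build config.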
Question 24 of 65
24. Question
Your application artifacts are being built and deployed via a CI/CD pipeline. You want the CI/CD pipeline to securely access application secrets. You also want to more easily rotate secrets in case of a security breach. What should you do?
Correct
The term CI/CD pipeline refers to a set of automated processes used to build, test, and deploy software applications. Application secrets are pieces of information used to authenticate or authorize access to a system or application. To access secrets securely, they must be stored securely and access must be limited to only those who need it.

"Store secrets in Cloud Storage encrypted with a key from Cloud KMS. Provide the CI/CD pipeline with access to Cloud KMS via IAM." – This is the most secure option: the secrets are encrypted at rest, access is limited via IAM to only those who have been granted it, and rotation after a breach requires only re-encrypting and re-uploading the secrets (or rotating the KMS key), with no code or repository changes.

"Prompt developers for secrets at build time. Instruct developers to not store secrets at rest." – Not secure: it relies on developers manually entering secrets at build time, which breaks automation and risks secrets being handled or stored insecurely.

"Store secrets in a separate configuration file on Git. Provide select developers with access to the configuration file." – Not secure: a configuration file in Git provides no encryption and no fine-grained access control, making it vulnerable to unauthorized access, and rotating a secret requires a new commit in every affected repository.

"Encrypt the secrets and store them in the source code repository. Store a decryption key in a separate repository and grant your pipeline access to it." – The secrets are at least encrypted, but the decryption key itself sits in a repository without KMS-style access control or audit logging, and rotating the key or the secrets means rewriting repository history or pushing new commits everywhere.
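The recommended option can be sketched as a short sequence of gcloud commands; the project, bucket, key ring, key, and service-account names below are placeholders, not values from the question:

```shell
# Encrypt the secret with a Cloud KMS key; only the ciphertext leaves this machine.
gcloud kms encrypt \
  --project=my-project --location=global \
  --keyring=ci-ring --key=ci-key \
  --plaintext-file=db-password.txt \
  --ciphertext-file=db-password.enc

# Store only the ciphertext in Cloud Storage.
gsutil cp db-password.enc gs://my-secrets-bucket/

# Let the pipeline's service account decrypt at build/deploy time.
gcloud kms keys add-iam-policy-binding ci-key \
  --project=my-project --location=global --keyring=ci-ring \
  --member=serviceAccount:ci@my-project.iam.gserviceaccount.com \
  --role=roles/cloudkms.cryptoKeyDecrypter
```

Rotation after a breach then touches only the last two resources: re-encrypt the new secret, replace the object in the bucket, and optionally rotate the KMS key, with no repository changes.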
Question 25 of 65
25. Question
You support a high-traffic web application and want to ensure that the home page loads in a timely manner. As a first step, you decide to implement a Service Level Indicator (SLI) to represent home page request latency with an acceptable page load time set to 100 ms. What is the Google-recommended way of calculating this SLI?
Correct
C. Count the number of home page requests that load in under 100 ms, and then divide by the total number of home page requests.

The recommended way to calculate an SLI is to start from the Service Level Objective (SLO) it is meant to measure. In this case, the SLO is a home page request latency of 100 ms or less. To calculate the SLI, count the number of home page requests that meet this latency target and divide by the total number of home page requests. This gives the proportion of requests meeting the target, which is the SLI.

Option A is incorrect because computing a percentile from the latency distribution yields a latency value, not the proportion of requests meeting the 100 ms target. Option B is incorrect because the median and 90th percentile do not directly measure the proportion of requests meeting the 100 ms target. Option D is incorrect because it counts all requests to the web application rather than only home page requests, so it would not accurately reflect home page performance.
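The "good events divided by total events" calculation described above can be sketched in a few lines of Python; the latency figures are hypothetical sample data, not from the question:

```python
# Minimal sketch of a request-based latency SLI:
# good events (requests under the threshold) / total events.

def latency_sli(latencies_ms, threshold_ms=100):
    """Return the proportion of requests faster than threshold_ms,
    or None if there was no traffic (the SLI is undefined then)."""
    if not latencies_ms:
        return None
    good = sum(1 for latency in latencies_ms if latency < threshold_ms)
    return good / len(latencies_ms)

# Example: 8 of these 10 hypothetical home page requests load in under 100 ms.
observed = [45, 80, 99, 120, 60, 30, 250, 95, 70, 88]
print(latency_sli(observed))  # -> 0.8
```

Expressing the SLI as a good/total ratio (rather than a percentile) is what makes it compose directly with an SLO, e.g. "99.9% of home page requests load in under 100 ms over a 30-day window".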
Question 26 of 65
26. Question
In order to ensure security and enable easy troubleshooting of applications, a minimal amount of setup is needed for a pool of application servers running on Google Compute Engine. To facilitate this, developers must be able to access application logs quickly and effectively. What is the most suitable approach to implement this solution on GCP?
Correct
The scenario requires a minimal amount of setup for a pool of application servers running on Google Compute Engine, and developers must be able to access application logs quickly and effectively to ensure security and enable easy troubleshooting.

Option A is not the best answer because the Stackdriver monitoring agent collects system and infrastructure metrics rather than application logs, and the IAM Monitoring Viewer role does not grant access to logs in Stackdriver Logging.

Option B is not the best answer because the IAM Logs Private Logs Viewer role grants everything the Logs Viewer role grants plus access to private (data access) logs, which is more privilege than developers need for application troubleshooting and conflicts with least-privilege practice.

Option D is not the best answer because installing gsutil and uploading application logs to a Cloud Storage bucket via a cron script every five minutes does not provide real-time log access, delays troubleshooting, and requires additional setup and maintenance.

Option C is the best answer: installing the Stackdriver logging agent on the application servers collects and stores application logs in Stackdriver Logging, and granting developers the IAM Logs Viewer role lets them view those logs quickly and effectively, ensuring easy troubleshooting and security.

References: https://cloud.google.com/logging and https://cloud.google.com/iam
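The two halves of the recommended option, install the logging agent and grant Logs Viewer, can be sketched as follows; the project ID and developer group are placeholders, and the installer script is the one Google documents for the Logging agent:

```shell
# On each application server: install the Cloud Logging (Stackdriver) agent,
# which picks up syslog and common application logs out of the box.
curl -sSO https://dl.google.com/cloudagents/add-logging-agent-repo.sh
sudo bash add-logging-agent-repo.sh --also-install

# Once, at the project level: grant developers read access to logs.
gcloud projects add-iam-policy-binding my-project \
  --member=group:developers@example.com \
  --role=roles/logging.viewer
```

This keeps setup minimal: no custom log-shipping scripts, and developers read logs through the Logs Viewer UI or `gcloud logging read` rather than SSHing into individual servers.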
Question 27 of 65
27. Question
As your company follows Site Reliability Engineering practices, you are tasked with Communications for a large and ongoing incident involving customer-facing applications. With no estimated time for a resolution, you are receiving emails from both internal stakeholders and customers who wish to know of the outage‘s status. To efficiently provide updates to all those affected, what should you do?
Correct
As a Site Reliability Engineer (SRE), your responsibility is to maintain the reliability and availability of your company's systems and services. In the event of an ongoing incident involving customer-facing applications, it is essential to communicate effectively with both internal stakeholders and customers.

To efficiently provide updates to all those affected, the recommended approach is to give timely updates to all stakeholders and set a "next update" time in all communications. This ensures that everyone knows the status of the incident and when to expect the next update, and it holds the incident response team accountable for delivering updates at the specified times.

Beyond setting "next update" times, prioritize responses to customers over internal stakeholders: customers are the ones directly impacted by the incident and need prompt information. Responding to internal stakeholder emails every 30 minutes could delay updates to customers, causing frustration and negative feedback. Handing internal stakeholder emails over to the Incident Commander can work for large incidents with a high volume of email, but for smaller incidents it is more efficient for the SRE handling communications to manage them all.

In summary: give timely updates to all stakeholders, prioritize customers over internal stakeholders, and set "next update" times in every communication. This manages expectations, ensures accountability, and maintains effective communication throughout the incident response. Reference: Site Reliability Engineering: How Google Runs Production Systems, O'Reilly Media, 2016.
Question 28 of 65
28. Question
You are running an application in a virtual machine (VM) using a custom Debian image. The image has the Stackdriver Logging agent installed. The VM has the cloud-platform scope. The application is logging information via syslog. You want to use Stackdriver Logging in the Google Cloud Platform Console to visualize the logs. You notice that syslog is not showing up in the “All logs“ dropdown list of the Logs Viewer. What is the first thing you should do?
Correct
The first thing you should do is: D. SSH to the VM and execute ps ax | grep fluentd.

The Stackdriver Logging agent is powered by fluentd, which is responsible for collecting, processing, and forwarding logs to the Stackdriver Logging service. Executing ps ax | grep fluentd on the VM verifies whether the fluentd process is running, which is necessary for the agent to collect syslog data and send it to Stackdriver Logging. If the process is not running, start it manually with systemctl start google-fluentd. Once the process is confirmed running, look for the agent's test log entry in the Logs Viewer to confirm that logs are reaching Stackdriver Logging correctly.

References: https://cloud.google.com/logging/docs/agent/ and https://cloud.google.com/logging/docs/agent/troubleshooting
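The diagnostic sequence above can be sketched as a short session on the VM; the test message text is arbitrary:

```shell
# Check whether the logging agent's fluentd process is alive.
ps ax | grep -v grep | grep fluentd

# If nothing is running, start the agent and confirm its state.
sudo systemctl start google-fluentd
sudo systemctl status google-fluentd

# Emit a test syslog entry, then look for it in the Logs Viewer.
logger "google-fluentd smoke test"
```

Checking the process first follows the usual troubleshooting order: confirm the collector is running before investigating scopes, IAM, or the Logs Viewer itself.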
Question 29 of 65
29. Question
You have a CI/CD pipeline that uses Cloud Build to build new Docker images and push them to Docker Hub. You use Git for code versioning. After making a change in the Cloud Build YAML configuration, you notice that no new artifacts are being built by the pipeline. You need to resolve the issue following Site Reliability Engineering practices. What should you do?
Correct
The best option, following Site Reliability Engineering practices, is: D. Run a Git compare between the previous and current Cloud Build configuration files to find and fix the bug.

Comparing the previous and current Cloud Build configuration files in Git identifies exactly what changed and surfaces any syntax or logic error that is preventing new artifacts from being built. This addresses the root cause of the issue rather than simply disabling the pipeline or switching to a different artifact registry, and once the bug is fixed the pipeline resumes normal operation. Uploading the configuration YAML file to Cloud Storage and using Error Reporting could also help identify the issue, but only as a secondary step after comparing the configuration files.

References: https://cloud.google.com/build/docs/ and https://sre.google/sre-book/
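The "Git compare" step described above might look like this, assuming the build configuration lives in a file named cloudbuild.yaml at the repository root (your file name may differ):

```shell
# Find the last commits that touched the build configuration.
git log --oneline -- cloudbuild.yaml

# Diff the current configuration against the previous commit to locate the bug.
git diff HEAD~1 HEAD -- cloudbuild.yaml

# After fixing the file, validate it by submitting a build manually.
gcloud builds submit --config=cloudbuild.yaml .
```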
Question 30 of 65
30. Question
You need to reduce the cost of virtual machines (VM) for your organization. After reviewing different options, you decide to leverage preemptible VM instances. Which application is suitable for preemptible VMs?
Correct
Preemptible VM instances are suitable for applications that are fault-tolerant and can tolerate interruptions, as these instances can be interrupted by the cloud provider at any time. This means that any workloads that are not critical and can be restarted or migrated easily can be run on preemptible VMs. Based on this, the application that is suitable for preemptible VMs is: A. A scalable in-memory caching system. This is because in-memory caching systems typically store frequently accessed data in memory to improve performance. As the data can be regenerated if lost, the system can tolerate interruptions and can recover quickly. Using preemptible VMs for this workload can help reduce costs while still providing the required level of performance. References: Google Cloud. (n.d.). Preemptible VMs. https://cloud.google.com/preemptible-vms Kaur, H. (2021, May 31). What are Preemptible Virtual Machines? How they differ from regular ones. https://www.interserver.net/blog/what-are-preemptible-virtual-machines/
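Creating a preemptible instance for such a fault-tolerant cache node is a single flag on instance creation. The instance name, zone, and machine type below are illustrative:

```shell
# Create a preemptible VM for a cache node; Compute Engine may reclaim it at
# any time, which the caching tier tolerates because data can be regenerated.
gcloud compute instances create cache-node-1 \
    --zone=us-central1-a \
    --machine-type=n1-highmem-4 \
    --preemptible
```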
Question 31 of 65
31. Question
You need to deploy a new service to production. The service needs to automatically scale using a Managed Instance Group (MIG) and should be deployed over multiple regions. The service needs a large number of resources for each instance and you need to plan for capacity. What should you do?
Correct
The correct answer is C: validate that the resource requirements are within the available quota limits of each region. Before deploying a service whose instances each require a large amount of resources, it is essential to confirm that each target region has sufficient quota (CPUs, persistent disk, IP addresses, and so on); otherwise the MIG may be unable to scale out and you will hit unexpected capacity constraints. The other options do not address capacity planning: A (use the n1-highcpu-96 machine type in the MIG configuration) selects a large machine type without verifying it can actually be provisioned at scale in every region; B (monitor results of Stackdriver Trace) helps diagnose performance after deployment, not plan capacity beforehand; and D (deploy the service in one region and use a global load balancer to route traffic to it) contradicts the multi-region requirement. References: Google Cloud Managed Instance Groups documentation: https://cloud.google.com/compute/docs/instance-groups Google Cloud Machine Types documentation: https://cloud.google.com/compute/docs/machine-types Google Cloud Global Load Balancing documentation: https://cloud.google.com/load-balancing/docs/load-balancing-overview
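Checking regional quota before deployment can be done from the CLI. A sketch, assuming the region is us-central1 (repeat per target region):

```shell
# List each quota metric for the region with its limit and current usage,
# so you can confirm the planned MIG size fits (e.g. the CPUS metric).
gcloud compute regions describe us-central1 \
    --flatten=quotas \
    --format="table(quotas.metric,quotas.limit,quotas.usage)"
```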
Question 32 of 65
32. Question
As a DevOps Engineer, you have been tasked to optimize resource utilization and develop a plan to address areas of greatest cost or lowest utilization in a Google Cloud Platform project. Your project is a multi-tier web application that includes Compute Engine instances, Cloud Storage, Cloud SQL, and Cloud Pub/Sub. Which of the following strategies would be the most effective in achieving these goals?
Correct
Enable autoscaling and use managed instance groups for Compute Engine instances, optimize Cloud Storage by leveraging object lifecycle policies, employ Cloud SQL with automatic storage increase, and monitor Cloud Pub/Sub usage to adjust quotas. -> Correct. This option combines the most effective strategies for each service. Autoscaling and managed instance groups optimize Compute Engine resource utilization, object lifecycle policies help manage Cloud Storage costs, automatic storage increase in Cloud SQL ensures optimal storage usage, and monitoring Cloud Pub/Sub usage allows for efficient quota management. Migrate all Compute Engine instances to Preemptible VMs and use the Always Free tier for Cloud Storage, Cloud SQL, and Cloud Pub/Sub. -> Incorrect. This approach may not provide the required performance and availability for a multi-tier web application. Preemptible VMs have a limited lifespan and can be terminated at any time, which may not be suitable for production workloads. Additionally, the Always Free tier has limitations that might not meet the application requirements. Move the entire application to a single Compute Engine instance with maximum vCPUs and memory, and use the Always Free tier for Cloud Storage, Cloud SQL, and Cloud Pub/Sub. -> Incorrect. Consolidating the entire application into a single Compute Engine instance can lead to performance bottlenecks and single points of failure. This approach may not address the cost and utilization concerns effectively. Utilize the GCP Cost Calculator to estimate costs, and then create a custom machine type for each Compute Engine instance, use Nearline storage for all Cloud Storage objects, and disable all Cloud SQL instances and Cloud Pub/Sub topics. -> Incorrect. The GCP Cost Calculator can provide cost estimates, but it does not optimize resource utilization by itself. 
Using a custom machine type for each instance may not provide the desired cost savings, and disabling all Cloud SQL instances and Cloud Pub/Sub topics can cause the application to malfunction.
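The object lifecycle policy mentioned above is a small JSON document applied to a bucket. A minimal sketch, with an illustrative bucket name and age thresholds:

```shell
# Lifecycle policy: demote objects to Nearline after 30 days, delete after 365.
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
     "condition": {"age": 30}},
    {"action": {"type": "Delete"},
     "condition": {"age": 365}}
  ]
}
EOF

# Apply the policy to the bucket.
gsutil lifecycle set lifecycle.json gs://my-example-bucket
```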
Question 33 of 65
33. Question
What is the purpose of a canary release in a continuous deployment pipeline?
Correct
To release a new version of the application to a small group of users before rolling it out to all users. -> Correct. A canary release is a technique used in a continuous deployment pipeline to release a new version of an application to a small group of users before rolling it out to all users. This allows for the new version to be tested in production and monitored for any issues, such as performance or compatibility issues, before it is released to all users. If any issues are detected, the canary release can be rolled back, and changes can be made before deploying the new version to all users. To test the performance of the application in production before deploying to all users. -> Incorrect. Testing the performance of an application in production is typically done through performance testing and monitoring, not canary releases. To ensure that all tests in the pipeline pass before deploying to production. -> Incorrect. Ensuring all tests pass before deploying to production is the purpose of a continuous integration pipeline, not a canary release. To roll back a deployment if errors are detected in the production environment. -> Incorrect. Rolling back a deployment if errors are detected is typically done through automated rollback mechanisms and not specifically related to canary releases.
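On App Engine, for example, a canary can be implemented with traffic splitting; the service and version names below are illustrative:

```shell
# Deploy the new version without routing any traffic to it.
gcloud app deploy --version=v2 --no-promote

# Canary: send 10% of traffic to v2, keep 90% on v1, and monitor.
gcloud app services set-traffic default --splits=v1=0.9,v2=0.1

# If v2 is healthy, promote it; otherwise shift traffic back to v1.
gcloud app services set-traffic default --splits=v2=1
```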
Question 34 of 65
34. Question
You are responsible for designing the logging of an application. Your company has asked you to ensure logs are sent to the company’s Splunk instance. How should you accomplish this with the least amount of operation overhead?
Correct
Option A is incorrect. This introduces the overhead of managing the Cloud Function. Option B is CORRECT. This is the recommended approach for exporting logs to third-party applications. Option C is incorrect. It is not recommended, and it introduces the overhead of managing the Cloud Function. Option D is incorrect. This is not currently possible. Reference https://cloud.google.com/logging/docs/export
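The recommended export path is a log sink to Pub/Sub, which Splunk then consumes (e.g. via the Splunk Add-on for Google Cloud Platform). A sketch with illustrative project, topic, and filter values:

```shell
# Create the export topic.
gcloud pubsub topics create splunk-export

# Route matching logs into the topic via a sink.
gcloud logging sinks create splunk-sink \
    pubsub.googleapis.com/projects/my-project/topics/splunk-export \
    --log-filter='resource.type="gce_instance"'

# Note: afterwards, grant the sink's writer identity (printed by the command
# above) the Pub/Sub Publisher role on the topic so the export can flow.
```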
Question 35 of 65
35. Question
You are helping with the design of a data processing pipeline for a company. Data is streamed from different devices into the pipeline and then processed before it is loaded into the final storage for analytic use. You want to identify minimal Service Level Indicators (SLIs) for the pipeline to ensure that the data in the final storage is up to date. Which SLI should not be part of your consideration?
Correct
Option D is CORRECT; this SLI provides no monitoring value for the data processing pipeline.
Options A, B & C are incorrect because they are recommended SLIs for big data systems. Throughput shows speed of processing and latency shows total time to process a request. Correctness measures the accuracy of results returned.
Question 36 of 65
36. Question
You are part of the SRE team tasked with writing a postmortem of an outage for one of the services your team manages.
Which of these should not be a part of the creation of the postmortem document according to the Google’s SRE best practices?
Correct
Options A, B & D are incorrect because they are part of the process for creating a postmortem, which includes figuring out what caused the issues and how to prevent it, also it is a collaborative process with the output shared.
Option C is CORRECT because according to Google’s SRE best practices, all postmortems should be reviewed as part of the culture of learning.
Question 37 of 65
37. Question
Your company has tasked you with setting up a Continuous Integration pipeline. When code is committed to the source repository, the pipeline will build Docker containers to be pushed to Container Registry and non-container artifacts to be pushed to Cloud Storage. How would you accomplish this? (select 2)
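A pipeline like the one described is typically expressed in a cloudbuild.yaml: a docker build step, an images entry to push the container, and an artifacts block for the non-container output. A minimal sketch (image name, bucket, and artifact path are illustrative):

```shell
cat > cloudbuild.yaml <<'EOF'
steps:
# Build the container image from the repository's Dockerfile.
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/my-app:$SHORT_SHA', '.']
# Push the built image to Container Registry after the steps complete.
images:
- 'gcr.io/$PROJECT_ID/my-app:$SHORT_SHA'
# Upload non-container build outputs to Cloud Storage.
artifacts:
  objects:
    location: 'gs://my-artifact-bucket/'
    paths: ['build/output.zip']
EOF
```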
Question 38 of 65
38. Question
Your company has deployed all its Cloud Source Repositories in a separate GCP Project. You have been tasked with granting developers in the dev Project access to commit code to the dev repository in that Project. How can you achieve this according to Google's best practice of least privilege?
Correct
Option A is incorrect. This is too permissive, does not follow least privilege best practice and grants access to all the repos in that Project. Option B is CORRECT. This grants permissions at the repo level to list, clone, fetch and update repositories. Option C is incorrect. This does not give permissions to update repositories. Option D is incorrect. This is too permissive and does not follow least privilege best practice. Reference https://cloud.google.com/source-repositories/docs/configure-access-control#roles_and_permissions_matrix
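Granting the role at the individual repository (rather than the project) is done with a per-repo IAM policy. A sketch with illustrative repo, project, and group names; note that set-iam-policy replaces the repo's policy, so fetch and extend the existing one first:

```shell
# Inspect the repository's current policy before changing it.
gcloud source repos get-iam-policy dev-repo --project=repos-project

# Bind roles/source.writer (list, clone, fetch, update) on just this repo.
cat > policy.json <<'EOF'
{
  "bindings": [
    {"role": "roles/source.writer",
     "members": ["group:dev-team@example.com"]}
  ]
}
EOF
gcloud source repos set-iam-policy dev-repo policy.json --project=repos-project
```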
Question 39 of 65
39. Question
Your team is running a production Apache application on Google Compute Engine. You currently monitor default metrics such as CPU utilization. You have a new requirement to monitor metrics from the Apache application in the Google Cloud console. What should you do? (select 2)
Correct
Options A, D and E are incorrect: fluentd is used for logging, not metrics. Options B and C are CORRECT: you have to install the Monitoring (collectd) agent in order to collect custom metrics such as Apache metrics. Reference https://cloud.google.com/monitoring/agent/plugins/apache
Question 40 of 65
40. Question
Your company is serving an application through the Compute Engine service behind a global load balancer. You have been tasked with monitoring the availability of the application and alert the on-call engineer if the application is unavailable for more than five minutes. What should you do with the least management overhead?
Correct
Option A is incorrect; it has a lot of overhead, such as installing the logging agent, configuring the right logs to be sent to Cloud Logging, and creating log-based metrics. Option B is incorrect because a service running on the instance may not respond when the instance itself fails. Option C is incorrect; it also has a lot of overhead: with, say, 50 instances you would have to create an uptime check and alerting policy for each VM, and a replaced VM would trigger an alert even though the application is still available, which is counter-productive. Option D is CORRECT. Creating an uptime check against the load balancer has the least administrative overhead. Reference https://cloud.google.com/monitoring/uptime-checks
Incorrect
Option A is incorrect, this has a lot of overhead such as installing the logging agent, configuring the right logs to be sent to Cloud Logging and creating log-based metrics. Option B is incorrect, because a service on the instance may not work when the instance fails. Option C is incorrect. This has a lot of overhead, assuming there are 50 instances, you will have to create an uptime check and alerting policy for each VM instance. Also if one VM is replaced it will trigger an alert which is counter-productive when the application is available. Option D is CORRECT. Creating an uptime check to the load balancer has the least administrative overhead. Reference https://cloud.google.com/monitoring/uptime-checks
Unattempted
Option A is incorrect, this has a lot of overhead such as installing the logging agent, configuring the right logs to be sent to Cloud Logging and creating log-based metrics. Option B is incorrect, because a service on the instance may not work when the instance fails. Option C is incorrect. This has a lot of overhead, assuming there are 50 instances, you will have to create an uptime check and alerting policy for each VM instance. Also if one VM is replaced it will trigger an alert which is counter-productive when the application is available. Option D is CORRECT. Creating an uptime check to the load balancer has the least administrative overhead. Reference https://cloud.google.com/monitoring/uptime-checks
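A sketch of creating the uptime check and an alerting policy with gcloud. The `gcloud monitoring uptime` command group is relatively new and its availability may vary by gcloud version; the check name, project ID, host IP, and policy file name below are placeholders.

```shell
# Uptime check against the load balancer's external address
# (placeholder IP and project ID).
gcloud monitoring uptime create lb-check \
  --resource-type=uptime-url \
  --resource-labels=host=203.0.113.10,project_id=my-project

# An alerting policy on the uptime check's check_passed metric can
# then be created from a policy definition file.
gcloud alpha monitoring policies create \
  --policy-from-file=uptime-alert-policy.json
```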
Question 41 of 65
41. Question
You provide support for a Python application in production on Compute Engine. Recently there have been complaints about the slow response of the application. You want to investigate how requests propagate through your entire application. What should you do?
Correct
Option A is CORRECT. Cloud Trace shows how requests propagate through the different components (microservices or functions) of an application. Options B and C are incorrect. The monitoring and logging agents do not show how requests propagate through the components of an application. Option D is incorrect. CPU utilization does not show how requests propagate through the components of an application. Reference: https://cloud.google.com/trace/docs/setup
Question 42 of 65
42. Question
Your team is creating an incident management procedure which will be a guide for your team during incidents. Part of Google's SRE incident management best practice is the separation of responsibilities. Which of the following responsibilities is not essential during an incident?
Correct
Options A, B, and C are incorrect. These represent the Incident Commander, Operations Lead, and Communications Lead, which are the essential roles for incident management. Option D is CORRECT. Creating the incident management procedure is a team effort done before an incident, so anyone can use it when an incident occurs. Reference: https://sre.google/sre-book/managing-incidents/
Question 43 of 65
43. Question
Your team recently pushed an update to production. Several customers are now complaining that the service is taking too long to respond. What should you do first following Google’s SRE best practice for effective troubleshooting?
Correct
The correct answer is A: try to figure out the severity of the issue. According to Google's SRE best practice for effective troubleshooting, the first step is to understand the scope of the issue: how many customers are affected, how severe the impact is, and whether the issue is widespread. Once you have a good understanding of the severity, you can start to troubleshoot the problem. Opening a bug ticket or reviewing application logs can be helpful, but they should not be done before you understand the severity; otherwise you may waste time troubleshooting a problem that is not actually affecting a large number of customers. Making the system work as well as it can while you troubleshoot is also a good idea, but it is not the first step. The overall troubleshooting flow is:
1. Understand the severity of the issue.
2. Identify the affected components.
3. Gather data.
4. Analyze the data.
5. Reproduce the issue.
6. Fix the issue.
7. Roll back the changes.
8. Monitor the system.
Question 44 of 65
44. Question
Your team is managing multiple Projects with different applications. You have been asked to centralize all billing data for the projects for ease of analysis. What steps should you take, following Google’s best practice? (select 2)
Question 45 of 65
45. Question
You are a DevOps engineer on a large-scale application development for a multinational company. The development, testing, and production environments consist of several Projects. You have been tasked with designing and implementing a billing export for the multiple Projects to a central billing Project. Following the principle of least privilege, what role will be needed?
Question 46 of 65
46. Question
Your team manages several applications in different Projects with a central billing Project. There is a requirement from finance to provide the ability for billing breakdown according to departments or projects in BigQuery. How would you accomplish this?
Question 47 of 65
47. Question
You are responsible for the VPC network design of an application that your team will be deploying on Compute Engine (GCE). Minimal cost for Internet egress traffic charges is a requirement. Which Network Service Tier option provides the lowest cost?
Correct
Option A is incorrect. This is more expensive. Option B is incorrect; the question is about Network Service Tiers. Option C is CORRECT. The Standard tier provides a cheaper internet egress rate. Option D is incorrect. There are only two tiers (Premium and Standard). Reference: https://cloud.google.com/vpc/network-pricing#internet_egress
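Assuming Standard Tier is the intended choice, it can be set project-wide or per instance. The instance and zone names below are placeholders.

```shell
# Make Standard the default Network Service Tier for the project.
gcloud compute project-info update --default-network-tier=STANDARD

# Or choose the tier per instance at creation time.
gcloud compute instances create web-1 \
  --zone=us-central1-a \
  --network-tier=STANDARD
```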
Question 48 of 65
48. Question
You are on-call managing an application in production. You receive alerts from the monitoring system of the application which show it is failing uptime checks. What do you do first following SRE best practice of managing incidents?
Correct
Option A is incorrect. This is done at a later stage, after the application is back online. Option B is incorrect. This focuses on just the technical problem and does not cover the bigger picture expected of an SRE. Option C is incorrect. This should not come first because you don't know what the real problem is yet. Option D is CORRECT. This is Google's recommended approach for incident management: investigate the problem and, if it persists, appoint an incident commander to oversee the resolution. Reference: https://sre.google/sre-book/managing-incidents/
Question 49 of 65
49. Question
Your team is planning to deploy an application to App Engine in the production Project. You need to be able to inspect the state of the app in real time, without stopping or slowing it down. How can you accomplish this?
Correct
Option D is the correct choice for inspecting the state of the app in real time. Cloud Debugger allows real-time inspection of the state of an application running on Google Cloud without stopping or slowing it down. It lets developers examine the application's state, including variables and the call stack at any code location, without adding print statements or stopping the application. Option A, Cloud Monitoring, is used to collect and view metrics, create alerts, and visualize time-series data; it is not designed for real-time inspection of application state. Option B, Cloud Logging, is used to store, search, analyze, monitor, and alert on log data and events; while useful for analyzing logs, it is not the best choice for real-time state inspection. Option C, Cloud Profiler, is used to inspect application performance and find bottlenecks; it does not provide real-time inspection of application state. References: Cloud Debugger: https://cloud.google.com/debugger Cloud Monitoring: https://cloud.google.com/monitoring Cloud Logging: https://cloud.google.com/logging Cloud Profiler: https://cloud.google.com/profiler
Question 50 of 65
50. Question
When creating an Incident Document according to Google SRE's best practices, it is recommended to include an incident timeline, the list of actions carried out to restore the service, and the command hierarchy. However, including the developers responsible for the update is not recommended, as it goes against the principle of blameless postmortems. Blameless postmortems are a critical component of Site Reliability Engineering: the goal of a postmortem is to identify the root cause of an incident and to prevent a similar incident from occurring in the future. Blameless postmortems focus on the systems and processes that led to the incident rather than the individual people involved. By creating a blameless culture, individuals are encouraged to take risks and try new things without fear of retribution. Therefore, the focus should be on identifying the contributing factors that led to the incident and developing actionable steps to prevent a similar incident from occurring in the future. References: Google SRE Book: https://sre.google/sre-book/table-of-contents/ Blameless Postmortems: https://sre.google/workbook/improving-incident-management/#blameless-postmortems
Question 51 of 65
51. Question
You are responsible for designing a new logs collection system in your organization. Your company has asked you to ensure all audit logs from all projects in the organization are aggregated in one location. How should you accomplish this? (select 2)
Correct
Option A is incorrect. In the console, the Logging bucket is created in Logging under Logs Storage, not in Cloud Storage. Options B and D are CORRECT. The Logging bucket is created in Logging, and the aggregated sink for the organization's logs is created via the CLI. Option C is incorrect. This cannot be created in the Console. Option E is incorrect. The question asks for aggregating logs in one location, which this does not support. Reference: https://cloud.google.com/logging/docs/central-log-storage
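The CLI side of this could look like the following sketch. The organization ID, project, bucket, and sink names are placeholders, and the service account that the sink creates must additionally be granted write access to the destination bucket.

```shell
# Create an aggregated sink at the organization level that routes
# audit logs from all child projects to a central Logging bucket.
gcloud logging sinks create org-audit-sink \
  logging.googleapis.com/projects/central-logs-proj/locations/global/buckets/org-audit-bucket \
  --organization=123456789012 \
  --include-children \
  --log-filter='logName:"cloudaudit.googleapis.com"'
```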
Question 52 of 65
52. Question
You are responsible for designing a CI/CD pipeline in your organization. Your company has asked you to ensure all Data Access logs for the pipeline are turned on and kept for at least 90 days. What should you take into consideration before Data Access logs are turned on? (select 3)
Question 53 of 65
53. Question
You are responsible for designing a CI/CD pipeline in your organization. Your company has asked you to ensure the continuous deployment (CD) part of the pipeline can handle Blue/Green deployment. How could you accomplish this? (select 2)
Question 54 of 65
54. Question
Your team is designing a CI/CD pipeline for your organization. Jenkins was chosen as the Continuous Deployment tool. Following GCP's recommended practice, how should the CD tool be deployed? (select 2)
Question 55 of 65
55. Question
Your team is designing a web-facing application for your organization. The application is intended to serve users globally. Your job is to plan for the capacity of the application. Following GCP's SRE best practice for capacity management, which of these is not recommended?
Correct
Options A, B, and C are incorrect. These are the recommended approaches for capacity planning. Load testing shows how the application will scale or fail under load; monitoring allows remedial actions to be taken in time; and graceful degradation allows the application to keep functioning under overwhelming request volume by rejecting excess requests so the system is not overloaded. Option D is CORRECT. This is not recommended because it is costly and there is no guarantee the application will need that amount of resources. Reference: https://static.googleusercontent.com/media/sre.google/en//static/pdf/login_winter20_10_torres.pdf
Question 56 of 65
56. Question
You are responsible for deploying a web-facing application. The application will serve users in multiple regions. There is a reliability requirement for the system not to be overloaded with requests during peak periods. Following GCP’s SRE best practice, which of these is not recommended?
Correct
Options A, B, and D are incorrect. They are the recommended approaches for managing requests during peak periods to reduce or prevent cascading failures. Option C is CORRECT. This is not recommended because intra-layer communication is susceptible to distributed deadlock. Reference: https://sre.google/sre-book/addressing-cascading-failures/
Question 57 of 65
57. Question
Your team uses Docker images to build applications. There is a requirement for exploits to be detected in Docker images built using Cloud Build before they are used in deployments. You have been tasked with implementing the process to detect vulnerabilities in built images before they are deployed. What steps can you take to achieve this? (select 2)
Correct
Vulnerability scanning can be used to scan images in Container Registry or Artifact Registry. Options A, B, and D are incorrect. Vulnerability scanning is not integrated with those services, and images are only stored in Container Registry or Artifact Registry. Reference: https://cloud.google.com/container-analysis/docs/get-image-vulnerabilities
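As a sketch: automatic scanning of pushed images is enabled by turning on the Container Scanning API, and a built image can also be scanned on demand. The image path below is a placeholder.

```shell
# Enable automatic vulnerability scanning of images pushed to
# Artifact Registry / Container Registry.
gcloud services enable containerscanning.googleapis.com

# On-demand scan of an image already in the registry
# (On-Demand Scanning API; --remote scans the registry copy).
gcloud artifacts docker images scan \
  us-central1-docker.pkg.dev/my-project/my-repo/my-app:latest \
  --remote
```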
Question 58 of 65
58. Question
To meet industry compliance, your company has asked you to configure VPC Flow Logs. A key priority is to streamline the logs collected from Flow Logs to reduce storage costs. What steps can you take to achieve this? (select 2)
Correct
Filtering and metadata annotations are ways of modifying the number of logs generated and stored from VPC Flow Logs. Option D is incorrect. This is used for logs generated by infrastructure such as GKE and GCE, not VPC Flow Logs. Options C and E are incorrect. They do not streamline the logs collected; they focus on storage of the logs. Reference: https://cloud.google.com/vpc/docs/flow-logs
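A sketch of enabling Flow Logs on a subnet with sampling, reduced metadata, and a filter expression. The subnet name, region, and CIDR range are placeholders.

```shell
# Enable VPC Flow Logs with a 10% sample rate, no metadata
# annotations, and a filter that keeps only flows to 10.0.0.0/8.
gcloud compute networks subnets update my-subnet \
  --region=us-central1 \
  --enable-flow-logs \
  --logging-flow-sampling=0.1 \
  --logging-metadata=exclude-all \
  --logging-filter-expr='inIpRange(connection.dest_ip, "10.0.0.0/8")'
```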
Question 59 of 65
59. Question
To meet security compliance of centrally collecting VPC Flow Logs, your company asked you to configure a Logs routing sink. The Sink destination is a Logging bucket in another project. After you configure the Logs Sink, a few days later one of the security team members points out that there are no logs in the logging bucket. Which of the following is not a possible reason?
Correct
Options A, C, and D are incorrect; they are all possible reasons. If Flow Logs are not enabled on the subnets to be monitored, there will be no logs; if logs exclusion filters are wrongly configured, the desired logs will be discarded; and if the security team is looking in the wrong bucket, they will not see the logs. Option B is CORRECT. Firewall rules do not affect the logs generated by Flow Logs. Reference: https://cloud.google.com/vpc/docs/using-flow-logs#no-vpc-flows
Question 60 of 65
60. Question
Your Site Reliability (SRE) team members manage an application deployed in three regions. The application is deployed on Managed Instance Groups placed behind a global HTTP(S) Load balancer. You are applying a critical security patch to the Compute Engines. You successfully patch the instances in the first 2 regions, but you made an error in the patching of the third region which causes requests to that region to fail. You want to mitigate the impact of unsuccessful patching on users. What should you do?
Correct
Options A and B are incorrect. These options try to fix the problem immediately with no guarantee of solving it, thereby increasing the Mean Time to Repair (MTTR). Option C is incorrect. Increasing the number of instances does not mitigate the current incident because the new instances will have the same error as the other instances in the Managed Instance Group. Option D is CORRECT. The recommended approach is to make the system work as well as it can under the circumstances. This gives you time to fix the errors in region 3 and apply a new patch. References https://sre.google/sre-book/effective-troubleshooting/ https://cloud.google.com/load-balancing/docs/enabling-connection-draining
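One way to stop sending traffic to the failing region while keeping the healthy regions serving is to detach that region's instance group from the global backend service. The resource names below are hypothetical; this is a sketch, not the question's exact answer text.

```shell
# Hypothetical names. Remove the failing region's instance group from
# the global backend service; existing connections are drained
# according to the backend service's connection draining timeout:
gcloud compute backend-services remove-backend web-backend-service \
    --instance-group=web-mig-region3 \
    --instance-group-region=us-east1 \
    --global

# After fixing and re-applying the patch, restore the backend:
gcloud compute backend-services add-backend web-backend-service \
    --instance-group=web-mig-region3 \
    --instance-group-region=us-east1 \
    --global
```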
Question 61 of 65
61. Question
Your team manages a financial application for an organisation. You have been given a requirement to preserve the logs from the application for 10 years as part of a compliance process. Logs will be reviewed once a year. What is the most cost-effective way to achieve this?
Question 62 of 65
62. Question
Your company has several Google Cloud projects. As part of the CI/CD pipeline, it has a project where automated Compute Engine and Docker image creation is done. Users in the development, staging and production projects require access to the images created for deployments. Following the principle of least privilege, which IAM role would you need to assign to users to achieve this?
Correct
Option A is CORRECT. Assign the compute.imageUser role to users in the project where the images are created. Option B is incorrect. This role is too permissive. Options C and D are incorrect. The role is assigned in the project where the images are created. Reference https://cloud.google.com/compute/docs/images/image-management-best-practices
Question 63 of 65
63. Question
You are developing a mobile application for a financial institution. A key security requirement is that application passwords are changed frequently. The application will comprise two parts; the frontend deployed on Google Kubernetes Engine and the database is Google Cloud SQL. You need a secure way to pass the database credentials to the application at runtime and also meet the security requirement. How can you achieve this following best practice?
Correct
Options A and B are incorrect. These do not follow best practice. Storing credentials in the application is not recommended, and injecting the credentials into the application is also not recommended because the credentials get stored in the application code. Option C is incorrect. You currently cannot configure secret rotation via the console. Option D is CORRECT. Secret rotation policies can only be configured through the API or gcloud commands. Reference https://cloud.google.com/secret-manager/docs/secret-rotation
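As a sketch of configuring rotation through gcloud (the point the explanation makes is that this cannot be done in the console), the commands below create a Secret Manager secret with a rotation schedule. The secret name, timestamps, and Pub/Sub topic are hypothetical.

```shell
# Hypothetical names and dates. Create a secret with a rotation policy;
# rotation reminders are published to the given Pub/Sub topic, which
# can trigger the process that changes the database password:
gcloud secrets create db-password \
    --replication-policy=automatic \
    --next-rotation-time='2025-01-01T00:00:00Z' \
    --rotation-period='2592000s' \
    --topics='projects/my-project/topics/secret-rotation'

# Store the current credential as a new secret version:
printf 's3cr3t' | gcloud secrets versions add db-password --data-file=-
```

The GKE frontend then reads the latest secret version at runtime instead of carrying credentials in its code or image.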
Question 64 of 65
64. Question
Your company has decided to migrate from on-premises to Google Cloud. The first environment to be migrated is the development and testing environments. Currently each environment is fully documented, consists of a network with 3 subnets, several firewall rules, routes, VMs, Storage, Databases and DNS. The environments need to be consistent and immutable. Following best practice, how would you deploy the environments and make them reproducible with little overhead?
Correct
Options A, C and E are incorrect. These do not follow the best practice of automating infrastructure creation for reproducibility. Option B is CORRECT. This is Google’s best practice for creating Infrastructure as Code. The templates can be version controlled with very minimal overhead. Option D is incorrect. This would introduce the overhead of managing the Cloud Function. References https://cloud.google.com/deployment-manager/docs/quickstart https://cloud.google.com/docs/terraform
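The Infrastructure-as-Code approach the explanation endorses can be sketched with Deployment Manager: the whole environment (network, subnets, firewall rules, VMs, and so on) is described in one version-controlled configuration and deployed identically per environment. The file and deployment names below are hypothetical.

```shell
# Hypothetical names. Deploy the documented environment from a single
# version-controlled configuration file:
gcloud deployment-manager deployments create dev-environment \
    --config=environment.yaml

# Reproduce an identical, consistent environment for testing from the
# same configuration:
gcloud deployment-manager deployments create test-environment \
    --config=environment.yaml
```

Because both environments come from the same template, they stay consistent, and rebuilding one is a single command rather than a manual checklist.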
Question 65 of 65
65. Question
You are working on a new application development for a gambling company. The application will utilize a microservices architecture to allow for loose coupling of the different components. You are using Cloud Build to build the Docker images. You have tested the build locally using the local builder, but when you try to run the build in Cloud Build it fails. Which of the following could be the problem?