Site Reliability Engineer

Details of the offer

**RESPONSIBILITIES**:
Provide second-level support for customer incidents.
Use your on-call shift to prevent incidents from ever happening.
Run our infrastructure with Google Cloud Platform.
Set up monitoring and alerting to fire on symptoms rather than on outages.
Document every action so your findings turn into repeatable actions, and then into automation.
Improve the deployment process to make it as boring as possible.
Design, build and maintain core infrastructure pieces that allow scaling to support hundreds of thousands of concurrent users.
Debug production issues across services and levels of the stack.
Plan the growth of our infrastructure.

**REQUIREMENTS**:
3+ years total experience handling 24/7 high-availability customer-facing production systems.
Working knowledge of popular cloud platforms, preferably GCP.
Know your way around Linux and the Unix Shell.
Working knowledge of scripting in Bash or Python.
Experience in setting up CI/CD automation such as Jenkins or GitLab CI.
Experience with Docker, Kubernetes, Terraform, or similar technologies.


Nominal Salary: To be agreed

Source: Whatjobs_Ppc
