Lead Site Reliability Engineer
Posting date: | 21 March 2025 |
---|---|
Salary: | £72,664 to £89,995 per year |
Additional salary information: | For further details on salary, please see the 'Benefits' section. |
Hours: | Full time |
Closing date: | 02 April 2025 |
Location: | FY4 5ES |
Company: | Government Recruitment Service |
Job type: | Permanent |
Job reference: | 395389/1 |
Summary
Are you someone who has excellent stakeholder management and problem-solving skills?
Do you enjoy finding the root cause of a problem and building automated solutions to ensure it doesn’t happen again?
If so, we’d love to hear from you.
As a Lead Site Reliability Engineer (SRE), you will drive the adoption of SRE best practices across the teams you work with.
You will collaborate with application development and operations engineers in the practice of Site Reliability Engineering.
You will be accountable for the reliability of the applications you support.
Working with Delivery Managers, Product Managers, and other SREs as part of a multidisciplinary team, you will actively manage the work backlog and develop reliability improvements. You will also lead initiatives to automate low-value tasks while balancing project delivery demands.
You will provide technical leadership to wider operational teams, along with technical oversight of the products and services they support.
You will help to develop and support the engineers in your team, introducing new technologies or practices to improve team knowledge, skills, and capability.
As a Lead Site Reliability Engineer, you will play a pivotal role in ensuring the reliability and performance of our applications and infrastructure.
Key Responsibilities:
- Lead by example, provide technical direction, and support the development and progression of SREs within your team.
- Work across multiple teams as an engineering specialist, implementing organizational engineering standards.
- Support teams in building reusable, repeatable, observable, and reliable infrastructure.
- Design and develop techniques for improving application reliability, including run books, knowledge transfer, and ongoing SRE strategy within the wider engineering community.
- Collaborate with teams to investigate and resolve major or complex incidents, ensuring the right skills and expertise are available to respond effectively.
- Assess the impact of change requests in consultation with stakeholders, providing technical expertise and advice.
There will be a contractual requirement to join an “on-call” rota providing night cover from 18:00 to 08:00, with occasional weekend shifts from 08:00 to 18:00 on Saturday or Sunday. The cover is shared across the team and would normally equate to one shift per week.
Proud member of the Disability Confident employer scheme