Senior Site Reliability Engineer
| Posting date: | 14 November 2025 |
|---|---|
| Salary: | £57,946 to £78,517 per year |
| Hours: | Full time |
| Closing date: | 30 November 2025 |
| Location: | NE98 1YX |
| Company: | Government Recruitment Service |
| Job type: | Permanent |
| Job reference: | 435928/5 |
Summary
We are looking for Senior Site Reliability Engineers (SREs) to join one of our SRE teams at the heart of Digital Transformation.
As a Senior Site Reliability Engineer, you will drive adoption of SRE best practice across our cloud estate.
Using both your soft skills and technical experience, you will work with teams to ensure our standards and governance are met when onboarding services into the cloud through a dedicated assessment stage-gate process. This in turn ensures our citizen-facing applications satisfy all the operational and security requirements for running in production.
Please note this role requires you to pass Security Check clearance. For further information, please see 'Selection process details'.
As a Senior Site Reliability Engineer, you will play a pivotal role in ensuring the reliability and performance of our applications and infrastructure.
As part of the SRE team, you will work with application teams across the department to develop reliable and secure solutions for citizens across the UK.
You will lead by example, providing technical direction and supporting other SREs within your team.
You will work with development teams from the design phase to help them use good practice and department standards when building their application infrastructure.
Additionally, responsibilities of the role will include:
- Design and develop techniques for improving application reliability, run books, knowledge transfer across teams, and ongoing SRE strategy within your Functional and Professional Communities.
- Work collaboratively with development teams, provide guidance on best practice, and ensure application monitoring is enabled.
- Push a mindset change within the organisation to foster engineering ownership, SRE best practice and the importance of the integrity and maintenance of the Live Service.
- Manage the error budget agreed with the product owner for the application and ensure that work is balanced in alignment with it.
- Act as the focal point for the investigation and resolution of major or complex incidents for the service, ensuring people with the right skills and expertise are proactively available to respond effectively.
- Assess the impact of change requests in consultation with stakeholders, providing technical expertise and authorising the implementation of subsequent changes.
- Coach and mentor application development and operations engineers in the practice and techniques of SRE.
- Conduct reviews of all high-priority and major incidents, ensuring they are completed quickly and published.
- Routinely seek views and capture improvement ideas from stakeholders and team members, and encourage collaboration and innovation.
- Provide on-call support to help restore services, using dedicated run books or your technical experience.
- Help to reduce toil and increase automation, improving reliability while reducing the time to live and the cost of repetitive tasks.
Proud member of the Disability Confident employer scheme