Operations Resilience Manager (IT Service Continuity/Capacity Management)
Posting date: | 19 August 2025 |
---|---|
Salary: | £42,641 to £45,081 per year |
Hours: | Full time |
Closing date: | 02 September 2025 |
Location: | FY4 5ES |
Company: | Government Recruitment Service |
Job type: | Permanent |
Job reference: | 421766/1 |
Summary
We’re looking for a proactive and knowledgeable Operations Resilience Manager to join DWP Digital, where you’ll play a key role in ensuring the performance, availability, and stability of live IT services. You’ll lead on IT Service Continuity Management and Capacity & Performance processes, helping to prevent or minimise disruptions caused by service loss or disaster.
You’ll work across technical teams and stakeholders to identify and manage risks, develop recovery and resilience plans, and ensure capacity aligns with business needs. Using your enterprise-level IT expertise, you’ll analyse trends, forecast threats, and maintain reporting tools to support data-driven decision-making.
This is a fantastic opportunity to influence service resilience in one of the UK’s largest government departments.
As an Operations Resilience Manager, you will ensure the stability, performance, and continuity of DWP’s digital services. You’ll manage capacity across digital environments, assessing and forecasting risks to live service operations and their impact on the wider IT estate.
You’ll assure and report on the resilience and disaster recovery capability of all IT services, including maintaining the on-premises data centre recovery plan. Working closely with technical and delivery leads, you’ll align service continuity with business needs, drive performance improvements, and identify opportunities for enhanced resilience.
Your role includes:
- Creating and maintaining dashboards and reports to monitor performance, trends and emerging issues.
- Coordinating with internal and external stakeholders to ensure value for money and adherence to Non-Functional Requirements and KPIs.
- Advising on disaster recovery (DR) failover, data recovery testing, and capacity planning, tailoring solutions to individual service needs.
- Providing guidance throughout the delivery lifecycle to ensure governance and policy compliance.
- Reviewing operational resilience with Business Service Owners, managing action plans and identifying improvements.
- Supporting major incident recovery efforts, ensuring minimal business impact and timely communication.
- Evaluating service risks and issues across contracts, finance, technology, and support arrangements.
- Facilitating incident management from a capacity and DR perspective, ensuring swift and safe service restoration.
You’ll focus on customer objectives, service excellence, and a zero-tolerance approach to production outages, ensuring DWP’s digital services remain robust, responsive, and aligned with business priorities.
Participation in out-of-hours on-call incident support may be required for this role as part of a rota. However, this will not be expected immediately, and full training and support will be provided in advance.
Proud member of the Disability Confident employer scheme