Warning: This job advert has expired and applications have closed.

Data Reliability Engineer

Job details
Posting date: 26 February 2025
Hours: Full time
Closing date: 28 March 2025
Location: London, EC2M 4AA
Company: NatWest Group
Job type: Permanent
Job reference: R-00250460

Summary

Join us as a Data Reliability Engineer

  • In this key role, you’ll support the improvement of non-functional and operational characteristics such as availability, performance, efficiency, change management, monitoring, security, incident response, and capacity planning of our products and services
  • You’ll enjoy significant stakeholder interaction, working in collaboration with engineers to ensure a principled approach to delivering change in a safe and secure way
  • This is a chance to join an inclusive team with a collaborative ethos and a commitment to innovation and professional development

What you'll do

As our Data Reliability Engineer, you’ll work alongside colleagues and feature team members to meet defined service level objectives and continually improve systems and environments. You’ll proactively contribute new ideas and innovations to meet short-term and longer-term goals while balancing and managing risk.

You’ll also be accountable for the day-to-day health of both production and non-production environments, responding to incidents as required.

A typical day will involve:

  • Providing structure and supporting release processes, suggesting and making improvements where possible
  • Supporting the clear communication and frequent update of incident status to other teams and customers
  • Providing technical expertise and input to establish the risk tolerance of products and services
  • Supporting the maintenance of services once they are live by measuring and monitoring availability, latency, and overall system health
  • Working closely with stakeholders to manage customer incidents and support service management

The skills you'll need

We’re looking for someone with strong knowledge of reliability systems thinking and experience of software engineering. You’ll need experience of using a data-driven and scientific approach to fact finding. We’ll also look for financial services knowledge, and the ability to identify wider business impact, risk and opportunity, and make connections across key outputs and processes.

You'll also need:

  • Expert AWS skills using key AWS tools including EMR, Airflow, DynamoDB, S3, RDS, Elastic Beanstalk, EC2, Kinesis, Lambda, and CloudWatch
  • Strong programming experience with languages and frameworks such as Scala, Spark, Python, shell, and PySpark, along with experience of DevOps tools like Git, Bitbucket, Jenkins, and Artifactory
  • Experience in collaborative data debugging, with a strong ability to debug complex data issues using Splunk, Spark, and CloudWatch audit logs