
Site Reliability Engineer Responsibilities

  • Lead initiatives with a small, focussed team to materially improve reliability maturity in a predominantly on-prem environment.
  • Apply Azure Well-Architected Reliability principles pragmatically within hybrid and legacy constraints.
  • Define and embed SLIs, SLOs, and reliability targets for critical services and paths.
  • Identify systemic failure patterns in legacy .NET Framework code and prioritise remediation based on risk reduction.
  • Improve monitoring, alert quality, and operational visibility across application, infrastructure, and database layers.
  • Strengthen incident response processes, runbooks, and post-incident learning.
  • Work with developers to improve resiliency patterns within legacy code (retry logic, error handling, graceful degradation).
  • Reduce operational toil through targeted automation (PowerShell, scripting, pipeline improvements).
