🥝 Senior Service Reliability Engineer
ThoughtWorks | Singapore, Singapore | Posted May 26, 2026
Job Description
Job responsibilities
You will improve site reliability by building mechanisms/architectures that enable fault tolerance and faster median time to respond and median time to detect.
You will drive the integration of observability automation into the CI/CD pipeline.
You will handle production incidents, manage incident communication with clients and draft root cause analysis documents.
You will monitor performance of production systems and improve their scaling to ensure business goals are met within expected SLA and SLO metrics.
You will work closely with application development teams as advisors on improving system reliability and assisting in implementation for reliability improvements.
You will improve system observability across multiple facets such as logging and metrics, reducing false alarms to eliminate unnecessary toil and improving process efficiency.
You will implement chaos engin...