Senior Site Reliability Engineer
EPAM Systems, Inc. | Remote (work from home), Mexico | Posted June 05, 2026
Job Description
Join our team as a **Senior Site Reliability Engineer** focused on delivering advanced support for critical Azure-based systems.
**Responsibilities**
- Troubleshoot and resolve complex incidents to maintain system uptime
- Ensure reliability and performance of Azure-based enterprise infrastructure
- Implement observability, monitoring, and logging solutions
- Automate infrastructure provisioning and deployment using Terraform and scripting
- Optimize system performance and uptime through proactive monitoring and alerting
- Collaborate with cross-functional teams to improve service reliability
- Conduct root cause analysis and postmortems for incident management
- Manage deployment pipelines in Azure DevOps for secure and scalable workflows
- Develop and maintain automation scripts for routine tasks and incident recovery
- Enhance monitoring frameworks with tools like Prometheus and Grafana
- React quickly to incidents to avoid SLA degradation
- In...