Job Description
- Ensure system reliability, stability, and performance of production environments.
- Handle incident management, change management, and problem management activities.
- Perform root cause analysis and provide permanent fixes for system issues.
- Develop automation scripts using Python or other scripting languages to reduce manual effort.
- Monitor production systems using tools like Splunk, AppDynamics, or Datadog.
- Support deployment, release management, and environment configuration for UAT and production.
- Collaborate with development, database, network, and infrastructure teams for issue resolution.
- Maintain documentation for support processes, troubleshooting guides, and release procedures.
- Manage L2/L3 production support activities and ensure SLA adherence.
- Work on continuous improvement of system reliability and operational efficiency.
- Participate in on-call support during weekdays and weekends when required.