DevOps & SRE Engineer

Manus AI | Singapore, Singapore | Posted June 05, 2026

Job Description

Key Responsibilities

  • Cluster Operations & Management
      • Manage and maintain container clusters (Kubernetes, Docker) and open-source component clusters (Kafka, Redis, Elasticsearch) across multiple business units
      • Ensure optimal performance, scalability, and reliability of distributed systems

  • Infrastructure Platform Development
      • Design, build, and enhance infrastructure operation platforms
      • Develop and maintain systems for infrastructure management, CI/CD pipelines, monitoring/alerting, and centralized logging
      • Drive platform standardization and automation initiatives

  • High Availability & Reliability
      • Ensure maximum uptime for production services through proactive monitoring and incident response
      • Continuously optimize service architecture, deployment strategies, and operational processes
      • Implement and maintain SLA/SLO frameworks and reliability...
