Senior HPC Site Reliability Engineer

NVIDIA | Yokneam, Israel | Posted May 27, 2026

Job Description

We are now looking for a Senior HPC Site Reliability Engineer to join our mission and continue improving our HPC infrastructure. A meaningful part of NVIDIA’s strength is our unique and advanced development tools and environments that enable our incredible pace of innovation. We are looking for architects to help us evolve the way our private compute cloud is architected and optimized.

What you will be doing:

+ Provide leadership in the design and implementation of our large-scale compute cloud that enables the world's top chip modelers, designers, and deep learning experts to invent groundbreaking technology.

+ Identify architectural changes or entirely new approaches to our cloud architecture and design.

+ Help with strategic challenges we encounter, including: effective resource utilization in a heterogeneous compute environment, evolving our private/public cloud strategy, capacity modeling, and planning for multi-year growth and scaling acr...
