AI/HPC System Performance Engineer

Meta | Menlo Park, United States | Posted June 01, 2026

Job Description

**Summary:**
Meta is building some of the world's largest AI and high-performance computing infrastructure to power next-generation AI research and products. As an AI/HPC System Performance Engineer on the Network Infrastructure Engineering team, you will drive end-to-end performance characterization, bottleneck analysis, and optimization of large-scale AI training and inference clusters. In this role, you will work at the intersection of network fabric design, distributed computing, and AI workload behavior to ensure Meta's HPC systems deliver maximum throughput and efficiency for frontier model development.
**Required Skills:**
AI/HPC System Performance Engineer Responsibilities:
1. Profile and benchmark AI training and inference workloads across large-scale HPC clusters to identify network, compute, and memory bottlenecks
2. Develop and maintain performance analysis frameworks and dashboards to track system-level metrics including GPU utilization, network bandwidt...
