In this role, you will contribute to the design, development, and optimization of high-performance GPU-accelerated algorithms for advanced numerical methods and computer vision applications. You will work closely with domain experts and engineering teams to translate complex mathematical models into efficient, scalable, vectorization-friendly solutions, taking ideas from Python prototypes to production-grade C++ implementations. You will help improve application performance on modern NVIDIA GPU architectures and influence technical decisions on CUDA kernel optimization, mixed-precision computing, task-based runtimes, and multi-GPU execution.