Principal Engineer, Machine Learning, SMAI
1100 Micron SemiAsiaOP Pte Ltd | Singapore, Singapore | Posted June 11, 2026
Job Description
Responsibilities
- Architect and execute large‑scale custom model training and fine‑tuning jobs (SFT, RLHF) on multi‑node, multi‑GPU clusters.
- Optimize training throughput and memory efficiency using distributed training strategies (FSDP, DeepSpeed, Megatron‑LM) and mixed‑precision techniques (FP16/BF16).
- Design and develop autonomous AI agents capable of multi‑step reasoning, planning, and tool execution to automate complex manufacturing workflows.
- Implement agentic frameworks (e.g., LangChain, LangGraph, CrewAI) to orchestrate LLM interactions with internal APIs, databases, and software tools.
- Profile and debug GPU performance bottlenecks using tools like Nsight Systems or PyTorch Profiler to maximize hardware utilization.
- Build and maintain data/solution pipelines that feed machine learning models and GenAI applications.
- Design and optimize data structures in data management systems (Snowflake and Google Cloud p...