AI Infrastructure Engineer
BMW Group
Munich, Germany
What awaits you?
- You will architect cloud and hardware solutions to scale AI workloads across GPUs and accelerators, optimizing storage and networking for maximum throughput and cost control.
- You will engineer and operate end-to-end AI systems, focusing on fine-tuning and scalable serving of modern AI models.
- You will support the MLOps layer by building tools for reliable deployment, real-time monitoring, and the continuous improvement of models throughout their lifecycle.
- You will design and build scalable data pipelines and "flywheels" that ensure high-quality data availability and enable efficient feedback loops for continuous learning.
- You will design robust AI services and lead their integration into production-ready platforms, ensuring stability and performance.
What should you bring along?
- Bachelor’s or Master’s degree in Computer Science or Systems Engineering, or equivalent practical experience.
- Hands-on experience with public cloud providers (AWS and Azure) and with managing high-performance GPU clusters (e.g. configuring NVIDIA drivers, CUDA versions, and interconnects and communication libraries such as InfiniBand and NCCL).
- Strong programming skills in Python with a focus on building infrastructure-as-code environments, APIs, CI/CD pipelines, and observability tools.
- ML framework proficiency: operational knowledge of PyTorch, specifically how to optimize it for distributed training (using tools like Ray and Slurm).
- Open source contributions or relevant certifications are a plus.
Don't forget to mention EuroTechJobs when applying.