[Job-offers-cs] Postdoctoral Position - Optimization & Learning for Collective Communication in Datacenters
Pekka Orponen
pekka.orponen at aalto.fi
Sat Apr 19 15:00:20 EEST 2025
Dear All,
The Operations Research Team at Huawei France Research Center
(Boulogne-Billancourt, Paris area) is opening a 12-month postdoctoral
position (with a possible 6-month extension) in the context of the ANR
project Net4AI.
The topic focuses on optimizing collective communication within
datacenters during large language model (LLM) training and inference.
Communication between GPUs during these operations is a major
bottleneck. The objective is to develop optimization and learning-based
approaches to enhance communication efficiency and reduce training time,
by addressing both offline and online decision-making challenges.
Offline, the work involves solving a bi-level optimization problem for
GPU assignment, job scheduling, and routing. Online, reinforcement
learning methods will be investigated to adaptively buffer and balance
tasks based on real-time network conditions.
Keywords: Optimization, Reinforcement Learning, Datacenter Networking,
Collective Communication, LLM Training
This position is part of Huawei's research initiative on next-generation
AI infrastructure and offers the opportunity to collaborate with a
dynamic and multidisciplinary team.
Interested candidates can apply by replying to this email with a
detailed CV, a cover letter, university transcripts, and references.
Kind regards,
Dr. Youcef Magnouche
Huawei France Research Center
youcef.magnouche at huawei.com