Research theme
MARL and GNN foundations
Building the algorithmic foundations for multi-agent learning — graph neural network architectures, scalable MARL algorithms, and principled tools for controlling team diversity.
We build the methodological foundations that make multi-agent learning work at scale. On the architectural side, we design graph neural networks that let agents reason over their teammates as a structured, variable-sized neighborhood rather than a flat observation — supporting generalization to team sizes and topologies unseen at training time. Our ModGNN framework and generalized f-mean aggregation formalize how information should be combined across these neighborhoods.
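To make the idea of aggregating over a variable-sized neighborhood concrete, here is a minimal sketch of the classical generalized f-mean (quasi-arithmetic mean), f⁻¹((1/n) Σ f(xᵢ)). It only illustrates the underlying definition: the learnable parametrization used in the paper is not reproduced here, and the names `f_mean` and `power_mean` are hypothetical helpers chosen for this example.

```python
import math
from typing import Callable, Sequence


def f_mean(values: Sequence[float],
           f: Callable[[float], float],
           f_inv: Callable[[float], float]) -> float:
    """Quasi-arithmetic mean: apply f, average, then map back with f_inv."""
    if not values:
        raise ValueError("f_mean needs at least one value")
    return f_inv(sum(f(x) for x in values) / len(values))


def power_mean(values: Sequence[float], p: float) -> float:
    """f-mean with f(x) = x**p; assumes positive inputs and p != 0."""
    return f_mean(values, lambda x: x ** p, lambda y: y ** (1.0 / p))


# Features received from a neighborhood of three teammates (illustrative values).
neighbors = [1.0, 2.0, 4.0]

arithmetic = f_mean(neighbors, lambda x: x, lambda y: y)  # f = identity
geometric = f_mean(neighbors, math.log, math.exp)         # f = log
harmonic = power_mean(neighbors, -1.0)                    # f = 1/x
near_max = power_mean(neighbors, 50.0)                    # large p approaches max
```

The same function works unchanged for any neighborhood size and does not depend on the order of the neighbors, which is the property that lets an agent handle teams of different sizes. Different choices of `f` recover familiar aggregators such as the arithmetic, geometric, and harmonic means, and large exponents approach the maximum.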
On the learning side, we develop MARL algorithms that tackle the exploration, credit assignment, and non-stationarity problems that make multi-agent learning hard, and we package them into BenchMARL, a standardized library used across the community to benchmark MARL methods under a common interface. A recurring theme is behavioral diversity: when and why identical agents benefit from learning distinct roles, and how to control that diversity to a target value rather than leaving it to chance.
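As a toy illustration of what controlling diversity to a target value can mean, the sketch below measures diversity as the mean pairwise Euclidean distance between agents' actions for one observation, then rescales each agent's deviation from the team mean so that the measure equals a chosen target. This is an assumption-laden stand-in, not the metric or method of the papers listed on this page: the distance choice, the deterministic actions, and the names `diversity` and `rescale_to_target` are all hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence


def diversity(actions: Sequence[Sequence[float]]) -> float:
    """Mean pairwise Euclidean distance between agents' action vectors."""
    if len(actions) < 2:
        raise ValueError("diversity needs at least two agents")
    pairs = list(combinations(actions, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def rescale_to_target(actions: Sequence[Sequence[float]],
                      target: float) -> List[List[float]]:
    """Scale each agent's deviation from the team mean to hit `target`."""
    n, dim = len(actions), len(actions[0])
    mean = [sum(a[k] for a in actions) / n for k in range(dim)]
    current = diversity(actions)
    if current == 0.0:
        # Identical agents have no deviation to scale; return them unchanged.
        return [list(a) for a in actions]
    scale = target / current
    return [[mean[k] + scale * (a[k] - mean[k]) for k in range(dim)]
            for a in actions]


# Three agents' actions for the same observation (illustrative values).
actions = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
adjusted = rescale_to_target(actions, target=0.5)
```

Because every pairwise distance scales linearly with the deviations from the mean, a single scale factor is enough to reach the target while leaving the team's average action unchanged.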
Recent papers
- Controlling Behavioral Diversity in Multi-Agent Reinforcement Learning (ICML 2024)
- Extending robot minds through collective learning (Science Robotics, 2025)
- Generalised f-Mean Aggregation for Graph Neural Networks (NeurIPS 2023)
- Reinforcement Learning with Fast and Forgetful Memory (NeurIPS 2023)
- Heterogeneous multi-robot reinforcement learning (AAMAS 2023)
- Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling (ICLR 2025)
- When Is Diversity Rewarded in Cooperative Multi-Agent Learning? (ICLR 2026)
- Pairwise is Not Enough: Hypergraph Neural Networks for Multi-Agent Pathfinding (ICLR 2026)
- Remotely Detectable Robot Policy Watermarking (ICLR 2026)
- Graph Attention-Guided Search for Dense Multi-Agent Pathfinding (AAAI 2026)
- Language-Conditioned Offline RL for Multi-Robot Navigation (ICRA 2025)
- No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes (NeurIPS 2025)
- Provably Safe Online Multi-Agent Navigation in Unknown Environments (CoRL 2024)
People
- Amanda Prorok
- Antonio Marino
- Jan Blumenkamp
- Jasmine Bayrooti
- Lorenzo Magnino
- Maksymilian Wolski
- Manon Flageat
- Mateusz Odrowaz-Sypniewski
- Michael Amir
- Rishabh Jain