Research
All intelligence is collective intelligence.
Our aim is to extend single-robot capabilities through collaborative strategies. We develop data-driven approaches to hard coordination problems — from teaching agents how to communicate and cooperate, to co-designing the environments they work in — and transfer these policies from simulation to real robots for fast, online decision-making.
Themes
Agile, dense aerial swarms
Dense, agile swarm flight with downwash control, cooperative docking, scalable kinodynamic planning, and outdoor deployment.
Co-design of agents and environments
Making robots smarter by also redesigning the spaces they work in — discovering that adding obstacles and embedding sensors can dramatically improve coordination.
Compositionality in multi-robot AI
Moving beyond monolithic models toward modular, specialized agents that combine at runtime for flexible and scalable multi-robot intelligence.
Decentralized coordination
Scaling multi-agent coordination from two robots to hundreds — without central planners, and transferring those policies from simulation onto physical teams.
Learned communication
Differentiable inter-agent communication — letting robots learn what to share with their teammates directly from the downstream task.
MARL and GNN foundations
Building the algorithmic foundations for multi-agent learning — graph neural network architectures, scalable MARL algorithms, and principled tools for controlling team diversity.
Platforms and tools
Open-source simulators, benchmarking libraries, and custom hardware platforms — VMAS, BenchMARL, RoboMaster, Raven, and Sanity.
Lab videos
Demos, paper presentations, and experiment recordings from our research.
Papers
2026
- Mar 16 · PAPER · npj Robotics · Concrete multi-agent path planning enabling kinodynamically aggressive maneuvers
Keisuke Okumura, Guang Yang, Zhan Gao, Heedo Woo, Amanda Prorok
We present concrete planning, a hybrid approach that captures real-world continuous dynamics while maintaining scalable guaranteed planning via discrete search. The framework integrates advances in robot dynamics learning, optimal control, and anytime complete planning into a modular system deployed with 40 robots — 20 aerial, 8 ground, and 12 obstacle robots — operating in a compact laboratory space.
- Jan 20 · PAPER · ICLR 2026 · When Is Diversity Rewarded in Cooperative Multi-Agent Learning?
Michael Amir, Matteo Bettini, Amanda Prorok
An investigation of when and why behavioral diversity benefits cooperative multi-agent learning, establishing conditions under which heterogeneous policies outperform homogeneous ones.
- Jan 20 · PAPER · ICLR 2026 · Pairwise is Not Enough: Hypergraph Neural Networks for Multi-Agent Pathfinding
Rishabh Jain, Keisuke Okumura, Michael Amir, Pietro Liò, Amanda Prorok
Pairwise interactions captured by standard GNNs miss higher-order dependencies in multi-agent pathfinding. This work introduces hypergraph neural networks that model group-level constraints, improving solution quality for dense navigation scenarios.
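The distinction between pairwise and group-level message passing can be made concrete with a toy two-stage aggregation (nodes to hyperedges, hyperedges back to nodes). This is a hand-rolled illustrative sketch, not the paper's architecture; the features and hyperedges are made up for the example:

```python
import numpy as np

# Features for 5 agents; each hyperedge groups the agents involved in one
# potential multi-robot conflict (hypothetical example).
X = np.arange(10.0).reshape(5, 2)        # 5 nodes, 2 features each
hyperedges = [[0, 1, 2], [2, 3, 4]]      # groups, not just pairs

def hypergraph_layer(X, hyperedges):
    # Two-stage message passing: nodes -> hyperedges -> nodes.
    edge_msgs = [X[e].mean(axis=0) for e in hyperedges]  # group summaries
    out = np.zeros_like(X)
    counts = np.zeros(len(X))
    for e, m in zip(hyperedges, edge_msgs):
        for v in e:       # broadcast each group summary to its members
            out[v] += m
            counts[v] += 1
    counts[counts == 0] = 1               # isolated nodes keep zeros
    return out / counts[:, None]

print(hypergraph_layer(X, hyperedges))
```

Node 2 sits in both hyperedges and receives the average of both three-way group summaries in a single aggregation step — a dependency that a pairwise edge list would have to approximate through multiple rounds of two-body messages.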
- Jan 20 · PAPER · ICLR 2026 · Remotely Detectable Robot Policy Watermarking
Michael Amir, Manon Flageat, Amanda Prorok
A method for embedding remotely detectable watermarks into robot policies, enabling verification of policy provenance without requiring access to the model parameters.
- Nov 2025 · PAPER · AAAI 2026 · Graph Attention-Guided Search for Dense Multi-Agent Pathfinding
Rishabh Jain, Keisuke Okumura, Michael Amir, Amanda Prorok
Historically, learning-based approaches to multi-agent pathfinding have struggled to outperform classical search-based methods. In this work, we introduce a hybrid approach that combines GNNs with a classical search algorithm to achieve performance superior to both purely learning-based and purely classical methods.
2025
- Dec 2025 · PAPER · NeurIPS 2025 · No-Regret Thompson Sampling for Finite-Horizon Markov Decision Processes with Gaussian Processes
Jasmine Bayrooti, Sattar Vakili, Amanda Prorok, Carl Henrik Ek
A no-regret Thompson sampling algorithm for finite-horizon Markov decision processes using Gaussian process models, providing efficient exploration in model-based reinforcement learning settings.
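The core mechanism — sampling one plausible reward function from a Gaussian process posterior and acting greedily on it — can be illustrated in the much simpler one-step (bandit) setting. This is a minimal sketch with a hypothetical reward function and a hand-rolled GP; the paper itself addresses the harder finite-horizon MDP case.

```python
import numpy as np

rng = np.random.default_rng(1)

def rbf(a, b, ls=0.3):
    # Squared-exponential kernel on scalar inputs.
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ls) ** 2)

f_true = lambda x: np.sin(3 * x)        # hidden reward (hypothetical)
actions = np.linspace(0.0, 2.0, 50)     # discretized action set
noise = 0.05
X, y = [], []                           # observed action/reward pairs

for t in range(30):
    if X:
        Xa, ya = np.array(X), np.array(y)
        K_inv = np.linalg.inv(rbf(Xa, Xa) + noise**2 * np.eye(len(Xa)))
        Ks = rbf(actions, Xa)
        mu = Ks @ K_inv @ ya
        cov = rbf(actions, actions) - Ks @ K_inv @ Ks.T
    else:
        mu, cov = np.zeros_like(actions), rbf(actions, actions)
    # Thompson sampling: draw one function from the posterior and play
    # its argmax (jitter keeps the Cholesky factorization stable).
    L = np.linalg.cholesky(cov + 1e-6 * np.eye(len(actions)))
    draw = mu + L @ rng.normal(size=len(actions))
    a = actions[np.argmax(draw)]
    X.append(a)
    y.append(f_true(a) + noise * rng.normal())

# Recommend the argmax of the final posterior mean.
Xa, ya = np.array(X), np.array(y)
K_inv = np.linalg.inv(rbf(Xa, Xa) + noise**2 * np.eye(len(Xa)))
mu = rbf(actions, Xa) @ K_inv @ ya
best = actions[np.argmax(mu)]
```

Because each draw is a full function sample, exploration arises naturally: uncertain regions occasionally produce the sampled maximum and get tried, without any explicit exploration bonus.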
- Sep 2025 · PAPER · Science Robotics · Extending robot minds through collective learning
Amanda Prorok
The current trend toward generalist robot behaviors powered by massive, monolithic AI models is unsustainable. This viewpoint article argues for a paradigm shift toward collective robotic intelligence — a "mixture-of-robots" approach where diverse, specialized agents learn and work together, inspired by natural systems where cooperation among specialized components leads to greater intelligence and resilience.
- Aug 2025 · PAPER · CoRL 2025 · ReCoDe: Reinforcement Learning-based Dynamic Constraint Design for Multi-Agent Coordination
Michael Amir, Guang Yang, Zhan Gao, Keisuke Okumura, Heedo Woo, Amanda Prorok
A hybrid, decentralized framework combining optimization-based control with the adaptability of MARL. Rather than discarding expert controllers, ReCoDe improves them by learning additional dynamic constraints that capture subtler behaviors — for example, constraining agent movements to prevent congestion in cluttered scenarios.
- Jul 2025 · PAPER · IROS 2025 · D4orm: Multi-Robot Trajectories with Dynamics-aware Diffusion Denoised Deformations
Yixiao Zhang, Keisuke Okumura, Heedo Woo, Ajay Shankar, Amanda Prorok
An optimization method for generating kinodynamically feasible and collision-free multi-robot trajectories that exploits an incremental denoising scheme from diffusion models. Evaluated for differential-drive and holonomic teams with up to 16 robots in 2D and 3D worlds.
- May 2025 · PAPER · ICRA 2025 · DVM-SLAM: Decentralized Visual Monocular Simultaneous Localization and Mapping for Multi-Agent Systems
Joshua Bird, Jan Blumenkamp, Amanda Prorok
A decentralized visual monocular SLAM system for multi-agent systems, enabling collaborative mapping and localization without centralized infrastructure.
- May 2025 · PAPER · ICRA 2025 · Language-Conditioned Offline RL for Multi-Robot Navigation
Steven Morad, Ajay Shankar, Jan Blumenkamp, Amanda Prorok
Offline reinforcement learning approach conditioned on natural language commands for decentralized multi-robot navigation, bridging high-level human instructions and low-level robot coordination policies.
- Apr 2025 · PAPER · ICLR 2025 · Efficient Model-Based Reinforcement Learning Through Optimistic Thompson Sampling
Jasmine Bayrooti, Carl Henrik Ek, Amanda Prorok
An efficient model-based reinforcement learning method using optimistic Thompson sampling to balance exploration and exploitation in complex sequential decision-making problems.
- Mar 2025 · PAPER · IEEE Transactions on Robotics · Co-Optimizing Reconfigurable Environments and Policies for Decentralized Multi-Agent Navigation
Zhan Gao, Guang Yang, Amanda Prorok
Treats the environment as a co-decision variable alongside agent policies, jointly optimizing both the physical layout and the decentralized navigation policies for multi-agent systems.
2024
- Dec 2024 · PAPER · JMLR · BenchMARL: Benchmarking Multi-Agent Reinforcement Learning
Matteo Bettini, Amanda Prorok, Vincent Moens
A standardized library for benchmarking multi-agent reinforcement learning, enabling seamless mixing and matching of MARL algorithms, tasks, and models while maintaining rigorous reproducibility and standardization.
- Nov 2024 · PAPER · CoRL 2024 · Provably Safe Online Multi-Agent Navigation in Unknown Environments
Zhan Gao, Guang Yang, Jasmine Bayrooti, Amanda Prorok
A provably safe online method for multi-agent navigation in unknown environments, combining control barrier functions with decentralized planning to guarantee collision avoidance while maintaining liveness.
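The safety mechanism referenced above, a control barrier function (CBF), is easiest to see in one dimension. The sketch below is an illustrative toy, not the paper's method: a single integrator x' = u with safe set {x ≥ 0} and barrier h(x) = x, where the CBF condition h' ≥ -αh reduces to u ≥ -αx and the safety filter has a closed form (no QP solver needed for one affine constraint).

```python
# Minimal CBF safety filter for a 1D single integrator x' = u,
# safe set {x >= 0}, barrier h(x) = x. Illustrative toy example.
def safe_input(u_des, x, alpha=1.0):
    # Project the desired input onto the safe set {u >= -alpha * x}:
    # the closest safe input is simply the clipped value.
    return max(u_des, -alpha * x)

# Simulate an aggressive desired input driving straight at the boundary.
dt, x = 0.05, 1.0
trajectory = [x]
for _ in range(200):
    u = safe_input(-2.0, x)   # desired input is always -2
    x = x + dt * u            # forward-Euler step
    trajectory.append(x)

print(min(trajectory))        # never crosses below zero
```

The filter only intervenes near the boundary: far from it, the desired input passes through unchanged, which is what makes CBFs attractive as minimally invasive safety layers. (The formal guarantee is continuous-time; here the forward-Euler step preserves positivity because 1 - dt·α > 0.)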
- Oct 2024 · PAPER · CoRL 2024 · CoViS-Net: A Cooperative Visual Spatial Foundation Model for Multi-Robot Applications
Jan Blumenkamp, Steven Morad, Jennifer Gielis, Amanda Prorok
A decentralized visual spatial foundation model enabling real-time, platform-agnostic pose estimation and spatial comprehension for autonomous robots. Provides accurate pose estimates and a local bird's-eye view without requiring camera overlap; deployed on wheeled platforms and a quadruped.
- Jul 2024 · PAPER · ICML 2024 · Controlling Behavioral Diversity in Multi-Agent Reinforcement Learning
Matteo Bettini, Ryan Kortvelesy, Amanda Prorok
DiCo (Diversity Control) constrains behavioral diversity to an exact target value of a given metric by representing each agent's policy as the sum of a parameter-shared component and a dynamically scaled per-agent component. The constraint is applied directly to the policy architecture, leaving the learning objective unchanged.
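The shared-plus-scaled-per-agent decomposition can be sketched with linear stand-ins for the policy networks. This is an illustration of the principle only — the metric below (mean pairwise action distance) and the linear policies are assumptions for the sketch, not DiCo's actual architecture or diversity metric:

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, obs_dim, act_dim = 3, 4, 2

# Linear stand-ins for the shared and per-agent policy components.
W_shared = rng.normal(size=(act_dim, obs_dim))
W_agents = rng.normal(size=(n_agents, act_dim, obs_dim))

def policies(obs, scale):
    # DiCo-style decomposition: shared part plus a scaled per-agent part.
    return np.array([W_shared @ obs + scale * (W @ obs) for W in W_agents])

def diversity(actions):
    # Illustrative metric: mean pairwise distance between agents' actions.
    n = len(actions)
    return np.mean([np.linalg.norm(actions[i] - actions[j])
                    for i in range(n) for j in range(i + 1, n)])

obs = rng.normal(size=obs_dim)
target = 1.5

# The shared part cancels in every pairwise difference, so diversity grows
# linearly in `scale`; one measurement yields the exact scaling factor.
scale = target / diversity(policies(obs, 1.0))
print(diversity(policies(obs, scale)))   # hits the target exactly
```

The key point survives the simplification: because per-agent deviations enter additively and are rescaled, the diversity level becomes a controlled quantity rather than an emergent side effect of training.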
- May 2024 · PAPER · DARS 2024 · The Cambridge RoboMaster: An Agile Multi-Robot Research Platform
Jan Blumenkamp, Steven Morad, Jennifer Gielis, Amanda Prorok
An agile multi-robot research platform built on DJI RoboMaster hardware, designed for real-world experiments in decentralized multi-agent coordination, supporting GNN-based policy deployment and sim-to-real transfer.
2023
- Dec 2023 · PAPER · NeurIPS 2023 · Generalised f-Mean Aggregation for Graph Neural Networks
Ryan Kortvelesy, Steven Morad, Amanda Prorok
A generalized f-mean aggregation framework for graph neural networks that subsumes common aggregation functions (sum, mean, max) as special cases while enabling learnable, task-adaptive aggregation strategies.
- Dec 2023 · PAPER · NeurIPS 2023 · Reinforcement Learning with Fast and Forgetful Memory
Steven Morad, Ryan Kortvelesy, Stephan Liwicki, Amanda Prorok
A new memory model for RL that serves as a highly efficient drop-in replacement for RNNs and transformers. Runs up to two orders of magnitude faster with linear space complexity, and sets new records on the POPGym benchmark for partially observable RL.
- May 2023 · PAPER · AAMAS 2023 · Heterogeneous multi-robot reinforcement learning
Matteo Bettini, Ajay Shankar, Amanda Prorok
We study cooperative multi-robot tasks where the team is composed of agents with structurally different observations, action spaces, and reward functions. We introduce HetGPPO, a graph-neural-network-based training paradigm that relaxes parameter sharing to enable heterogeneous behaviors in multi-agent reinforcement learning, achieving superior performance over role-blind baselines.
- May 2023 · PAPER · ICLR 2023 · POPGym: Benchmarking Partially Observable Reinforcement Learning
Steven Morad, Ryan Kortvelesy, Matteo Bettini, Stephan Liwicki, Amanda Prorok
A comprehensive benchmark suite for partially observable reinforcement learning, providing a unified evaluation framework across diverse POMDP environments to compare memory-based RL architectures.
This page highlights recent work. The complete record of publications goes back to 2008.