Evaluating Motion Planning Performance

Metrics, Tools, Datasets, and Experimental Design

October 23, 2022

IEEE/RSJ IROS 2022 - Room 12/J

Kyoto, Japan

Motion planning research has produced a plethora of techniques with distinct strengths and weaknesses, and widespread applications in different areas of robotics, such as autonomous driving, mobile manipulation, and locomotion. However, the field largely lacks standardized datasets and performance metrics for comparison. As a result, researchers resort to developing their own ad hoc experimental designs, which can be time-consuming, prone to bias, and narrow in scope. This makes direct comparison of approaches against the state of the art difficult. Additionally, the integration of machine learning methods with motion planners has further increased the demand for large common training datasets with a rich distribution of problems. This workshop is concerned with the challenges of providing reliable, consistent, and comparable performance evaluation of motion planning.

This workshop will bring together robotics researchers and practitioners in academia and industry interested in motion planning. The workshop program will include invited talks, panel discussions, and presentation of contributed papers through lightning talks and a poster session. The panels will examine two core topics: reproducible experimental design and informative evaluation metrics. The experimental design panel will center around dataset creation and tools for benchmarking, and will assess the current state of motion planner experimentation. The evaluation metrics panel will discuss the different performance metrics and criteria that are currently in use to evaluate planners and consider other potential metrics. In order to facilitate the above goals, we will collect and showcase a set of tools, documentation, and datasets provided by the community that will be made available on the workshop website. The documentation will include introductory guides on integrating new motion planners with existing infrastructure, creating new motion planning datasets, and using new or existing datasets for benchmarking planners.


The workshop schedule may change as we draw closer to the event.

Time (GMT+9) Session Speaker(s) Video
09:00 - 09:15 Opening remarks Organizers Video
Session 1: Reproducible Experimental Design
09:15 - 09:35 Learning Where to Trust Unreliable Dynamics Models for Motion Planning Dmitry Berenson University of Michigan
09:35 - 09:55 Evaluating Motion Planning “in-the-Loops” Xuesu Xiao George Mason University Video
09:55 - 10:15 Intro to Experiment Design for Motion Planning Anca Dragan UC Berkeley Video
10:15 - 10:35 Lessons for Benchmarking from Learning Motion Planners Adithya Murali & Clemens Eppner NVIDIA Robotics Video
10:35 - 10:50 Coffee break
10:50 - 11:35 Panel on reproducible experimental design Speakers 1-4 Video
11:35 - 12:25 Lightning talks Paper presenters Video
12:25 - 13:25 Lunch
13:25 - 14:30 Poster session/Coffee Paper presenters
Session 2: Performance and Evaluation Metrics
14:30 - 14:50 Hyperparameter Optimization as a Tool for Motion Planning Algorithm Selection Mark Moll PickNik Robotics Video
14:50 - 15:10 Nonparametric Statistical Evaluation of Sampling-Based Motion Planning Algorithms Jonathan Gammell Oxford Robotics Institute Video
15:10 - 15:30 Comparable Performance Evaluation in Motion Planning under Uncertainty Hanna Kurniawati Australian National University Video
15:30 - 15:50 Benchmarking for Motion Planning Applications Andreas Orthey Realtime Robotics Video
15:50 - 16:05 Coffee break
16:05 - 16:50 Panel on performance and evaluation metrics Speakers 5-8 Video
16:50 - 17:00 Closing remarks Organizers


  • Prof. Dmitry Berenson

    Learning Where to Trust Unreliable Dynamics Models for Motion Planning

    Abstract: This talk will present our recent work on using unreliable dynamics models for motion planning. Given a dynamics model, our method determines where in state-action space that model can be trusted and then plans trajectories with statistical guarantees of validity. I will end with some thoughts on how such methods can be evaluated.

    Bio: Dmitry Berenson is an Associate Professor in the Electrical Engineering and Computer Science Department of the University of Michigan. His research focuses on algorithms that allow robots to interact with the world through general-purpose learning, motion planning, and manipulation. He is interested in the entire pipeline of algorithm development, from creating novel algorithms to proving their theoretical properties, evaluating them on physical systems, and distributing them to open-source communities.

  • Prof. Xuesu Xiao

    Evaluating Motion Planning “in-the-Loops”

Abstract: When evaluating motion planning performance, most researchers treat motion planning as a standalone problem: given accurate world representations as input and reliable controllers to execute the planning output, how does the planner perform on planning-specific metrics such as optimality, convergence, clearance, time, and complexity? However, such evaluation may not directly reflect how the planner would perform when used in a robot system or an entire robot fleet. In this talk, we will situate motion planners in two nested loops: the inner development (or sense-plan-act) loop, which is usually plagued by inaccurate world representations, imperfect actuation, uncertain real-world conditions, and sometimes unmodeled human interactions; and the outer deployment loop, where most motion planners interface with real-world tasks, mostly in industrial settings, and go through a sequence of testing, integration, deployment, and improvement or rollback. We will discuss why it is necessary to situate motion planning evaluation in both loops, from both an academic and an industrial viewpoint. We will also discuss our vision for bridging the gap between robotics researchers and practitioners when evaluating motion planning performance, with the aim of jointly creating highly capable motion planners that work in the real world at scale.

Bio: Xuesu Xiao is an Assistant Professor in the Department of Computer Science at George Mason University. Xuesu (Prof. XX) directs the RobotiXX lab, in which researchers (XX-Men) and robots (XX-Bots) work together at the intersection of motion planning and machine learning, with a specific focus on robustly deployable field robotics. Xuesu is also a Roboticist at Everyday Robots, a company born from X, the moonshot factory, working alongside teams at Google to build robots that can learn by themselves to help anyone with (almost) anything.


  • Prof. Anca Dragan

    Intro to Experiment Design for Motion Planning

    Abstract: I'll go over the basics of what experiment design is, what makes a good experiment and what to watch out for, as well as my own opinions for how to evaluate motion planning performance. I'll draw on examples from my own early grad school days, when I was working on motion planning, but learning about user studies and what principles can be ported over to evaluate algorithmic contributions in robotics. I hope it will be helpful!

Bio: Anca Dragan is an Associate Professor in the EECS Department at UC Berkeley. Her goal is to enable robots to work with, around, and in support of people. She runs the InterACT Lab, which focuses on algorithms for human-robot interaction. She also helped found the Berkeley AI Research (BAIR) Lab, serves on its steering committee, and is a co-PI of the Center for Human-Compatible AI. Anca has been honored with a Sloan Fellowship, the MIT TR35, the Okawa award, an NSF CAREER award, and the PECASE award.


  • Dr. Adithya Murali

    Lessons for Benchmarking from Learning Motion Planners

    Abstract: The ingredients needed for benchmarking motion planning and those required to effectively train motion planners are very similar. In this talk, we focus on several insights we gained from our efforts to learn motion planners via large-scale simulation, which can be used to help build community-wide benchmarks.

Bio: Adithya Murali is a research scientist at the NVIDIA Seattle Robotics Lab. He received his PhD from The Robotics Institute at Carnegie Mellon University, where he was advised by Abhinav Gupta and supported by the Uber Presidential Fellowship. During his PhD, he also worked at NVIDIA and spent time at Facebook, Amazon, LBL, and NTU. His general interests are in robotic manipulation, perception, and learning, as well as in building robot systems.


  • Dr. Clemens Eppner

    Lessons for Benchmarking from Learning Motion Planners

    Abstract: The ingredients needed for benchmarking motion planning and those required to effectively train motion planners are very similar. In this talk, we focus on several insights we gained from our efforts to learn motion planners via large-scale simulation, which can be used to help build community-wide benchmarks.

    Bio: Clemens Eppner is a Research Scientist in the Seattle Robotics Lab at NVIDIA Research. He is particularly interested in problems surrounding robotic grasping and manipulation, including aspects of planning, control, and perception. Before joining NVIDIA, he received his Ph.D. at the Robotics and Biology Lab at TU Berlin.


  • Dr. Mark Moll

    Hyperparameter Optimization as a Tool for Motion Planning Algorithm Selection

Abstract: Over the years, many motion planning algorithms have been proposed. It is often unclear which algorithm is best suited for a particular class of problems, and the problem is compounded by the fact that algorithm performance can be highly dependent on parameter settings. This talk shows that hyperparameter optimization is an effective tool for both algorithm selection and parameter tuning over a given set of motion planning problems. We present different loss functions for optimization that capture different notions of optimality. The approach is evaluated on a broad range of scenes using two different manipulators, a Fetch and a Baxter. We show that optimized planning algorithm performance significantly improves upon baseline performance and generalizes broadly, in the sense that performance improvements carry over to problems that are very different from the ones considered during optimization.
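As a rough illustration of this idea (not code from the talk or from HyperPlan), the sketch below runs random search over a small planner/parameter space against a fixed problem set, scoring each configuration with a loss that mixes failure rate and mean solve time. The planner names, parameter ranges, and the simulated run_planner benchmark are all invented for the example; a real setup would invoke actual planners on real scenes.

```python
import random

# Toy stand-in for a benchmark run: returns (solved, solve_time) for one
# planner configuration on one problem. The timing model below is invented
# purely so the example is self-contained and runnable.
def run_planner(planner, step_size, problem_seed):
    rng = random.Random(sum(map(ord, planner)) * 1000 + problem_seed)
    base = {"RRTConnect": 0.3, "PRM": 0.6}[planner]
    solve_time = base + abs(step_size - 0.2) + rng.uniform(0.0, 0.5)
    return solve_time < 1.0, solve_time

def loss(config, problems, timeout=1.0):
    """Loss mixing failure rate and mean solve time over a fixed problem set."""
    times, failures = [], 0
    for seed in problems:
        solved, t = run_planner(config["planner"], config["step_size"], seed)
        if solved:
            times.append(t)
        else:
            failures += 1
            times.append(timeout)  # penalize failures with the timeout
    return failures / len(problems) + sum(times) / len(times)

def random_search(n_trials, problems, rng):
    """Keep the configuration with the lowest loss over n_trials random draws."""
    best_cfg, best_loss = None, float("inf")
    for _ in range(n_trials):
        cfg = {"planner": rng.choice(["RRTConnect", "PRM"]),
               "step_size": rng.uniform(0.01, 0.5)}
        trial_loss = loss(cfg, problems)
        if trial_loss < best_loss:
            best_cfg, best_loss = cfg, trial_loss
    return best_cfg, best_loss

best, best_loss = random_search(50, problems=list(range(20)), rng=random.Random(0))
print(best, round(best_loss, 3))
```

Evaluating every candidate on the same fixed problem set (common random numbers) is what makes the comparison between configurations fair; in practice one would substitute a smarter optimizer (e.g., Bayesian optimization) for the random-search loop.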

Bio: Mark Moll is the Director of Research at PickNik, a robotics software development and consultancy company that supports the MoveIt motion planning framework. He has worked in robotics for more than 20 years, with a focus on motion planning. For most of that time he was a senior research scientist in the Computer Science Department at Rice University, where he led the development of the Open Motion Planning Library (OMPL), which is widely used in industry and academic research (often via MoveIt/ROS). He has over 80 peer-reviewed publications with research contributions in applied algorithms for problems in robotics and computational structural biology. He has extensive experience deploying novel algorithms on a variety of robotic platforms, ranging from NASA's Robonaut 2 to autonomous underwater vehicles and self-reconfigurable robots.


  • Dr. Jonathan Gammell

    Nonparametric Statistical Evaluation of Sampling-Based Motion Planning Algorithms

Abstract: Sampling-based algorithms are important approaches in motion planning. Their use of sampling sequences allows them to quickly find good solutions to many problems but makes their individual performance nondeterministic. Algorithms are instead evaluated by running independent trials and performing appropriate statistical analysis on the distribution of results. This talk will share best practices for the statistical evaluation of anytime and non-anytime sampling-based planning algorithms. It will present performance metrics and associated nonparametric confidence intervals that quantify planner performance without making any assumptions about the underlying distribution of results. These metrics allow algorithms to be evaluated in a fair and meaningful way, and implementations are available in the open-source Planner Developer Tools (PDT).
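To make the nonparametric approach concrete, here is a small distribution-free sketch (illustrative only, not the PDT implementation): given n independent solve times, a confidence interval for the median is formed from a symmetric pair of order statistics, with coverage computed exactly from the Binomial(n, 1/2) distribution rather than from any parametric assumption about the solve-time distribution.

```python
from math import comb

def median_confidence_interval(samples, confidence=0.95):
    """Distribution-free confidence interval for the median.

    Uses the classical order-statistic construction: the number of samples
    below the true median is Binomial(n, 1/2), so the interval
    [x_(k), x_(n-k+1)] has exact coverage P(k <= B <= n-k).
    """
    xs = sorted(samples)
    n = len(xs)
    # Cumulative Binomial(n, 1/2) probabilities: cdf[i] = P(B <= i).
    cdf = [sum(comb(n, i) for i in range(j + 1)) / 2**n for j in range(n + 1)]
    # Coverage shrinks as the interval narrows, so take the largest k
    # (narrowest interval) that still meets the requested confidence.
    k = 0
    for j in range(1, n // 2 + 1):
        coverage = cdf[n - j] - cdf[j - 1]
        if coverage >= confidence:
            k = j
        else:
            break
    if k == 0:
        raise ValueError("too few samples for the requested confidence level")
    return xs[k - 1], xs[n - k]  # order statistics x_(k) and x_(n-k+1)

# Example: median solve times (seconds) from 10 hypothetical trials.
times = [0.8, 1.1, 0.9, 1.4, 1.0, 0.7, 1.2, 1.3, 0.95, 1.05]
print(median_confidence_interval(times))  # → (0.8, 1.3)
```

Because the coverage comes from exact binomial probabilities, the achieved confidence is typically slightly above the requested level, and the construction raises an error when n is too small to reach it at all.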

    Bio: Jonathan Gammell is a Departmental Lecturer in Robotics at the Oxford Robotics Institute (ORI). He leads the Estimation, Search, and Planning (ESP) research group which seeks to develop and exploit better understandings of fundamental robotic problems. He holds a Ph.D. and M.A.Sc. in Aerospace Science & Engineering from the University of Toronto (UTIAS) and a B.A.Sc. in Mechanical Engineering (Co-op) with a Physics Option from the University of Waterloo. Jonathan is a dedicated 'full-stack' roboticist with extensive experience solving real-world problems with robotic hardware and software. He has deployed autonomous systems around the world on a variety of projects.


  • Prof. Hanna Kurniawati

    Comparable Performance Evaluation in Motion Planning under Uncertainty

    Abstract: Uncertainty is ubiquitous and increases the challenge of creating reliable and comparable performance evaluation in motion planning. In this talk, I will review some of these challenges. I will also discuss our effort in developing a software tool to alleviate the above difficulties for motion planning under uncertainty, specifically when uncertainty is caused by non-deterministic effects of actions and errors in sensors and sensing.

Bio: Hanna Kurniawati is an Associate Professor and CS Futures Fellow at the Research School of Computer Science, Australian National University (ANU), as well as a Chief Investigator and executive member of the major interdisciplinary project Humanising Machine Intelligence. Her research focuses on algorithms that turn robust decision theory into practical software tools, with applications in robotics and the assurance of autonomous systems. Such tools will enable robots to design their own strategies, such as deciding what data to use, how to gather it, and how to move, in order to accomplish various tasks well despite modelling errors, multiple types of uncertainty, and limited to no information about the system and its operating environment.


  • Dr. Andreas Orthey

    Benchmarking for Motion Planning Applications

Abstract: Benchmarking is often considered a crucial component in evaluating the quality of a motion planning algorithm. However, most motion planning applications require different benchmarking frameworks and tools to be evaluated properly. I will showcase this on three motion planning applications: spot welding, bin picking, and rearrangement planning. Because these applications have different requirements, we not only observe different planning behavior, but can often significantly improve upon existing planners by designing tailor-made solutions for specific benchmark datasets. This has ramifications not only for how we should design benchmark datasets, but also for how we develop novel planning algorithms.

    Bio: Andreas Orthey is a staff robotics scientist at Realtime Robotics. He was previously a postdoctoral researcher with Marc Toussaint at the Max Planck Institute for Intelligent Systems (MPI-IS).


Workshop Papers

Arena-Bench: A Benchmarking Suite for Obstacle Avoidance Approaches in Highly Dynamic Environments

PDF / Code

Linh Kästner, Teham Bhuiyan, Tuan Anh Le, Elias Treis, Boris Meinardus, Johannes Cox, Bassel Fatloun, Niloufar Khorsandi, and Jens Lambrecht

Towards Reliable Benchmarking for Multi-Robot Planning in Realistic, Cluttered and Complex Environments

PDF / Code

Luigi Palmieri, Simon Schaefer, Lukas Heuer, Niels van Duijkeren, Alexandre Kleiner, Ruediger Dillmann, and Sven Koenig

Local Planner Bench: Benchmarking for Local Motion Planning

PDF / Code

Max Spahn, Chadi Salmi, and Javier Alonso-Mora

Benchmarking Sampling-, Search-, and Optimization-based Approaches for Time-Optimal Kinodynamic Mobile Robot Motion Planning

PDF / Code

Wolfgang Hoenig, Joaquim Ortiz-Haro, and Marc Toussaint

HuNavSim: A ROS2 Human Navigation Simulator for Benchmarking Human-Aware Robot Navigation

PDF / Code

Noé Pérez-Higueras, Luis Merino, Fernando Caballero, and Roberto Otero

The MiniCity: A 1/10th Scale Evaluation Platform for Testing Autonomous Urban Perception and Planning


Noam Buckman, Alex Hansen, Sertac Karaman, and Daniela Rus

A Clinical Dataset for the Evaluation of Motion Planners in Medical Applications

PDF / Code

Inbar Fried, Ron Alterovitz, and Jason Akulian

Datasets and Benchmarking of a Path Planning Pipeline for Planetary Rovers

PDF / Code

Mallikarjuna Vayugundla, Moritz Kuhne, Armin Wedler, and Rudolph Triebel

Planner Developer Tools (PDT): Reproducible Experiments and Statistical Analysis for Developing and Testing Motion Planners

PDF / Code

Jonathan Gammell, Marlin Strub, and Valentin N Hartmann

Towards Rich, Portable, and Large-Scale Pedestrian Data Collection

PDF / Code

Allan Wang, Abhijat Biswas, Henny Admoni, and Aaron Steinfeld

Pedestrian-Robot Interactions on Autonomous Crowd Navigation: Dataset and Metrics

PDF / Code

Diego Paez Granados, Yujie He, David Gonon, Lukas Huber, and Aude Billard

The MoveIt Benchmark Suite for Whole-Stack Planner Evaluation

PDF / Code

Michael Görner, David Pivin, Francois Michaud, and Jianwei Zhang

Towards Benchmarking Sampling-Based Kinodynamic Motion Planners with ML4KP

PDF / Code

Edgar Granados, Aravind Sivaramakrishnan, and Kostas Bekris

Evaluating Guiding Spaces for Motion Planning


Amnon D Attali, Stav Ashur, Isaac Burton Love, Courtney McBeth, James Motes, Diane Uwacu, Marco Morales, and Nancy Amato


OMPL: The Open Motion Planning Library

PDF / Code

Ioan A. Şucan, Mark Moll, and Lydia E. Kavraki

Robowflex: Robot Motion Planning with MoveIt Made Easy

PDF / Code

Zachary Kingston, and Lydia E. Kavraki

MotionBenchMaker: A Tool to Generate and Benchmark Motion Planning Datasets

PDF / Code

Constantinos Chamzas, Carlos Quintero-Peña, Zachary Kingston, Andreas Orthey, Daniel Rakita, Michael Gleicher, Marc Toussaint, and Lydia E. Kavraki

PlannerArena: Benchmarking Motion Planning Algorithms

PDF / Code

Mark Moll, Ioan A. Şucan, and Lydia E. Kavraki

BARN: Benchmarking Metric Ground Navigation

PDF / Code

Daniel Perille, Abigail Truong, Xuesu Xiao, and Peter Stone

DynaBARN: Benchmarking Metric Ground Navigation in Dynamic Environments

PDF / Code

Anirudh Nair, Fulin Jiang, Kang Hou, Zifan Xu, Shuozhe Li, Xuesu Xiao, and Peter Stone

Bench-MR: A Motion Planning Benchmark for Wheeled Mobile Robots

PDF / Code

Eric Heiden, Luigi Palmieri, Leonard Bruns, Kai O. Arras, Gaurav S. Sukhatme, and Sven Koenig

HyperPlan: Motion Planning Hyperparameter Optimization

PDF / Code

Mark Moll, Constantinos Chamzas, Zachary Kingston, and Lydia E. Kavraki

On-line POMDP Planning Toolkit (OPPT)

PDF / Code

Marcus Hoerger, Hanna Kurniawati, and Alberto Elfes


We have received endorsements from the following technical committees:

IEEE RAS TC on Software Engineering for Robotics and Automation

IEEE RAS TC on Mobile Manipulation

IEEE RAS TC on Verification of Autonomous Systems

IEEE RAS TC on Algorithms for Planning and Control of Robot Motion

IEEE RAS TC on Performance Evaluation & Benchmarking of Robotic and Automation Systems



Please contact the organizers at motionplanningworkshop@gmail.com with any questions.