A new dataset promises to accelerate the development of autonomous vehicle (AV) technology by providing researchers with a wealth of previously unavailable real-world driving data captured from multiple vehicles over repeated trips.
The MARS (MultiAgent, multitraveRSal, and multimodal) dataset, introduced by researchers from NYU Tandon School of Engineering in collaboration with autonomous vehicle company May Mobility, offers a unique combination of features that sets it apart from previous efforts in the field.
The NYU Tandon team presented a paper on MARS last month at the IEEE/CVF Computer Vision and Pattern Recognition (CVPR) Conference, the premier annual computer vision event. The MARS dataset is publicly available.
"Datasets for autonomous vehicle research have typically come from a single vehicle's one-time pass of a certain location. MARS offers many more opportunities because it captures real interactions between multiple AVs traveling over fixed routes hundreds of times, including repeated passes through the same locations under varying conditions," said Chen Feng, the lead researcher on the project. Feng is an NYU Tandon assistant professor working on computer vision for autonomous vehicles and mobile robots, whose NSF CAREER award funded this project.
The dataset – curated by Feng's Automation and Intelligence for Civil Engineering (AI4CE) Lab and May Mobility engineers – was collected using a May Mobility fleet of four autonomous Toyota Sienna Autono-MaaS vehicles that operated within a 20-kilometer zone encompassing residential, commercial, and university areas in a U.S. city.
May Mobility's FleetAPI subscription service allows access to real-time and historical data from its vehicles. This gives data partners like NYU Tandon access to real-world information including sensor data (LiDAR, camera), GPS/IMU, vehicle state, and more.
"The MARS dataset allows us to study both how multiple vehicles can collaboratively perceive their surroundings more accurately, and how vehicles can build up a detailed understanding of their environment over time," said Feng. "We could not have assembled it without the unprecedented access May Mobility provided us to its large-scale real-world data. The result is a significant step towards autonomous vehicles being a safe and reliable reality on our roads. Moreover, the collaboration sets a precedent for industry-academic partnerships that benefit the entire field."
"We believe that transparency and data-sharing can do much more than help our customers, it can help the next generation of innovators to push boundaries and come up with their own big ideas," said Dr. Edwin Olson, CEO and co-founder of May Mobility. "As we continue to build bridges with academia, their research will pave the way to more innovation at May Mobility and throughout the AV industry as a whole."
NYU Tandon began planning with May Mobility in November 2022. Since then, NYU Tandon researchers have worked closely with May Mobility's engineering teams to access the studied fleet group's daily operational sensor data, from which they selected more than 1.4 million frames of synchronized sensor data. This included scenarios where multiple vehicles encountered each other on the road, providing valuable data on how autonomous vehicles might cooperate and communicate in the future.
One of the most significant aspects of MARS is its "multitraversal" nature. May Mobility's engineers and the NYU Tandon researchers identified 67 specific locations along the route and collected data from thousands of passes through these areas at different times of day and in varying weather conditions.
"This repeated observation of the same locations is crucial for developing more robust perception and mapping algorithms," said Yiming Li, the first author of this paper and a PhD student in Feng's lab who recently won the prestigious NVIDIA Graduate Fellowship. "It allows us to study how autonomous vehicles might use prior knowledge to enhance their real-time understanding of the world around them."
The release of MARS comes at a time when the autonomous vehicle industry is pushing to move beyond controlled testing environments and navigate the complexities of real-world driving.
Because the dataset is collected from multiple commercial vehicles in actual use – not from vehicles deployed expressly for data collection, from single autonomous vehicles, or from data simulations – it can play a uniquely vital role in training and validating the artificial intelligence systems that power AVs.
To demonstrate the dataset's potential, the NYU Tandon team conducted initial experiments in visual place recognition and 3D scene reconstruction. These tasks are fundamental to an AV's ability to locate itself and understand its surroundings.
"MARS is a powerful example of the very best in industry-academia collaboration. Collecting data from our real-world operations opens new avenues for autonomous driving research in collaborative perception, unsupervised learning, and high-fidelity simulations," said Dr. Fiona Hua, Director of Autonomy Perception at May Mobility. "We're just scratching the surface of what's possible with this dataset and look forward to the possibilities that develop as we work hand-in-hand with researchers to solve current and future challenges in autonomous driving."
The collaboration on and release of the MARS dataset builds on NYU Tandon's broader efforts to create safer mobility and improve the accuracy and effectiveness of autonomous driving algorithms, a commitment May Mobility shares.
Feng previously worked on a project that compiled a dataset of more than 200,000 outdoor images to test a range of visual place recognition (VPR) technologies, with the aim of improving navigation in complex urban settings.
Earlier this year, NYU Tandon was selected for the National Artificial Intelligence Research Resource (NAIRR) Pilot by the U.S. National Science Foundation and Department of Energy for another project that advances autonomous vehicle research. Led by Associate Professor Chinmay Hegde, NYU Tandon's NAIRR project will develop techniques to safely deploy advanced AI models in autonomous vehicles, focusing on systems that process both visual and textual data.
About May Mobility