Challenge Objectives
- Panoptic segmentation tasks under clear and adverse conditions.
- Evaluation with an Official Score emphasizing robustness and generalization.
- Availability of 4 different sensor modalities (RGB, LiDAR, Radar, Event).
Introduction
Current perception pipelines for automated driving deliver strong performance in clear-weather conditions, yet they still struggle under adverse-to-extreme scenarios. This challenge targets panoptic segmentation on the MUSES multi-sensor dataset across clear and adverse-to-extreme conditions (rain, fog, snow; day & night).
Leveraging multiple sensors is crucial for reliable perception in autonomous driving, as the modalities have complementary strengths and weaknesses.
Dataset Description
The challenge is based on MUSES, a multi-sensor dataset designed for dense 2D panoptic segmentation under varying levels of uncertainty. It provides synchronized recordings from several complementary modalities, including an RGB frame camera, a LiDAR, a radar, an event camera, and GNSS/IMU signals. This combination enables studying robust fusion strategies and understanding how different sensors compensate for each other under difficult conditions.
MUSES covers a wide range of environmental conditions, including clear weather, rain, fog, and snow, as well as both daytime and nighttime scenes. While clear-weather data is included, the official scoring system focuses primarily on adverse-to-extreme situations, reflecting the need for resilient perception in real-world autonomous driving scenarios.
The dataset is organized into multiple directories—frame camera, lidar, radar, event camera, panoptic ground truth, calibration files, reference frames, and more. During the Validation Phase, input data is fully available while ground truth may be partially released, whereas in the Test Phase all labels are hidden and performance is assessed through the submission interface.
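The exact folder names and file naming scheme are specified on the Codabench page. As a rough orientation only, a data loader might assemble per-modality paths for one synchronized sample along the lines of the sketch below; all directory names, splits, and frame identifiers in it are illustrative assumptions, not the official layout.

```python
from pathlib import Path

# Hypothetical modality-to-directory mapping; the actual MUSES directory
# names are documented on the Codabench challenge page and may differ.
MODALITY_DIRS = {
    "rgb": "frame_camera",
    "lidar": "lidar",
    "radar": "radar",
    "event": "event_camera",
    "panoptic_gt": "gt_panoptic",
}

def sample_paths(root: str, split: str, scene: str, frame_id: str) -> dict[str, Path]:
    """Collect per-modality file paths for one synchronized recording."""
    base = Path(root)
    return {
        modality: base / dirname / split / scene / frame_id
        for modality, dirname in MODALITY_DIRS.items()
    }

if __name__ == "__main__":
    # Example call with made-up split, scene, and frame identifiers.
    for modality, path in sample_paths("MUSES", "val", "rain_night", "000042").items():
        print(f"{modality:12s} -> {path} (exists: {path.exists()})")
```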
Additional details—file specifications, complete directory structure, download instructions, and dataset statistics—can be found on the Codabench challenge page: https://codabench.org/competitions/13395/
Evaluation Metrics
We introduce a new Official Score designed to reward models that perform strongly under adverse-weather conditions while still maintaining competitive accuracy in clear scenarios. To support this, we extend the standard PQ, RQ, and SQ metrics into their weather-aware counterparts — wPQ, wRQ, wSQ — each emphasizing performance in challenging situations. These metrics weight adverse-weather samples more heavily to better reflect real-world robustness. The final ranking on the leaderboard is determined exclusively by the wPQ score.
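The official formulas are defined by the organizers and computed by the submission evaluation server. Purely as an illustration of the idea, a weather-weighted PQ aggregate could be formed as a weighted average of per-condition PQ with adverse conditions weighted more heavily; the condition names and weights in the sketch below are assumptions, not the official definition.

```python
# Illustrative sketch of a weather-weighted PQ aggregate. The official wPQ,
# wRQ, and wSQ definitions come from the challenge organizers; the condition
# grouping and weights here are assumptions for demonstration only.
CONDITION_WEIGHTS = {
    "clear": 0.5,  # clear weather still counts, but with lower weight
    "rain": 1.0,
    "fog": 1.0,
    "snow": 1.0,
}

def weighted_metric(per_condition: dict[str, float],
                    weights: dict[str, float] = CONDITION_WEIGHTS) -> float:
    """Weighted average of a per-condition metric (e.g. PQ), emphasizing adverse weather."""
    total = sum(weights[c] * value for c, value in per_condition.items())
    norm = sum(weights[c] for c in per_condition)
    return total / norm

if __name__ == "__main__":
    pq_per_condition = {"clear": 62.1, "rain": 48.7, "fog": 51.3, "snow": 45.9}
    print(f"wPQ (illustrative) = {weighted_metric(pq_per_condition):.2f}")
```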
Challenge Deadlines
Phase 1 — Validation
February 6 - March 3, 2026
Training and validation data are released, and the validation server becomes active.
Phase 2 — Test
March 4-10, 2026
Participants run their final models on the public test data. Only three submissions are allowed.
Phase 3 — Paper Submission
March 30, 2026
Teams submit papers describing their approaches for inclusion in the challenge report.