URVIS 2026 Challenge: MUSES-AXPS — Adverse-to-the-eXtreme Panoptic Segmentation

Multi-sensor (RGB, LiDAR, radar, event camera, IMU/GNSS) panoptic segmentation under clear and adverse conditions using MUSES-based data.
The challenge is available on Codabench: https://codabench.org/competitions/13395/

Challenge objectives

  • Panoptic segmentation tasks under clear and adverse conditions.
  • Evaluation with an Official Score that emphasizes robustness and generalization.
  • Availability of four different sensor modalities (RGB, LiDAR, radar, event camera).

Introduction

Current perception pipelines for automated driving deliver strong performance in clear-weather conditions, yet they still struggle under adverse-to-extreme scenarios. This challenge targets panoptic segmentation on the MUSES multi-sensor dataset across clear and adverse-to-extreme conditions (rain, fog, snow; day & night).

Why Robustness Matters

Leveraging multiple sensors is crucial for reliable perception in autonomous driving, as each modality offers complementary strengths and weaknesses. For example, radar remains reliable in fog and heavy rain where LiDAR returns degrade, and event cameras preserve contrast at night where standard RGB frames struggle.

Dataset Description

The challenge is based on MUSES, a multi-sensor dataset designed for dense 2D panoptic segmentation under varying levels of uncertainty. It provides synchronized recordings from several complementary modalities, including an RGB frame camera, a LiDAR, a radar, an event camera, and GNSS/IMU signals. This combination enables studying robust fusion strategies and understanding how different sensors compensate for each other under difficult conditions.

MUSES covers a wide range of environmental conditions, including clear weather, rain, fog, and snow, as well as both daytime and nighttime scenes. While clear-weather data is included, the official scoring system focuses primarily on adverse-to-extreme situations, reflecting the need for resilient perception in real-world autonomous driving scenarios.

The dataset is organized into multiple directories: frame camera, LiDAR, radar, event camera, panoptic ground truth, calibration files, reference frames, and more. During the Validation Phase, input data is fully available and ground truth may be partially released; in the Test Phase, all labels are hidden and performance is assessed through the submission interface.
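For orientation, here is a minimal Python sketch of how one sample's synchronized files might be grouped by modality. The root path, directory names, and file-name pattern below are illustrative assumptions, not the actual MUSES layout; the authoritative structure and file formats are documented on the Codabench page.

    from pathlib import Path

    # Hypothetical dataset root and modality directories; the real MUSES
    # layout is described on the Codabench challenge page.
    ROOT = Path("muses")
    MODALITIES = ("frame_camera", "lidar", "radar", "event_camera")

    def sample_paths(sample_id: str) -> dict:
        """Map each modality to the file(s) recorded for one sample ID."""
        return {m: sorted((ROOT / m).glob(f"{sample_id}*")) for m in MODALITIES}

A loader built this way keeps the modalities decoupled, so a fusion model can drop or swap individual sensors without touching the rest of the pipeline.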

Additional details, including file specifications, the complete directory structure, download instructions, and dataset statistics, can be found on the Codabench challenge page: https://codabench.org/competitions/13395/

Evaluation Metrics

We introduce a new Official Score designed to reward models that perform strongly under adverse-weather conditions while still maintaining competitive accuracy in clear scenarios. To support this, we extend the standard PQ, RQ, and SQ metrics into weather-aware counterparts (wPQ, wRQ, wSQ), each of which weights adverse-weather samples more heavily to better reflect real-world robustness. The final leaderboard ranking is determined exclusively by the wPQ score.
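To make the weighting idea concrete, the following Python sketch shows one plausible way a weather-weighted score could be aggregated from standard panoptic statistics. The per-condition weights are placeholder assumptions; the official wPQ/wRQ/wSQ definitions and weights are those given on the Codabench page.

    # Sketch of a weather-weighted panoptic score. The WEIGHTS values are
    # PLACEHOLDERS for illustration only; the official weighting is defined
    # by the challenge organizers on Codabench.

    def pq_rq_sq(iou_sum: float, tp: int, fp: int, fn: int):
        """Standard panoptic metrics for one subset: PQ = SQ * RQ."""
        denom = tp + 0.5 * fp + 0.5 * fn
        rq = tp / denom if denom > 0 else 0.0   # recognition quality
        sq = iou_sum / tp if tp > 0 else 0.0    # segmentation quality
        return sq * rq, rq, sq

    # Hypothetical weights emphasizing adverse conditions over clear weather.
    WEIGHTS = {"clear": 1.0, "rain": 2.0, "fog": 2.0, "snow": 2.0}

    def weighted_average(per_condition: dict) -> float:
        """Aggregate a per-condition metric (e.g. PQ values into a wPQ-style score)."""
        total = sum(WEIGHTS[c] for c in per_condition)
        return sum(WEIGHTS[c] * v for c, v in per_condition.items()) / total

Under such a scheme, a model that trades a little clear-weather accuracy for large adverse-weather gains ranks above one that does the reverse, which is exactly the behavior the Official Score is meant to reward.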

Challenge Deadlines

Phase 1 — Validation

February 6 - March 3, 2026

Training and validation data are released, and the validation server becomes active.

Phase 2 — Test

March 4-10, 2026

Participants run their final models on the public test data. Only three submissions are allowed.

Phase 3 — Paper Submission

March 30, 2026

Teams submit papers describing their approaches for the challenge report.

Note: Our submission and final leaderboard deadlines are aligned with the official publication schedule of the ECCV conference.