Workshop Scope
URVIS investigates how unified perception can emerge from heterogeneous sensing. We explore architectures and training strategies that allow modalities to guide, compensate for, or regularize one another under partial or missing signals. Key themes include:
- Modalities: RGB, depth, LiDAR, event, audio, and tactile cues
- Adaptive fusion: dynamic weighting by reliability and context (see the sketch below)
- Sensor budgets: few calibrated sensors vs. many redundant ones
- Robustness: inference under dropouts, occlusions, or asynchronous data
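As a purely illustrative example of the adaptive-fusion theme, the sketch below gates modality embeddings by a learned reliability score and masks out modalities that are missing at inference time. It is not an official baseline or challenge code; the class name `ReliabilityGatedFusion`, the feature dimension, and the modality ordering are assumptions made for this sketch, which assumes PyTorch is available.

```python
# Minimal sketch (illustrative only): reliability-gated fusion of modality
# embeddings that tolerates missing sensors at inference time.
import torch
import torch.nn as nn


class ReliabilityGatedFusion(nn.Module):
    def __init__(self, num_modalities: int, feat_dim: int):
        super().__init__()
        # One scalar reliability score per modality, predicted from its features.
        self.gates = nn.ModuleList(
            [nn.Linear(feat_dim, 1) for _ in range(num_modalities)]
        )

    def forward(self, feats, present):
        # feats:   list of (B, feat_dim) embeddings, one per modality
        # present: (B, num_modalities) boolean mask, False where a sensor dropped out
        scores = torch.stack(
            [g(f).squeeze(-1) for g, f in zip(self.gates, feats)], dim=-1
        )                                                      # (B, M)
        scores = scores.masked_fill(~present, float("-inf"))   # ignore missing modalities
        weights = torch.softmax(scores, dim=-1)                # context-dependent weights
        fused = torch.einsum("bm,mbd->bd", weights, torch.stack(feats))
        return fused


# Usage: fuse RGB + depth + LiDAR embeddings with the depth sensor missing.
if __name__ == "__main__":
    B, D = 4, 256
    fusion = ReliabilityGatedFusion(num_modalities=3, feat_dim=D)
    feats = [torch.randn(B, D) for _ in range(3)]
    present = torch.tensor([[True, False, True]] * B)
    print(fusion(feats, present).shape)  # torch.Size([4, 256])
```

The softmax over per-modality scores realizes context-dependent weighting, while the `-inf` masking lets the same model degrade gracefully when a sensor drops out rather than requiring a separate network per sensor configuration.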
Call for Papers
We invite original research contributions on multisensor and multimodal perception, learning, and reasoning for robotics and embodied AI. The workshop aims to bring together researchers working on the integration, alignment, and exploitation of heterogeneous sensory signals for robust and intelligent robotic systems.
Topics of interest include, but are not limited to:
- Multisensor and multimodal data fusion (early, late, hybrid, adaptive fusion)
- Cross-modal alignment, correspondence, and synchronization
- Multimodal representation learning and embedding spaces
- Vision-based perception for robotics and autonomous systems
- RGB-D, RGB-T, event-based, LiDAR, radar, audio, tactile, and proprioceptive sensing
- Event cameras, neuromorphic sensing, and spiking neural networks
- Multimodal foundation models and large-scale pretraining
- Self-supervised, weakly supervised, and unsupervised multimodal learning
- Multimodal tracking, detection, segmentation, and 3D understanding
- Sensor calibration, registration, and geometric consistency
- Uncertainty modeling, probabilistic fusion, and reliability-aware perception
- Learning under missing, degraded, or noisy modalities
- Cross-domain and cross-sensor generalization
- Online, continual, and lifelong multimodal learning
- Multimodal perception for manipulation, navigation, and human-robot interaction
- Multimodal SLAM, mapping, and localization
- Multimodal perception for autonomous driving and mobile robots
- Efficient, real-time, and resource-aware multimodal systems
- Simulation-to-real transfer and multimodal domain adaptation
- Benchmark datasets, evaluation protocols, and reproducible research
We welcome regular paper submissions following the official CVPR Workshops submission pipeline. Accepted papers will be published in the CVPRW Proceedings and must comply with the standard CVPRW formatting guidelines.
All submissions will undergo double-blind peer review to ensure fairness and rigor.
Challenges
Adverse-to-the-eXtreme Panoptic Segmentation
Multisensor (RGB, LiDAR, radar, event camera, IMU/GNSS) panoptic segmentation under clear and adverse conditions using MUSES data.
Object Tracking in the Real World
Train with partial pairing; deploy with full or missing modalities.
TBD
Challenge pending confirmation
Schedule
| Time | Event |
|---|---|
| 08:30 | Welcome and Opening Remarks |
| 08:40 | Challenge Report 1 |
| 08:50 | Challenge Report 2 |
| 09:00 | Invited Talk 1 |
| 09:30 | Invited Talk 2 |
| 10:00 | Coffee Break |
| 10:20 | Invited Talk 3 (TBD) |
| 10:30 | Oral Presentation 1 |
| 10:40 | Oral Presentation 2 |
| 10:50 | Challenge Report 3 |
| 11:00 | Invited Talk 4 |
| 11:30 | Invited Talk 5 |
| 12:00 | Awards and Closing Remarks |
Speakers
Organizers
Sponsors
Special Issue
Based on paper quality, authors of selected papers may be invited to submit extended versions to a special issue of Computer Vision and Image Understanding (CVIU, Elsevier).