We tackle the task of Ego-Exo Object Correspondence, recently proposed in Ego-Exo4D. Given object queries from one perspective (e.g., the ego view), the task is to predict the corresponding object masks in the other perspective (e.g., the exo view). Solving this task unlocks new possibilities in VR and robotics, e.g., enabling virtual agents or robots to perform ego-view actions by learning from exo-view demonstrations.
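For concreteness, here is a minimal sketch of the task interface; the function name, argument shapes, and mask format are illustrative assumptions, not the official Ego-Exo4D API:

```python
# Illustrative interface for Ego-Exo Object Correspondence (assumed, not official).
import numpy as np

def ego_exo_correspondence(ego_image: np.ndarray,       # (H, W, 3) ego-view frame
                           ego_query_mask: np.ndarray,   # (H, W) binary mask of the queried object
                           exo_image: np.ndarray         # (H', W', 3) exo-view frame
                           ) -> np.ndarray:
    """Predict the binary mask of the same object in the exo view.

    Returns an (H', W') binary mask; an all-zero mask indicates the object
    is not visible from the exo viewpoint.
    """
    raise NotImplementedError  # placeholder: a concrete model fills this in
```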
Despite the importance of this task, most existing segmentation models (e.g., Mask2Former, SAM, LISA) operate on single-view inputs, making their direct application to this cross-view setting nontrivial. To address this, we propose ObjectRelator.
Ego2Exo is used as the example direction in the framework figure. Our method builds on the PSALM baseline (pink blocks) and tailors it for Ego-Exo Object Correspondence with two novel modules: Multimodal Condition Fusion (MCFuse) and Cross-View Object Alignment (XObjAlign). A rough sketch of both modules is given below; for full details, please refer to our paper.
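The following PyTorch sketch illustrates the two ideas under simplifying assumptions: a gated fusion of text features into the visual query tokens for MCFuse, and a cosine consistency loss between ego- and exo-view object embeddings for XObjAlign. Module signatures and details are illustrative, not the exact implementation (see the paper and code for that):

```python
# Simplified sketch of MCFuse and XObjAlign (assumed query-token representation as in PSALM).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MCFuse(nn.Module):
    """Multimodal Condition Fusion (illustrative): gate text features into the
    visual query tokens so language provides extra localization cues."""
    def __init__(self, dim: int):
        super().__init__()
        self.text_proj = nn.Linear(dim, dim)  # project pooled text embedding to query space
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, visual_queries: torch.Tensor, text_feat: torch.Tensor) -> torch.Tensor:
        # visual_queries: (B, N, D) object query tokens; text_feat: (B, D) pooled text embedding
        text = self.text_proj(text_feat).unsqueeze(1).expand_as(visual_queries)
        g = self.gate(torch.cat([visual_queries, text], dim=-1))  # per-token fusion weight
        return visual_queries + g * text                          # fused condition tokens

def xobj_align_loss(ego_obj: torch.Tensor, exo_obj: torch.Tensor) -> torch.Tensor:
    """Cross-View Object Alignment (illustrative): pull the ego- and exo-view
    embeddings of the same object together with a cosine consistency loss."""
    ego_obj = F.normalize(ego_obj, dim=-1)  # (B, D) ego-view object embeddings
    exo_obj = F.normalize(exo_obj, dim=-1)  # (B, D) exo-view object embeddings
    return (1.0 - (ego_obj * exo_obj).sum(dim=-1)).mean()
```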
We highlight that: 1) Results are reported on the Val set because the ground truth of the test set is not released. 2) We construct a "Small TrainSet" (1/3 of the data) and a "Full TrainSet"; both splits are released for the community and are especially friendly to groups with limited GPU/storage budgets. 3) Our method clearly outperforms the baselines and competitors.
Visualization results show that: 1) MCFuse improves object localization by using text as an extra prompt; 2) XObjAlign makes the model more robust to large view shifts.
More: We also adapt HANDAL-X, a benchmark featuring robot-friendly objects, as an additional testbed for cross-view object segmentation. For detailed results and more visualizations, please refer to our paper.
@inproceedings{fu2024objectrelator,
title={ObjectRelator: Enabling Cross-View Object Relation Understanding in Ego-Centric and Exo-Centric Videos},
author={Fu, Yuqian and Wang, Runze and Ren, Bin and Sun, Guolei and Gong, Biao and Fu, Yanwei and Paudel, Danda Pani and Huang, Xuanjing and Van Gool, Luc},
booktitle={ICCV},
year={2025}
}
@article{fu2025cross,
title={Cross-View Multi-Modal Segmentation @ Ego-Exo4D Challenges 2025},
author={Fu, Yuqian and Wang, Runze and Fu, Yanwei and Paudel, Danda Pani and Van Gool, Luc},
journal={arXiv preprint arXiv:2506.05856},
year={2025}
}