Papers

ODIN: An OmniDirectional INdoor Dataset Capturing Activities of Daily Living From Multiple Synchronized Modalities

We present ODIN, a large-scale multi-modal dataset for human behavior understanding using top-view omnidirectional cameras. It captures real-life indoor scenarios with synchronized modalities, including RGB, infrared, and depth images, egocentric videos, physiological signals, and 3D scans. Notably, ODIN provides camera-frame 3D human pose estimates for omnidirectional images, a first in the field.

A new benchmark for group distribution shifts in hand grasp regression for object manipulation. Can meta-learning raise the bar?

Hand-object pose estimation has diverse applications in computer vision. Methods developed on balanced datasets may not perform well in real-world scenarios. We introduce a benchmark for handling pose distribution shifts and propose meta-learning for adaptation. The meta-learning approach improves over the baseline but faces optimization challenges. Our analysis offers guidance for future benchmark work.

Image generation for efficient neural network training in autonomous drone racing

Traditional gate detection methods in autonomous drone racing struggle under varying conditions. This work proposes a semi-synthetic dataset that combines real backgrounds with 3D renders to train convolutional neural networks for gate detection.
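
As a rough illustration of how a real background and a 3D render might be combined into a labeled training image, the following minimal sketch alpha-blends an RGBA render over an RGB photo and derives a bounding-box label from the render's alpha channel. It is not the pipeline used in the paper, which this summary does not describe; the function names `composite_render` and `gate_bbox`, the RGBA input format, and the bounding-box labeling are all assumptions made for illustration.

```python
import numpy as np


def composite_render(background: np.ndarray, render_rgba: np.ndarray) -> np.ndarray:
    """Alpha-blend an RGBA render over an RGB background of the same size.

    Hypothetical helper: assumes uint8 arrays shaped (H, W, 3) and (H, W, 4).
    """
    if background.shape[:2] != render_rgba.shape[:2]:
        raise ValueError("background and render must share height and width")
    alpha = render_rgba[..., 3:4].astype(np.float32) / 255.0  # (H, W, 1) in [0, 1]
    foreground = render_rgba[..., :3].astype(np.float32)
    blended = alpha * foreground + (1.0 - alpha) * background.astype(np.float32)
    return np.rint(blended).astype(np.uint8)


def gate_bbox(render_rgba: np.ndarray):
    """Return (x_min, y_min, x_max, y_max) of the non-transparent pixels, or None.

    Hypothetical labeling step: the rendered object is assumed to be the only
    non-transparent content in the render.
    """
    ys, xs = np.nonzero(render_rgba[..., 3] > 0)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


if __name__ == "__main__":
    # Stand-ins for a real photo and a rendered gate (assumed data, not from the paper).
    background = np.full((4, 4, 3), 100, dtype=np.uint8)
    render = np.zeros((4, 4, 4), dtype=np.uint8)
    render[1:3, 1:3] = [200, 10, 10, 255]  # opaque 2x2 patch standing in for a gate

    image = composite_render(background, render)
    print(image.shape, gate_bbox(render))
```

Because the label comes from the same render that is pasted into the image, every composite is annotated without manual effort, which is the usual appeal of semi-synthetic data.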