James Ross
Academic and research departments
Centre for Vision, Speech and Signal Processing (CVSSP), Faculty of Engineering and Physical Sciences
About
My research project
Computer vision/deep learning for autonomous vehicle navigation and control
Developing novel methods and systems for autonomous vehicle navigation and control, using deep learning with monocular vision to enhance spatial reasoning and scene understanding in challenging domains.
Supervisors
My qualifications
Research
Research interests
My research focuses on applying computer vision and deep learning to problems in autonomous robotics, and on transferring these techniques across multiple domains.
Research interests include:
- Monocular Bird's Eye View (BEV) prediction
- Simultaneous Localisation and Mapping (SLAM)
- Generative Adversarial Networks (GANs), Simulation and Domain Transfer
- Appearance transfer and conditional diffusion models
Publications
The ability to produce large-scale maps for navigation, path planning and other tasks is a crucial step for autonomous agents, but has always been challenging. In this work, we introduce BEV-SLAM, a novel type of graph-based SLAM that aligns semantically-segmented Bird's Eye View (BEV) predictions from monocular cameras. We introduce a novel form of occlusion reasoning into BEV estimation and demonstrate its importance to aid spatial aggregation of BEV predictions. The result is a versatile SLAM system that can operate across arbitrary multi-camera configurations and can be seamlessly integrated with other sensors. We show that the use of multiple cameras significantly increases performance, and achieves lower relative error than high-performance GPS. The resulting system is able to create large, dense, globally-consistent world maps from monocular cameras mounted around an ego vehicle. The maps are metric and correctly-scaled, making them suitable for downstream navigation tasks.
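For intuition, the sketch below shows the kind of BEV-to-BEV alignment a graph-based SLAM front end could use to generate relative-pose edge constraints between semantically-segmented BEV grids. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the grid encoding (0 = unknown, positive integers = semantic class labels) and the brute-force pose search are all illustrative choices, whereas BEV-SLAM itself uses learned BEV prediction with occlusion reasoning and full graph optimisation.

```python
# Minimal sketch: estimate the relative pose between two semantic BEV grids
# by exhaustively searching (dx, dy, dtheta) for maximum label agreement.
# The resulting pose could serve as an edge constraint in a pose graph.
import numpy as np
from scipy.ndimage import rotate, shift  # assumes SciPy is available


def bev_agreement(ref, qry):
    """Count cells where both grids are observed (label > 0) and agree."""
    observed = (ref > 0) & (qry > 0)
    return np.count_nonzero(observed & (ref == qry))


def align_bev(ref, qry, max_trans=10, angles=range(-10, 11, 2)):
    """Brute-force search for the (dx, dy, dtheta) maximising agreement.

    ref, qry: 2D integer arrays of semantic labels (0 = unknown).
    Returns (dx, dy, dtheta_degrees) as a relative pose estimate.
    """
    best_score, best_pose = -1, (0, 0, 0)
    for ang in angles:
        # order=0 (nearest neighbour) keeps integer semantic labels intact.
        rot = rotate(qry, ang, reshape=False, order=0)
        for dx in range(-max_trans, max_trans + 1):
            for dy in range(-max_trans, max_trans + 1):
                cand = shift(rot, (dy, dx), order=0)  # empty cells fill as 0
                score = bev_agreement(ref, cand)
                if score > best_score:
                    best_score, best_pose = score, (dx, dy, ang)
    return best_pose
```

In a full system, each such pairwise estimate would become one edge in the pose graph, and a graph optimiser would then solve for the globally-consistent set of vehicle poses from which the dense world map is assembled.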