Vision-and-Language Navigation (VLN) has attracted significant interest across computer vision, natural language processing, and robotics. This talk traces the development of VLN from its beginnings in seminal benchmarks such as Room-to-Room (R2R). A key catalyst in this development is the arrival of Large Language Models (LLMs) such as GPT-4. LLMs have made human-machine interaction more natural and fluid, and they have opened new ways of using human-like language to guide robots through complex navigation tasks. The talk begins by setting out the context of human-machine conversation and the shift that LLMs have brought about. It then presents our recent work in VLN, which uses LLMs to interpret complex navigation instructions expressed in natural language and thereby improve robot navigation. The talk offers a view of what can be gained by combining vision, language, and robotics.
Speaker Bio
Dr Qi Wu is an Associate Professor at the University of Adelaide and was an ARC Discovery Early Career Researcher Award (DECRA) Fellow from 2019 to 2021. He is the Director of Vision-and-Language at the Australian Institute for Machine Learning. The Australian Academy of Science awarded him a J G Russell Award in 2019. He obtained his PhD in 2015 and his MSc in 2011, both in Computer Science from the University of Bath, United Kingdom. His research interests are mainly in computer vision and machine learning. He currently works on vision-and-language problems, with particular expertise in image captioning and visual question answering (VQA). He has published more than 130 papers in prestigious conferences and journals, such as TPAMI, CVPR, ICCV, and ECCV. He also serves as an Area Chair for CVPR, ICCV, and NeurIPS.
More Details
- When: Tue 08 Oct 2024, 11:00 am - 12:00 pm (Brisbane time)
- Speaker: A/Prof Qi Wu (University of Adelaide)
- Host: Prof Helen Huang
- Venue: 78-343, General Purpose South
- Zoom: https://uqz.zoom.us/j/85098152567