
No.24-17 Mitigating Distribution Shifts in Using Pre-trained Vision-Language Models

Dec 02, 2024 · 1 min read

Benefiting from large-scale image-text pair datasets, powerful pre-trained vision-language models (VLMs, such as CLIP) enable many real-world applications, e.g., zero-shot classification and image-text retrieval. However, many real-world applications involve data distributions that differ from the distribution of the data used to train VLMs, which can cause poor performance (as predicted by machine learning theory) when these VLMs are deployed. To mitigate the negative effects of these shifts, we typically either 1) fine-tune a pre-trained VLM on downstream tasks or 2) further improve the generalisation ability of a pre-trained VLM. In this talk, I will first introduce our recent work in both directions 1) and 2) (one oral paper and one poster paper at ICML 2024). Then, I will introduce another work on how to detect label set shift when using pre-trained VLMs for zero-shot classification (one spotlight paper at ICLR 2024).
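For readers unfamiliar with how VLMs enable zero-shot classification, here is a minimal sketch of the mechanism CLIP uses: embed the image and a text prompt per class, then classify by cosine similarity. The embeddings below are random toy vectors standing in for real encoder outputs, and `zero_shot_classify` and `temperature` are hypothetical names for this illustration, not part of any CLIP library.

```python
import numpy as np

def zero_shot_classify(image_emb, text_embs, temperature=100.0):
    # L2-normalise both sides, so the dot product is cosine similarity
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    # Scaled similarities act as logits; softmax turns them into class probabilities
    logits = temperature * (txt @ img)
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

# Toy stand-ins for CLIP text embeddings of prompts like "a photo of a {class}"
rng = np.random.default_rng(0)
class_names = ["cat", "dog", "car"]
text_embs = rng.normal(size=(3, 8))
# A toy "image" embedding constructed near the "dog" text embedding
image_emb = text_embs[1] + 0.1 * rng.normal(size=8)

probs = zero_shot_classify(image_emb, text_embs)
print(class_names[int(np.argmax(probs))])  # expected: dog
```

Under distribution shift, the image embeddings at deployment no longer align with the text embeddings the way they did at pre-training time, which is why the similarities (and hence the predictions) degrade.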

Speaker Bio

Dr Feng Liu is a machine learning researcher with research interests in hypothesis testing and trustworthy machine learning. Currently, he is the recipient of an ARC DECRA Fellowship, a Lecturer at The University of Melbourne, Australia, and a Visiting Scientist at RIKEN-AIP, Japan. He has served as an Area Chair for ACM MM, AISTATS, ICLR, ICML, and NeurIPS, and as a senior program committee (SPC) member for AAAI, IJCAI, and ECAI. He has received the ARC Discovery Early Career Researcher Award, the Outstanding Paper Award of NeurIPS (2022), the Outstanding Area Chair Award of ACM MM (2024), the Outstanding Reviewer Award of NeurIPS (2021), and the Outstanding Reviewer Award of ICLR (2021).

