
Wish Suharitdamrong
Academic and research departments
Surrey Institute for People-Centred Artificial Intelligence (PAI), Centre for Vision, Speech and Signal Processing (CVSSP)
My research project
Multi-Modal Foundation Models
This research project aims to address the limitations of current multimodal learning models, which often overlook fine-grained information in favour of global representations. In domains such as multimedia (e.g., videos with images, audio, and transcripts) and healthcare (e.g., medical images and clinical data), multimodal data carry complex, overlapping semantic concepts. The research will develop novel self-supervised learning algorithms focused on extracting and aligning fine-grained, multi-concept representations across modalities. By designing specialised neural architectures and loss functions, it will enhance the integration of multimodal data, enabling a deeper understanding of complex cross-modal relationships. This approach has significant implications for fields such as multimedia analysis and healthcare informatics, where detailed multimodal interpretation is essential.
Supervisors