11am - 12 noon
Thursday 28 November 2024
Toward Efficient On-device AI: the Model, the Training, and the Deployment
CVSSP & PAI External Seminar - ALL WELCOME!
Speaker: Dr Fuwen Tan
Research Scientist at the Samsung AI Center, Cambridge (SAIC)
Free
University of Surrey
Guildford
Surrey
GU2 7XH
This event has passed
Abstract:
Large foundation models have revolutionized computer vision, machine learning, and natural language processing. However, these models typically run on cloud servers due to their size and computational demands, leading to high costs and privacy concerns related to user data storage.
These challenges have spurred the development of on-device AI models for deployment on edge devices like smartphones, tablets, IoT devices, and computers. The key benefits of on-device AI include:
i) Improved Privacy: Data processing occurs on the device, eliminating the need to send sensitive information to the cloud.
ii) Lower Latency: Local processing reduces delays, enhancing real-time applications like voice assistants.
iii) Reduced Bandwidth Use: On-device processing minimizes data transfer, conserving network bandwidth.
iv) Offline Functionality: AI-powered applications can operate without an internet connection, crucial for remote or low-connectivity environments.
Despite these advantages, developing mobile-friendly models presents several challenges in model design, pretraining, and deployment. In this talk, Dr. Tan will discuss three research works that aim to address these challenges:
i) EdgeViTs, a new family of lightweight vision transformers (ViTs) that enable attention-based vision models to compete with leading lightweight convolutional neural networks (CNNs).
ii) SSLight, a self-supervised representation learning approach for low-compute neural networks that achieves state-of-the-art performance without relying on knowledge distillation.
iii) MobileQuant, a near-lossless quantization approach for large language models (LLMs) that reduces latency and energy consumption by 20-50% compared to current on-device quantization strategies.
Speaker:
Dr. Fuwen Tan is a Research Scientist at the Samsung AI Center, Cambridge (SAIC). He received his Ph.D. from the University of Virginia (U.Va.). He has conducted research at several institutions, including Nanyang Technological University, Adobe Research, Amazon A9, and Honda Research. Dr. Tan is interested in developing efficient solutions for computer vision, machine learning, and natural language processing. His work was selected as a Best Paper Finalist at CVPR 2019.