11am - 12 noon
Wednesday 11 December 2024
Generative Models for Computational Relighting
PhD Viva Open Presentation - Nikolina Kubiak
Hybrid event - All Welcome!
Free
University of Surrey
Guildford
Surrey
GU2 7XH
ABSTRACT:
Relighting is a long-standing computer vision problem. The task has applications ranging from simple image beautification to domain adaptation, diversification of autonomous vehicle training data, or enhancements in the TV and film production world. In this thesis, we focus on the final area. Motivated by broadcast applications, we explore the idea of capturing visual data under sub-optimal lighting conditions and then adjusting it automatically in post-production. Doing so could meaningfully simplify the process of shooting footage, allow for cheaper coverage of smaller or more resource-constrained events and, thus, wider broadcasting. Driven by the recent advances in deep learning and computer vision, we decide to tackle this problem using powerful generative models.
Capturing high-quality training data at scale and with meaningful complexity is a challenge. Consequently, we strive to create a system with relaxed supervision requirements. With this in mind, we propose a self-supervised and domain-independent relighting model. Instead of relying on ground truth, our GAN-based solution exploits the rich information contained in the input data and learns the desired illumination style from a collection of unsorted examples rather than from a directly aligned reference. This flexible approach allows us to use a loose definition of the lighting style and, potentially, to adapt to styles existing in already captured materials, even those coming from other environments.
Our preliminary solution performs accurate colour and brightness adjustments yet exhibits subpar shadow manipulation performance. Therefore, we decide to address this aspect more explicitly and, in our next two technical chapters, we investigate the problems of shadow removal and detection. To perform de-shadowing, we adjust our self-supervised system to the new task and modify its losses to account for inconsistencies existing in the benchmark datasets. The resulting system is capable of removing shadows without significant boundary residue, providing superior visual results. We explore the idea of shadows further in the next technical chapter and consider the common misclassification of dark areas, which are often mistaken for shadows. To alleviate this, we create two datasets featuring diverse objects and backgrounds as well as cast and self-cast shadows. We then use this data to create a 3D-aided shadow-caster verification system, identifying sources of real shadows and discouraging the detection of ‘fake’ shadows, i.e. dark or patterned image regions.
In our final technical chapter, we take a step back from the fragmented approach and instead consider the most recent foundation models and the wealth of information and world understanding contained within them. We pair the successful design choices from the earlier stages of the PhD with lighting-conditioned diffusion models, and explore their applicability and adaptability to the task of relighting.