Maggie Kosek
About
Biography
Maggie Kosek is a technical artist with a fine arts background. She is passionate about art and digital media techniques and about supporting and promoting research in computer vision and graphics. She received her M.Sc. in Philosophy and Formal Logic in 2006 and a B.A. (Hons) in Digital Media in 2011. She currently works as a technical artist at the Centre for Vision, Speech and Signal Processing (CVSSP) at the University of Surrey. Previously, Maggie worked in the Interactive Graphics Group at Disney Research, where she was also involved in technical and digital art education and mentoring programs at universities in Scotland.
Publications
Shadow-retargeting maps the appearance of real shadows to virtual shadows given a corresponding deformation of scene geometry, such that appearance is seamlessly maintained. By performing virtual shadow reconstruction from unoccluded real-shadow samples observed in the camera frame, this method efficiently recovers the deformed shadow appearance. In this manuscript, we introduce a light-estimation approach that enables light-source detection using flat Fresnel lenses, allowing the method to work without a set of pre-established conditions. We extend the capability of this approach by handling scenarios with multiple receiver surfaces and a non-grounded occluder with high accuracy. Results are presented on a range of objects, deformations, and illumination conditions in real-time Augmented Reality (AR) on a mobile device. We demonstrate the practical application of the method in generating otherwise laborious in-betweening frames for 3D printed stop-motion animation.
We introduce Shadow Retargeting, which maps real shadow appearance to virtual shadows given a corresponding deformation of scene geometry, such that appearance is seamlessly maintained. By performing virtual shadow reconstruction from unoccluded real shadow samples observed in the camera frame, we recover the deformed shadow appearance efficiently. Our method uses geometry priors for the shadow-casting object and a planar receiver surface. Inspired by image retargeting approaches [VTP*10], we describe a novel local search strategy steered by importance-based deformed shadow estimation. Results are presented on a range of objects, deformations and illumination conditions in real-time Augmented Reality (AR) on a mobile device. We demonstrate the practical application of the method in generating otherwise laborious in-betweening frames for 3D printed stop-motion animation.
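The appearance-transfer idea at the heart of the method can be pictured with a minimal sketch. The snippet below is not the paper's importance-steered local search: it assumes binary masks for the real shadow, its occluded portion and the target virtual shadow are already available, and it simply copies each virtual-shadow pixel from the nearest unoccluded real-shadow sample in the camera frame.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def retarget_shadow(frame, real_shadow_mask, occluded_mask, virtual_shadow_mask):
    """Minimal shadow-appearance transfer sketch (illustrative, not the paper's method).

    frame               : HxWx3 camera image
    real_shadow_mask    : bool HxW, pixels belonging to the observed real shadow
    occluded_mask       : bool HxW, real-shadow pixels hidden by the deformed object
    virtual_shadow_mask : bool HxW, where the deformed virtual shadow should appear
    """
    # Unoccluded real-shadow samples are the only trustworthy appearance source.
    valid = real_shadow_mask & ~occluded_mask

    # For every pixel, locate the nearest valid sample (a crude stand-in for the
    # paper's importance-steered local search).
    _, (iy, ix) = distance_transform_edt(~valid, return_indices=True)

    out = frame.copy()
    vs = virtual_shadow_mask
    out[vs] = frame[iy[vs], ix[vs]]   # copy nearest unoccluded shadow appearance
    return out
```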
Pre-calculated depth information is essential for efficient light field video rendering, due to the prohibitive cost of depth estimation from color when real-time performance is desired. Standard state-of-the-art video codecs fail to satisfy such performance requirements when the amount of data to be decoded becomes too large. In this paper, we propose a depth image and video codec based on block compression that exploits typical characteristics of depth streams, drawing inspiration from S3TC texture compression and geometric wavelets. Our codec offers very fast hardware-accelerated decoding and allows partial extraction for view-dependent decoding. We demonstrate the effectiveness of our codec on a number of multi-view 360-degree video datasets, with quantitative analysis of storage cost, reconstruction quality, and decoding performance.
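As a rough illustration of the S3TC-inspired block idea (not the codec proposed in the paper), the sketch below compresses a depth image in 4x4 tiles: each tile keeps only its minimum and maximum depth plus a small per-pixel index interpolating between the two endpoints. The function names and the 3-bit index width are illustrative assumptions.

```python
import numpy as np

def encode_depth_blocks(depth, block=4, bits=3):
    """Toy S3TC/BC4-style depth compressor (illustrative only, not the paper's codec).

    Each `block x block` tile stores its min and max depth plus a per-pixel index
    that linearly interpolates between those two endpoints.
    """
    h, w = depth.shape
    assert h % block == 0 and w % block == 0
    levels = (1 << bits) - 1

    tiles = depth.reshape(h // block, block, w // block, block).swapaxes(1, 2)
    lo = tiles.min(axis=(2, 3), keepdims=True).astype(np.float32)
    hi = tiles.max(axis=(2, 3), keepdims=True).astype(np.float32)
    scale = np.where(hi > lo, hi - lo, 1.0)
    idx = np.round((tiles - lo) / scale * levels).astype(np.uint8)
    return lo, hi, idx

def decode_depth_blocks(lo, hi, idx, bits=3):
    """Reconstruct the depth image from per-tile endpoints and per-pixel indices."""
    levels = (1 << bits) - 1
    tiles = lo + (hi - lo) * (idx.astype(np.float32) / levels)
    n_by, n_bx, b, _ = idx.shape
    return tiles.swapaxes(1, 2).reshape(n_by * b, n_bx * b)
```

Because every tile decodes independently from its own endpoints and indices, only the tiles touched by the current view need to be decoded, which is the property that makes partial, view-dependent extraction cheap.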
We propose an end-to-end solution for presenting movie quality animated graphics to the user while still allowing the sense of presence afforded by free viewpoint head motion. By transforming offline rendered movie content into a novel immersive representation, we display the content in real-time according to the tracked head pose. For each frame, we generate a set of cubemap images (colors and depths) using a sparse set of cameras placed in the vicinity of the potential viewer locations. The cameras are placed by an optimization process so that the rendered data maximise coverage with minimum redundancy, depending on the complexity of the lighting environment. We compress the colors and depths separately, introducing an integrated spatial and temporal scheme tailored to high performance on GPUs for Virtual Reality applications. A view-dependent decompression algorithm decodes only the parts of the compressed video streams that are visible to users. We detail a real-time rendering algorithm using multi-view ray casting, with a variant that can handle strong view-dependent effects such as mirror surfaces and glass. Compression rates of 150:1 and greater are demonstrated with quantitative analysis of image reconstruction quality and performance.
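View-dependent decoding of a cubemap representation can be sketched very simply: given the tracked view direction, only the cube faces that can intersect the view frustum need to be fetched and decompressed. The helper below is an illustrative stand-in, not the decoder described in the paper; the 110-degree field of view and the conservative face test are assumptions.

```python
import numpy as np

# Outward axes of the six cubemap faces.
FACE_AXES = {
    "+x": ( 1, 0, 0), "-x": (-1, 0, 0),
    "+y": ( 0, 1, 0), "-y": ( 0, -1, 0),
    "+z": ( 0, 0, 1), "-z": ( 0, 0, -1),
}

def visible_faces(view_dir, fov_deg=110.0):
    """Conservatively list the cubemap faces that can intersect the view frustum.

    A face is skipped only when its entire 90-degree footprint lies outside the
    cone around view_dir; a real decoder would then decode only these faces.
    """
    v = np.asarray(view_dir, dtype=float)
    v /= np.linalg.norm(v)
    # Half-angle of the view cone plus the ~54.7 degree centre-to-corner extent of a face.
    limit = np.cos(np.radians(fov_deg / 2.0 + 54.74))
    return [name for name, axis in FACE_AXES.items() if np.dot(v, axis) >= limit]
```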
We present immersive storytelling in VR enhanced with non-linear sequenced sound, touch and light. Our Deep Media aim is to allow guests to physically enter rendered movies with novel non-linear storytelling capability. With the ability to change the outcome of the story through touch and physical movement, we give guests the agency to make choices with consequences in immersive movies. We extend IRIDiuM to allow branching streams of full-motion light field video depending on user actions in real time. The interactive narrative guides guests through the immersive story with lighting and spatial audio design, and integrates both walkable and air haptic actuators.
Compelling virtual reality experiences require high quality imagery as well as head motion with six degrees of freedom. Most existing systems either limit the motion of the viewer by using prerecorded fixed-position 360-degree video panoramas, or are limited in realism by using video game quality graphics rendered in real-time on low-powered devices. We propose a solution for presenting movie quality graphics to the user while still allowing the sense of presence afforded by free viewpoint head motion. By transforming offline rendered movie content into a novel immersive representation, we display the content in real-time according to the tracked head pose. For each frame, we generate a set of 360-degree images (colors and depths) using a sparse set of cameras placed in the vicinity of the potential viewer locations. We compress the colors and depths separately, using codecs tailored to the data. At runtime, we recover the visible video data using view-dependent decompression and render them using a raycasting algorithm that does on-the-fly scene reconstruction. Compression rates of 150:1 and greater are demonstrated with quantitative analysis of image reconstruction quality and performance.
Stop motion animation evolved in the early days of cinema with the aim of creating an illusion of movement with static puppets posed manually for each frame. Recent stop motion movies have introduced 3D printing processes in order to acquire animations more accurately and rapidly. However, due to the nature of this technique, every frame needs to be computer-generated, 3D printed and post-processed before it can be recorded. Therefore, a typical stop motion film can require many thousands of props to be created, resulting in a laborious and expensive production. We address this with a real-time interactive Augmented Reality system which generates virtual in-between poses from a reduced number of key-frame physical props. We perform deformation of the surface camera samples to accomplish smooth animations with retained visual appearance, and incorporate a diminished reality method to allow virtual deformations that would otherwise reveal undesired background behind the animated mesh.
Underpinning this solution is a principled interaction and system design which forms our Props Alive framework. We apply established models of interactive system design, drawing from an information visualisation framework which, appropriately for Augmented Reality, includes consideration of the user, interaction, data and presentation elements necessary for real-time operation. The rapid development framework and high-performance architecture are detailed with an analysis of the resulting performance.
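The in-between pose generation described above can be sketched, in heavily simplified form, as interpolation between two key poses. The snippet assumes two meshes with identical topology; the actual system deforms camera surface samples (with diminished reality filling disoccluded background) rather than raw vertices, so this only illustrates the in-betweening idea.

```python
import numpy as np

def inbetween_poses(key_a, key_b, n_frames):
    """Generate virtual in-between poses between two key-frame meshes.

    key_a, key_b : (V, 3) vertex arrays with identical topology (an assumption).
    Returns a list of n_frames interpolated vertex arrays.
    """
    # Ease-in/ease-out timing, as a stop-motion animator would typically favour.
    t = 0.5 - 0.5 * np.cos(np.linspace(0.0, np.pi, n_frames))
    return [(1.0 - ti) * key_a + ti * key_b for ti in t]
```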
Link to publication: https://www.napier.ac.uk/research-and-innovation/research-search/outputs/props-alive-a-framework-for-augmented-reality-stop-motion-animation
We present a system for rapid acquisition of bespoke, animatable, full-body avatars including face texture and shape. A blendshape rig with a skeleton is used as a template for customization. Identity blendshapes are used to customize the body and face shape at the fitting stage, while animation blendshapes allow the face to be animated. The subject assumes a T-pose and a single snapshot is captured using a stereo RGB plus depth sensor rig. Our system automatically aligns a photo texture and fits the 3D shape of the face. The body shape is stylized according to body dimensions estimated from segmented depth. The face identity blendweights are optimised according to image-based facial landmarks, while a custom texture map for the face is generated by warping the input images to a reference texture according to the facial landmarks.
The total capture and processing time is under 10 seconds and the output is a lightweight, game-ready avatar which is recognizable as the subject. We demonstrate our system in a VR environment in which each user sees the other users' animated avatars through an HMD with real-time audio-based facial animation and live body motion tracking, affording an enhanced level of presence and social engagement compared to generic avatars.
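The identity-fitting step can be pictured as a small regularised least-squares problem: find blendweights whose landmark displacements best explain the measured landmarks. The sketch below assumes landmarks already lifted to 3D and a linear landmark basis extracted from the identity blendshapes; it illustrates the idea only and is not the system's optimiser.

```python
import numpy as np

def fit_identity_weights(neutral_lm, blendshape_lm_deltas, target_lm, reg=1e-3):
    """Least-squares fit of identity blendweights to facial landmarks.

    neutral_lm           : (L, 3) landmark positions of the neutral template face
    blendshape_lm_deltas : (K, L, 3) landmark offsets of each identity blendshape
    target_lm            : (L, 3) landmarks measured for the subject (assumed 3D;
                           the paper works from image-based landmarks)
    reg                  : Tikhonov regulariser keeping the weights plausible
    """
    K = blendshape_lm_deltas.shape[0]
    A = blendshape_lm_deltas.reshape(K, -1).T            # (3L, K) linear basis
    b = (target_lm - neutral_lm).reshape(-1)             # (3L,) residual to explain
    w = np.linalg.solve(A.T @ A + reg * np.eye(K), A.T @ b)
    return np.clip(w, 0.0, 1.0)                          # keep weights in rig range
```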
Compelling virtual reality experiences require high quality imagery as well as head motion with six degrees of freedom. Most existing systems either limit the motion of the viewer (prerecorded fixed-position 360-degree video panoramas) or are limited in realism, e.g. video game quality graphics rendered in real-time on low-powered devices. We propose a solution for presenting movie quality graphics to the user while still allowing the sense of presence afforded by free viewpoint head motion. By transforming offline rendered movie content into a novel immersive deep media representation, we display the content in real-time according to the tracked head pose. For each frame, we generate a set of 360-degree images (colors and depths) using cameras placed in selected locations within a small view volume surrounding a central viewing position. We employ a parallax masking technique which minimizes the rendering work required for the additionally visible surfaces at viewing locations around the main viewpoint. At run-time, a decompression and rendering algorithm fetches the appropriate surface data in real-time and projects them to the eye positions as the user moves within the tracked view volume. To further illustrate this ability for interactivity and embodiment within VR movies, we track the full upper body using our sparse-sensor motion capture solver, allowing users to see themselves in the virtual world. Here, both head and upper body are tracked in real time using data from IMU (Inertial Measurement Unit) and EMG (Electromyogram) sensors. Our real-time solver, Triduna Live, uses a physics-based approach to robustly estimate pose from a few sensors. Hand gesture and object grasping motions are detected from the EMG data and combined with the tracked body position to control gameplay that is seamlessly integrated within the deep media environment.
We propose “Stereohaptics”, a framework to create, record, modify, and play back rich and dynamic haptic media using audio-based tools. These tools are well established, mainstream and familiar to a large population in the entertainment, design, academic, and DIY communities, and are already available for sound synthesis, recording, and playback. We tune these audio-based tools to create haptic media on users’ bodies, distribute it to multiple slave units, and share it over the Internet. Within this framework, we introduce the Stereohaptics toolkit, which uses off-the-shelf speaker technologies (electromagnetic, piezoelectric, electrostatic) and audio software tools to generate and embed haptic media in a variety of multisensory settings. This way, designers, artists, students and other professionals who are already familiar with sound production processes can utilize their skills to contribute towards designing haptic experiences. Moreover, using the audio infrastructure, application designers and software developers can create new applications and distribute haptic content to everyday users on mobile devices, computers, toys, game controllers and so on.
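To make the "haptics as audio" idea concrete, the sketch below writes a two-channel vibrotactile signal as an ordinary WAV file: a 250 Hz carrier whose energy pans from the left to the right channel, so two voice-coil actuators driven by a stereo audio output produce a sensation that appears to travel between them. The file name, carrier frequency and panning curve are illustrative choices, not taken from the paper.

```python
import wave
import numpy as np

def haptic_sweep(path="sweep.wav", seconds=2.0, carrier_hz=250.0, rate=44100):
    """Write a two-channel 'haptic audio' file: a vibrotactile carrier panned
    from the left to the right channel using equal-power panning."""
    t = np.arange(int(seconds * rate)) / rate
    carrier = np.sin(2 * np.pi * carrier_hz * t)
    pan = t / seconds                                   # 0 -> 1 over the sweep
    left = carrier * np.cos(pan * np.pi / 2)            # equal-power panning
    right = carrier * np.sin(pan * np.pi / 2)
    pcm = (np.stack([left, right], axis=1) * 32767).astype(np.int16)

    with wave.open(path, "wb") as f:
        f.setnchannels(2)
        f.setsampwidth(2)                               # 16-bit samples
        f.setframerate(rate)
        f.writeframes(pcm.tobytes())
```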
We propose that pretending is a cognitive faculty which enables us to create and immerse ourselves in possible worlds. These worlds range from the veridical to the fantastic and are frequently realised as stories varying from the fictional to the scientific. This same ability enables us to become immersed and engaged in such stories (which we may have created) too. Whether we are shooting “aliens” or are engaged in a passionate romance, these experiences are facilitated by our ability to pretend. While it might seem that we can imagine or make-believe anything, in practice there are limits to what we can pretend. We draw upon both theoretical perspectives and the working practice of animators. By identifying these limits, we are, of course, also defining the nature of pretending.
Link to publication: https://www.researchgate.net/profile/Richard_Hetherington/publication/283266170_The_Limits_of_Pretending/links/562f902408ae8e12568753b3.pdf