Dr Ayan Das
About
My research project
Visit my personal website for further details and my Google Scholar profile for publication details.
A research enthusiast pursuing Doctor of Philosophy (PhD) with fully-funded university scholarship in the Centre for Vision, Speech and Signal Processing (CVSSP) in the Faculty of Engineering and Physical Sciences (FEPS) at the University of Surrey, England, United Kingdom (UK).
Before starting my PhD, I worked as a Junior Project Officer at KLIV Research group, IIT Kharagpur, India. My broad area of research is the domain of Artificial Intelligence (AI) - I am specifically interested in the emerging field of deep learning applied mainly in (but not restricted to) computer vision tasks.
Having a tremendous love for programming, I am also focused on writing good quality codes/softwares related to my field of study. I love to share knowledge and so decided to maintain a technical blog in my personal website. I am currently a part of SketchX Lab, an elite group of researchers pioneering in the field of sketch analysis, led by my PhD supervisors Dr Yi-Zhe Song, Dr Tao Xiang and Yongxin Yang. My current focus is on sketch generation problems using modern Deep Learning methods like GANs.
Supervisors
Visit my personal website for further details and my Google Scholar profile for publication details.
A research enthusiast pursuing Doctor of Philosophy (PhD) with fully-funded university scholarship in the Centre for Vision, Speech and Signal Processing (CVSSP) in the Faculty of Engineering and Physical Sciences (FEPS) at the University of Surrey, England, United Kingdom (UK).
Before starting my PhD, I worked as a Junior Project Officer at KLIV Research group, IIT Kharagpur, India. My broad area of research is the domain of Artificial Intelligence (AI) - I am specifically interested in the emerging field of deep learning applied mainly in (but not restricted to) computer vision tasks.
Having a tremendous love for programming, I am also focused on writing good quality codes/softwares related to my field of study. I love to share knowledge and so decided to maintain a technical blog in my personal website. I am currently a part of SketchX Lab, an elite group of researchers pioneering in the field of sketch analysis, led by my PhD supervisors Dr Yi-Zhe Song, Dr Tao Xiang and Yongxin Yang. My current focus is on sketch generation problems using modern Deep Learning methods like GANs.
Publications
Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations. We further aim to model sketches as a sequence of low-dimensional parametric curves. To this end, we propose an inverse graphics framework capable of approximating a raster or waypoint based stroke encoded as a point-cloud with a variable-degree Bezier curve. Building on this module, we present Cloud2Curve, a generative model for scalable high-resolution vector sketches that can be trained end-to-end using point-cloud data alone. As a consequence, our model is also capable of deterministic vectorization which can map novel raster or waypoint based sketches to their corresponding high-resolution scalable Bezier equivalent. We evaluate the generation and vectorization capabilities of our model on Quick, Draw! and K-MNIST datasets.
The study of neural generative models of human sketches is a fascinating contemporary modeling problem due to the links between sketch image generation and the human drawing process. The landmark SketchRNN provided breakthrough by sequentially generating sketches as a sequence of waypoints. However this leads to low-resolution image generation, and failure to model long sketches. In this paper we present B´ezierSketch, a novel generative model for fully vector sketches that are automatically scalable and high-resolution. To this end, we first introduce a novel inverse graphics approach to stroke embedding that trains an encoder to embed each stroke to its best fit B´ezier curve. This enables us to treat sketches as short sequences of paramaterized strokes and thus train a recurrent sketch generator with greater capacity for longer sketches, while producing scalable high-resolution results. We report qualitative and quantitative results on the Quick, Draw! benchmark.
We present the first competitive drawing agent Pixelor that exhibits human-level performance at a Pictionary-like sketching game, where the participant whose sketch is recognized first is a winner. Our AI agent can autonomously sketch a given visual concept, and achieve a recognizable rendition as quickly or faster than a human competitor. The key to victory for the agent’s goal is to learn the optimal stroke sequencing strategies that generate the most recognizable and distinguishable strokes first. Training Pixelor is done in two steps. First, we infer the stroke order that maximizes early recognizability of human training sketches. Second, this order is used to supervise the training of a sequence-to-sequence stroke generator. Our key technical contributions are a tractable search of the exponential space of orderings using neural sorting; and an improved Seq2Seq Wasserstein (S2S-WAE) generator that uses an optimal-transport loss to accommodate the multi-modal nature of the optimal stroke distribution. Our analysis shows that Pixelor is better than the human players of the Quick, Draw! game, under both AI and human judging of early recognition. To analyze the impact of human competitors’ strategies, we conducted a further human study with participants being given unlimited thinking time and training in early recognizability by feedback from an AI judge. The study shows that humans do gradually improve their strategies with training, but overall Pixelor still matches human performance. The code and the dataset are available at http://sketchx.ai/pixelor.
We present the first competitive drawing agent Pixelor that exhibits human-level performance at a Pictionary-like sketching game, where the participant whose sketch is recognized first is a winner. Our AI agent can autonomously sketch a given visual concept, and achieve a recognizable rendition as quickly or faster than a human competitor. The key to victory for the agent’s goal is to learn the optimal stroke sequencing strategies that generate the most recognizable and distinguishable strokes first. Training Pixelor is done in two steps. First, we infer the stroke order that maximizes early recognizability of human training sketches. Second, this order is used to supervise the training of a sequence-to-sequence stroke generator. Our key technical contributions are a tractable search of the exponential space of orderings using neural sorting; and an improved Seq2Seq Wasserstein (S2S-WAE) generator that uses an optimal-transport loss to accommodate the multi-modal nature of the optimal stroke distribution. Our analysis shows that Pixelor is better than the human players of the Quick, Draw! game, under both AI and human judging of early recognition. To analyze the impact of human competitors’ strategies, we conducted a further human study with participants being given unlimited thinking time and training in early recognizability by feedback from an AI judge. The study shows that humans do gradually improve their strategies with training, but overall Pixelor still matches human performance. The code and the dataset are available at http://sketchx.ai/pixelor.
Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations. We further aim to model sketches as a sequence of low-dimensional parametric curves. To this end, we propose an inverse graphics framework capable of approximating a raster or waypoint based stroke encoded as a point-cloud with a variable-degree Bezier curve. Building on this module, ´we present Cloud2Curve, a generative model for scalable high-resolution vector sketches that can be trained end-to-end using point-cloud data alone. As a consequence, our model is also capable of deterministic vectorization which can map novel raster or waypoint based sketches to their corresponding high-resolution scalable Bezier equivalent. ´We evaluate the generation and vectorization capabilities of our model on Quick, Draw! and K-MNIST datasets. The analysis of free-hand sketches using deep learning [40] has flourished over the past few years, with sketches now being well analysed from classification [43, 42] and retrieval [27, 12, 4] perspectives. Sketches for digital analysis have always been acquired in two primary modalities - raster (pixel grids) and vector (line segments). Raster sketches have mostly been the modality of choice for sketch recognition and retrieval [43, 27]. However, generative sketch models began to advance rapidly [16] after focusing on vector representations and generating sketches as sequences [7, 37] of waypoints/line segments, similarly to how humans sketch. As a happy byproduct, this paradigm leads to clean and blur-free image generation as opposed to direct raster-graphic generations [30]. Recent works have studied creativity in sketch generation [16], learning to sketch raster photo input images [36], learning efficient
In this paper, we democratise 3D content creation, enabling precise generation of 3D shapes from abstract sketches while overcoming limitations tied to drawing skills. We introduce a novel part-level modelling and alignment framework that facilitates abstraction modelling and cross-modal correspondence. Leveraging the same part-level decoder, our approach seamlessly extends to sketch modelling by establishing correspondence between CLIPasso edgemaps and projected 3D part regions, eliminating the need for a dataset pairing human sketches and 3D shapes. Additionally, our method introduces a seamless in-position editing process as a byproduct of cross-modal part-aligned modelling. Operating in a low-dimensional implicit space, our approach significantly reduces computational demands and processing time.
Feature Selection (FS) is an important pre-processing step in machine learning and it reduces the number of features/variables used to describe each member of a dataset. Such reduction occurs by eliminating some of the non-discriminating and redundant features and selecting a subset of the existing features with higher discriminating power among various classes in the data. In this paper, we formulate the feature selection as a bi-objective optimization problem of some real-valued weights corresponding to each feature. A subset of the weighted features is thus selected as the best subset for subsequent classification of the data. Two information theoretic measures, known as 'relevancy' and 'redundancy' are chosen for designing the objective functions for a very competitive Multi-Objective Optimization (MOO) algorithm called 'Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D)'. We experimentally determine the best possible constraints on the weights to be optimized. We evaluate the proposed bi-objective feature selection and weighting framework on a set of 15 standard datasets by using the popular k-Nearest Neighbor (k-NN) classifier. As is evident from the experimental results, our method appears to be quite competitive to some of the state-of-the-art FS methods of current interest. We further demonstrate the effectiveness of our framework by changing the choices of the optimization scheme and the classifier to Non-dominated Sorting Genetic Algorithm (NSGA)-II and Support Vector Machines (SVMs) respectively. (C) 2017 Elsevier B.V. All rights reserved.