About

My research project

Publications

Yi-Zhe Song, Ayan Das, Yongxin Yang, Timothy Hospedales, Tao Xiang (2020) BézierSketch: A generative model for scalable vector sketches, In: Computer Vision – ECCV 2020, pp. 632-647, Springer International Publishing

The study of neural generative models of human sketches is a fascinating contemporary modeling problem due to the links between sketch image generation and the human drawing process. The landmark SketchRNN provided a breakthrough by generating sketches sequentially, as a sequence of waypoints. However, this leads to low-resolution image generation and a failure to model long sketches. In this paper we present BézierSketch, a novel generative model for fully vector sketches that are automatically scalable and high-resolution. To this end, we first introduce a novel inverse graphics approach to stroke embedding that trains an encoder to embed each stroke to its best-fit Bézier curve. This enables us to treat sketches as short sequences of parameterized strokes and thus train a recurrent sketch generator with greater capacity for longer sketches, while producing scalable high-resolution results. We report qualitative and quantitative results on the Quick, Draw! benchmark.
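As a rough illustration of what "best-fit Bézier curve" means for a single stroke, the sketch below fits control points to ordered waypoints by classical least squares. This is an assumption-laden stand-in, not the paper's learned encoder: the chord-length parameterization, the fixed cubic degree, and the function names `bernstein_matrix`, `fit_bezier`, and `eval_bezier` are all hypothetical choices made for the example.

```python
import numpy as np
from math import comb


def bernstein_matrix(t, degree):
    """Column k holds the Bernstein basis value B_{k,degree}(t_i) for each t_i."""
    t = np.asarray(t, dtype=float)
    return np.stack(
        [comb(degree, k) * t**k * (1.0 - t) ** (degree - k) for k in range(degree + 1)],
        axis=1,
    )


def fit_bezier(points, degree=3):
    """Least-squares control points for an ordered (n, 2) array of waypoints.

    Assumes the stroke has nonzero length (not all waypoints identical).
    """
    points = np.asarray(points, dtype=float)
    # Chord-length parameterization: t_i is the normalized cumulative length.
    seg = np.linalg.norm(np.diff(points, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seg)])
    t /= t[-1]
    basis = bernstein_matrix(t, degree)
    control, *_ = np.linalg.lstsq(basis, points, rcond=None)
    return control  # shape (degree + 1, 2)


def eval_bezier(control, t):
    """Evaluate the curve defined by `control` at parameters t in [0, 1]."""
    return bernstein_matrix(t, len(control) - 1) @ control


# A toy stroke: 20 waypoints on a quarter circle.
theta = np.linspace(0.0, np.pi / 2, 20)
stroke = np.stack([np.cos(theta), np.sin(theta)], axis=1)
control = fit_bezier(stroke)

# The same four control points can be rendered at any resolution.
coarse = eval_bezier(control, np.linspace(0.0, 1.0, 10))
fine = eval_bezier(control, np.linspace(0.0, 1.0, 1000))
```

Because the stroke is reduced to a handful of control points, it can be resampled at any density, which is the sense in which a parametric representation is scalable.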

Ayan Das, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yi-Zhe Song (2021) Cloud2Curve: Generation and Vectorization of Parametric Sketches

Analysis of human sketches in deep learning has advanced immensely through the use of waypoint-sequences rather than raster-graphic representations. We further aim to model sketches as a sequence of low-dimensional parametric curves. To this end, we propose an inverse graphics framework capable of approximating a raster or waypoint based stroke encoded as a point-cloud with a variable-degree Bézier curve. Building on this module, we present Cloud2Curve, a generative model for scalable high-resolution vector sketches that can be trained end-to-end using point-cloud data alone. As a consequence, our model is also capable of deterministic vectorization which can map novel raster or waypoint based sketches to their corresponding high-resolution scalable Bézier equivalent. We evaluate the generation and vectorization capabilities of our model on Quick, Draw! and K-MNIST datasets. The analysis of free-hand sketches using deep learning [40] has flourished over the past few years, with sketches now being well analysed from classification [43, 42] and retrieval [27, 12, 4] perspectives. Sketches for digital analysis have always been acquired in two primary modalities - raster (pixel grids) and vector (line segments). Raster sketches have mostly been the modality of choice for sketch recognition and retrieval [43, 27]. However, generative sketch models began to advance rapidly [16] after focusing on vector representations and generating sketches as sequences [7, 37] of waypoints/line segments, similarly to how humans sketch. As a happy byproduct, this paradigm leads to clean and blur-free image generation as opposed to direct raster-graphic generations [30]. Recent works have studied creativity in sketch generation [16], learning to sketch raster photo input images [36], learning efficient

Hmrishav Bandyopadhyay, Subhadeep Koley, Ayan Das, Aneeshan Sain, Pinaki Nath Chowdhury, Tao Xiang, Ayan Kumar Bhunia, Yi-Zhe Song (2024) Doodle Your 3D: From Abstract Freehand Sketches to Precise 3D Shapes

In this paper, we democratise 3D content creation, enabling precise generation of 3D shapes from abstract sketches while overcoming limitations tied to drawing skills. We introduce a novel part-level modelling and alignment framework that facilitates abstraction modelling and cross-modal correspondence. Leveraging the same part-level decoder, our approach seamlessly extends to sketch modelling by establishing correspondence between CLIPasso edgemaps and projected 3D part regions, eliminating the need for a dataset pairing human sketches and 3D shapes. Additionally, our method introduces a seamless in-position editing process as a byproduct of cross-modal part-aligned modelling. Operating in a low-dimensional implicit space, our approach significantly reduces computational demands and processing time.

Ayan Kumar Bhunia, Ayan Das, Umar Riaz Muhammad, Yongxin Yang, Timothy M Hospedales, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song (2020) Pixelor: A Competitive Sketching AI Agent. So you think you can beat me? Association for Computing Machinery (ACM)

We present the first competitive drawing agent, Pixelor, which exhibits human-level performance at a Pictionary-like sketching game, where the participant whose sketch is recognized first is the winner. Our AI agent can autonomously sketch a given visual concept and achieve a recognizable rendition as quickly as or faster than a human competitor. The key to victory for the agent is to learn the optimal stroke sequencing strategies that generate the most recognizable and distinguishable strokes first. Training Pixelor is done in two steps. First, we infer the stroke order that maximizes early recognizability of human training sketches. Second, this order is used to supervise the training of a sequence-to-sequence stroke generator. Our key technical contributions are a tractable search of the exponential space of orderings using neural sorting, and an improved Seq2Seq Wasserstein (S2S-WAE) generator that uses an optimal-transport loss to accommodate the multi-modal nature of the optimal stroke distribution. Our analysis shows that Pixelor is better than the human players of the Quick, Draw! game, under both AI and human judging of early recognition. To analyze the impact of human competitors’ strategies, we conducted a further human study in which participants were given unlimited thinking time and trained in early recognizability using feedback from an AI judge. The study shows that humans do gradually improve their strategies with training, but overall Pixelor still matches human performance. The code and the dataset are available at http://sketchx.ai/pixelor.

Ayan Kumar Bhunia, Ayan Das, Umar Riaz Muhammad, Yongxin Yang, Timothy M. Hospedales, Tao Xiang, Yulia Gryaditskaya, Yi-Zhe Song (2020) Pixelor: A Competitive Sketching AI Agent. So you think you can sketch?, In: ACM Transactions on Graphics 39(6), Association for Computing Machinery (ACM)

Ayan Das, Swagatam Das (2017) Feature weighting and selection with a Pareto-optimal trade-off between relevancy and redundancy, In: Pattern Recognition Letters 88, pp. 12-19, Elsevier

Feature Selection (FS) is an important pre-processing step in machine learning that reduces the number of features/variables used to describe each member of a dataset. The reduction works by eliminating non-discriminating and redundant features and selecting a subset of the existing features with higher discriminating power among the classes in the data. In this paper, we formulate feature selection as a bi-objective optimization problem over real-valued weights, one per feature. A subset of the weighted features is then selected as the best subset for subsequent classification of the data. Two information-theoretic measures, known as 'relevancy' and 'redundancy', are chosen for designing the objective functions for a very competitive Multi-Objective Optimization (MOO) algorithm called 'Multi-Objective Evolutionary Algorithm based on Decomposition (MOEA/D)'. We experimentally determine the best possible constraints on the weights to be optimized. We evaluate the proposed bi-objective feature selection and weighting framework on a set of 15 standard datasets using the popular k-Nearest Neighbor (k-NN) classifier. As is evident from the experimental results, our method appears to be quite competitive with some of the state-of-the-art FS methods of current interest. We further demonstrate the effectiveness of our framework by changing the optimization scheme and the classifier to Non-dominated Sorting Genetic Algorithm (NSGA)-II and Support Vector Machines (SVMs), respectively.
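The relevancy/redundancy trade-off can be made concrete with a small sketch. Everything here is an assumption for illustration: the abstract does not give the exact objective functions, so the weighted mutual-information forms below, the restriction to discrete features, and the names `mutual_information`, `objectives`, and `pareto_front` are hypothetical. A brute-force non-dominance filter is swapped in for MOEA/D.

```python
import numpy as np


def mutual_information(a, b):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    _, a_idx = np.unique(np.ravel(a), return_inverse=True)
    _, b_idx = np.unique(np.ravel(b), return_inverse=True)
    joint = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(joint, (a_idx, b_idx), 1.0)  # joint histogram of co-occurrences
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)
    pb = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ pb)[nz])))


def objectives(weights, X, y):
    """Return (relevancy, redundancy) for one weight vector.

    Assumed forms, not taken from the paper:
      relevancy  = sum_i w_i * I(f_i; y)             (to maximize)
      redundancy = sum_{i<j} w_i * w_j * I(f_i; f_j) (to minimize)
    """
    n = X.shape[1]
    rel = sum(weights[i] * mutual_information(X[:, i], y) for i in range(n))
    red = sum(
        weights[i] * weights[j] * mutual_information(X[:, i], X[:, j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return rel, red


def pareto_front(points):
    """Indices of (relevancy, redundancy) pairs that no other pair dominates."""
    keep = []
    for i, (r_i, d_i) in enumerate(points):
        dominated = any(
            (r_j >= r_i and d_j <= d_i) and (r_j > r_i or d_j < d_i)
            for j, (r_j, d_j) in enumerate(points)
            if j != i
        )
        if not dominated:
            keep.append(i)
    return keep


# Toy data: feature 0 equals the label, feature 1 duplicates it, feature 2 is noise.
rng = np.random.default_rng(0)
y = np.array([0, 1] * 50)
X = np.stack([y, y, rng.integers(0, 2, size=100)], axis=1)

candidates = rng.random((30, 3))  # random weight vectors in [0, 1]
scores = [objectives(w, X, y) for w in candidates]
front = pareto_front(scores)
```

On this toy data, giving high weight to both duplicated features raises relevancy but also raises redundancy, so the non-dominated set contains weight vectors that trade one objective against the other.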

N Wang, K-H Ho, G Pavlou (2008) Adaptive multi-topology IGP based traffic engineering with near-optimal network performance, In: A Das, HK Pung, FBS Lee, LWC Wong (eds.), NETWORKING 2008: AD HOC AND SENSOR NETWORKS, WIRELESS NETWORKS, NEXT GENERATION INTERNET, PROCEEDINGS 4982, pp. 654-666