Dr Suparna De
Academic and research departments
Surrey Institute for People-Centred Artificial Intelligence (PAI), School of Computer Science and Electronic Engineering.
About
Biography
Suparna De is a Lecturer in the Department of Computer Science, a Surrey AI Fellow in the Surrey Institute for People-Centred AI and an Honorary Senior Research Fellow in the Social Research Institute at UCL, UK. She is also a visiting lecturer at the University of Granada, Spain, teaching on the Master's degree in Software Development, Data Processing and Analysis - Representation and Processing of Information and Semantic Web (RTIWS: Representación y Tratamiento de la Información y Web Semántica).
She is a member of the Nature Inspired Computing and Engineering (NICE) research group.
She completed her PhD and MSc (with distinction) degrees in the Department of Electronic Engineering at the University of Surrey.
She serves on the editorial board of Nature Scientific Reports (Computational Science division) and Elsevier High-Confidence Computing journals.
University roles and responsibilities
- Computer Science Admissions Tutor (UG)
Research
Research interests
Suparna's research applies machine learning and Semantic Web technologies to the broad domain of knowledge and data engineering, including deep learning for text data (derived from social networks and longitudinal social science datasets), semantic modelling and search and IoT data analytics.
Her current work focusses on researching machine comprehension and information extraction for longitudinal text, applied to metadata extraction and uplift for social science questionnaires, and adaptive privacy-preserving models for online social networks.
Previous work has included research in Big Data analytics, visualisation and data fusion techniques for understanding city dynamics from multimodal spatio-temporal data.
Research projects
ESRC-funded, (Surrey) Principal-Investigator (PI), April 2024 - March 2025.
This one-year, £757k project is a multi-disciplinary collaboration between social science and computer science, with partners from the University of Surrey, CLOSER, UCL, ScotCen and the UK Data Archive (UKDA) at the University of Essex.
This project aims to develop novel ML models tailored to the specific challenges of semantically rich survey data collection and research datasets. The project is aimed at the alignment of both structural (standards) and semantic metadata (controlled vocabularies and conceptual frameworks), where the output from the developed ML models will be used to create metadata resources. These metadata resources, realised as knowledge graphs of the questionnaire items, will extend recent advances in text-layout ML models. Additionally, the project contributes to enabling automated risk disclosure assessments in large UKDA datasets by developing novel algorithms for question text equivalence, addressing the challenge of semantic shifts in large longitudinal studies. The project meets the evolving needs of researchers from a range of disciplines who utilise longitudinal population survey (LPS) and other survey data.
EPSRC-funded, Co-Investigator (Co-I), April 2022 - March 2025.
This three-year, £3.4 million project will produce Privacy-Enhancing Technologies (PETs) to support the online privacy and safety of people navigating significant life transitions. The project brings together a multi-disciplinary team of experts in Cybersecurity, Psychology, Law, Business, and Criminology from the Universities of Cambridge, Edge Hill, Edinburgh, Queen Mary University of London, Strathclyde and Surrey.
KTP - Predictive Maintenance for Rail Infrastructure
InnovateUK-funded, Co-Principal Investigator (PI) and Academic Lead, August 2023 - Jan 2026. Grant number: 10054741
This 30-month, £212k Knowledge Transfer Partnership (KTP) grant, funded by Innovate UK, will develop novel machine learning algorithms and models for a cloud-based predictive maintenance platform. The platform also enables real-time monitoring and risk prediction using IoT multi-sensors that sense critical parts of the rail infrastructure (i.e. tracks, bridges, structures supporting overhead lines).
Understanding the multiple dimensions of prediction of concepts in social and biomedical science questionnaires
Science and Technology Facilities Council (STFC) DiRAC-funded, PI, Oct. '21 - March '22.
Part of Grant Number: ST/S003916/1
This grant is a collaboration with CLOSER (Cohort and Longitudinal Studies Enhancement Resources), UCL, and RITS (Research IT Services), UCL. The project investigates various dimensions of concept prediction against a range of different types of unseen data from the UK Data Archive's longitudinal studies (e.g. social science vs biomedical, 1995 vs 2015), and builds an understanding of how prediction rates differ by category. Hierarchical approaches for topic classification against the European Language Social Science Thesaurus (ELSST) were developed.
ESRC-funded, PI, Feb. 2021 - Feb. 2022.
Total funding amount: £81,500. Part of Grant Number: ES/K000357/1
This grant is a collaboration with CLOSER, UCL. The project investigates automated capture of metadata from the CLOSER longitudinal population studies. Automated extraction of questions from paper questionnaires will form part of a pipeline to populate question banks and other metadata repositories. It will provide a low-cost alternative to the manual processes undertaken by the CLOSER project, UKDA and other archives to enhance survey metadata alongside the data description.
Science and Technology Facilities Council (STFC) DiRAC-funded, PI, Feb. - Dec. 2021. Part of Grant Number: ST/S003916/1.
This grant is a collaboration with CLOSER, UCL, and RITS, UCL. The project will utilise the questions and linked concepts (based on the European Language Social Science Thesaurus (ELSST)) held in CLOSER Discovery. The aim is to create a model that can classify existing questions against these ELSST concepts and predict concepts for new questions.
TagItSmart (2015-18) is a Smart Tags-driven service platform for enabling ecosystems of connected objects that dynamically change their status in response to a variety of environmental factors and can be seamlessly tracked during their lifecycle. TagItSmart is developing smart tags that combine the power of functional inks with the pervasiveness of digital and electronic markers, e.g. dynamic QR codes and NFC tags, in order to capture new contextual information. In addition, the ubiquitous presence of smartphones with cameras and NFC readers facilitates seamless observation measurements and lifecycle tracking of smart tag big data.
I design and develop semantic models and reasoning algorithms for capturing the characteristics of the Smart Tags and to provide decision-support mechanisms for connecting their lifecycle data to semantically-enabled workflows.
The EU H2020 frontierCities2 project (December 2016 - November 2018) is an acceleration and incubation programme that aims to accelerate market uptake of the FIWARE generic enablers in the Internet of Things and Smart Cities domain by targeting SMEs and startups to develop FIWARE-enabled smart mobility and smart city solutions.
I work together with the FIWARE Foundation to deliver technical coaching to the startups and lead the work-package tasked with further developing the FIWARE enablers and support mechanisms.
The iKaaS project (Oct 2015 - Oct 2017), jointly funded by the EC H2020 programme and MIC, Japan, delivers a secure, robust and scalable multi-cloud platform that brings together the paradigms of IoT, Big Data and the Cloud.
I researched aspects of data analysis in smart city platforms built on heterogeneous cloud platforms. As part of this, we developed novel anomaly detection algorithms for city environmental features (such as measured air pollution levels) and real-time detection of city social events by analysing Twitter feeds. The research delivered tools to discover correlation between large-scale city events and anomalies detected in pollution levels.
The IoT.est project (2011-14), funded by the EU FP7 Programme, developed a test-driven service creation environment for Internet of Things enabled business services. I served as the work-package leader for semantic annotation and large-scale discovery of IoT services, as well as the University of Surrey technical coordinator. I was also in charge of the proof-of-concept project demonstrator that integrated modules from project partners.
The iCore project (2011-14), funded by the EU FP7 Programme, defined the concept of Virtual Objects (VO) to abstract the technological heterogeneity derived from the vast amounts of heterogeneous objects and devices forming part of the IoT. I researched the dynamic association derivation between ICT and real-world objects and service workflow composition in iCore.
The IoT-A project (2010-13), funded by the EU FP7 Programme, was the lighthouse EU project that defined a reference architecture and model for the IoT, including defining its constituent concepts such as entity, resource and IoT service. I was the deputy leader of the work package that researched various mechanisms for resolution frameworks for the IoT.
MVCE Core 4
I worked as a PhD researcher on the Mobile VCE Core 4 Programme on Ubiquitous Services (funded by the UK Technology Strategy Board), work area: ontology-based context management for mobile systems.
Indicators of esteem
Scholarships:
- Overseas Research Student Sponsorship for PhD research: University of Surrey and MobileVCE Core 4 programme
- DFIDSSS Scholarship: jointly funded by the University of Surrey and the British Commonwealth Scholarship Commission for MSc programme.
Awards:
- Cable and Wireless Award: University of Surrey, for the best overall performance from a student graduating with an MSc in Satellite Communication Engineering or Communications Networks and Software
- IET Certificate in recognition of significant contribution to IET On Campus at the University of Surrey
Supervision
Postgraduate research supervision
Postdoctoral Research Fellows (Line manager):
Dr. Chandresh Pravin, "NLP and Text-layout LLMs", 2024 - present, funded by UKRI ESRC.
Dr. Justina Li, "Longitudinal NLP", 2024 - present, funded by UKRI ESRC.
Principal PhD Supervisor: Zeqiang Wang, "Natural Language Processing for Longitudinal Social and Biomedical Science Datasets", (Oct. 2023 - present).
PhD Co-supervisor: Chao Jiang, "Improving inference of Large Language Models", (Jan. 2024 - present).
Collaborative PhD supervisor: Yuqi Wang, Xi’an Jiaotong-Liverpool University, China (Oct. 2021 - present).
Completed postgraduate research projects I have supervised
Co-supervisor (2015-18): Yuchao Zhou: Data-driven Cyber-Physical-Social System for Knowledge Discovery in Smart Cities.
Co-supervisor (2021-22): Tarek Elsaleh, Semantic Data Management for Dynamic Internet of Things (IoT) Services, PhD by published works.
Collaborative PhD supervisor (2017-21): Qi Chen: Distributed Intelligence for Big Smart City Data Processing, Xi’an Jiaotong-Liverpool University, China.
Teaching
2023-24, Spring semester:
Parallel Computing: module lead.
Computer Networks: module lead.
Professional Project (BSc final year project) and MSc dissertation: academic supervisor