
Girish Arun Koushik
Academic and research departments
Nature Inspired Computing and Engineering Research Group, Computer Science Research Centre, School of Computer Science and Electronic Engineering.
About
My research project
A Digitally Resilient Framework for Hate Speech Detection
Hateful comments and posts on social media platforms have become an important issue in recent times. Such hate speech can be directed against people based on their race, ethnicity, religion, sex, sexual orientation, disability or even their political stance. Hate speech can occur in various languages through text, audio, and visual modalities, making its automatic detection challenging for artificial intelligence (AI) algorithms. This project investigates the detection of hate speech in combined multimodal and multilingual settings. This will be achieved by developing state-of-the-art AI algorithms which combine textual, auditory, and visual modalities. These algorithms will lead to a resilient framework providing a holistic approach to analysing hate speech in various forms and languages. The resulting framework shall be open-sourced to allow further academic research and deployment by social media platforms.
Supervisors
Publications
Social media platforms enable the propagation of hateful content across textual, auditory, and visual modalities, necessitating effective detection methods. While recent approaches have shown promise in handling individual modalities, their effectiveness across different modality combinations remains unexplored. This paper presents a systematic analysis of fusion-based approaches for multimodal hate detection, focusing on their performance across video and image-based content. Our comprehensive evaluation reveals significant modality-specific limitations: while simple embedding fusion achieves state-of-the-art performance on video content (HateMM dataset) with a 9.9 percentage-point improvement in F1-score, it struggles with complex image-text relationships in memes (Hateful Memes dataset). Through detailed ablation studies and error analysis, we demonstrate how current fusion approaches fail to capture nuanced cross-modal interactions, particularly in cases involving benign confounders. Our findings provide crucial insights for developing more robust hate detection systems and highlight the need for modality-specific architectural considerations. The code is available at https://github.com/gak97/Video-vs-Meme-Hate.
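As a rough illustration of what "simple embedding fusion" can mean, the sketch below concatenates per-modality embeddings into one vector and scores it with a linear layer. This is an assumption about the general technique (early fusion by concatenation), not the paper's actual architecture; the function names, the L2 normalisation step, and the toy weights are all hypothetical.

```python
from typing import List, Sequence


def l2_normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = sum(x * x for x in vec) ** 0.5
    return [x / norm for x in vec] if norm > 0 else list(vec)


def fuse_embeddings(
    text_emb: Sequence[float],
    audio_emb: Sequence[float],
    visual_emb: Sequence[float],
) -> List[float]:
    """Early fusion (hypothetical helper): normalise each modality, then concatenate."""
    fused: List[float] = []
    for emb in (text_emb, audio_emb, visual_emb):
        fused.extend(l2_normalize(emb))
    return fused


def linear_score(fused: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Score the fused vector with a single linear layer (weights are illustrative)."""
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused embedding length")
    return sum(w * x for w, x in zip(weights, fused)) + bias


# Toy embeddings standing in for the outputs of real text, audio, and vision encoders.
fused = fuse_embeddings([3.0, 4.0], [0.0, 0.0], [1.0])
score = linear_score(fused, weights=[1.0, 1.0, 0.5, 0.5, -1.0])
```

Because the modalities are only concatenated, any interaction between them has to be learned by the layers that follow, which is one plausible reason such a scheme could miss the image-text interplay that memes depend on.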
Reinforcement Learning (RL) mathematically formulates decision-making as a Markov Decision Process (MDP). With MDPs, researchers have achieved remarkable breakthroughs across various domains, including games, robotics, and language models. This paper seeks a new possibility, Natural Language Reinforcement Learning (NLRL), by extending the traditional MDP to a natural language-based representation space. Specifically, NLRL innovatively redefines RL principles, including task objectives, policy, value function, Bellman equation, and policy iteration, into their language counterparts. With recent advancements in large language models (LLMs), NLRL can be practically implemented to achieve RL-like policy and value improvement by either pure prompting or gradient-based training. Experiments on Maze, Breakthrough, and Tic-Tac-Toe games demonstrate the effectiveness, efficiency, and interpretability of the NLRL framework across diverse use cases. Our code will be released at https://github.com/waterhorse1/Natural-language-RL.
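For readers unfamiliar with the traditional RL concepts that the abstract says NLRL recasts in language, the sketch below shows the standard tabular form of the Bellman expectation equation used for policy evaluation. It is ordinary numeric RL, not the NLRL method itself; the toy MDP, the state and action names, and the `evaluate_policy` helper are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# transitions[state][action] -> list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]
Policy = Dict[str, Dict[str, float]]  # policy[state][action] -> probability


def evaluate_policy(
    transitions: Transitions,
    policy: Policy,
    gamma: float = 0.9,
    tol: float = 1e-10,
) -> Dict[str, float]:
    """Iterate V(s) = sum_a pi(a|s) sum_s' P(s'|s,a) [r + gamma * V(s')] to convergence."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            # A state with no actions is terminal, so its value stays 0.
            new_v = sum(
                policy[s][a] * sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for a, outcomes in actions.items()
            )
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < tol:
            return values


# Hypothetical toy MDP: "start" moves to a terminal state, "loop" repeats forever.
toy_mdp: Transitions = {
    "start": {"go": [(1.0, "end", 1.0)]},
    "loop": {"stay": [(1.0, "loop", 1.0)]},
    "end": {},
}
toy_policy: Policy = {"start": {"go": 1.0}, "loop": {"stay": 1.0}}
values = evaluate_policy(toy_mdp, toy_policy)
```

In this numeric setting the value of a state is a single number; the abstract describes NLRL as replacing such quantities and their updates with natural language counterparts.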