Dr Félix do Carmo


Senior Lecturer in Translation and Natural Language Processing
BA, MA, PhD - University of Porto (Portugal)
+44 1483 684118
09 LC 03
Office hours: Wednesdays & Fridays, from 11am to 12 noon.

About

Areas of specialism

Translation technologies; Post-editing of machine translation; Translation process research; Automatic post-editing; Applied translation studies; Natural language processing; Ethical and fair uses of technologies

My qualifications

1992
BA in Modern Languages (ENG-PTG), professionalisation in Translation
University of Porto
1998
MA in Translation Studies
University of Porto
2017
PhD in Language Sciences
University of Porto

Research

Research interests

Research projects

Supervision

Postgraduate research supervision

Teaching

Publications

Highlights

Chapters in books

do Carmo, Félix, and Belinda Maia. 2015. “Sleeping with the enemy? Or should translators work with Google Translate?” In Pilar Sánchez-Gijón, Olga Torres-Hostench, and Bartolomé Mesa-Lao (eds.). Conducting Research in Translation Technologies. New Trends in Translation Studies, vol. 13. Peter Lang.

Articles in peer-reviewed journals


Conference proceedings

Shterionov, Dimitar, Félix do Carmo, and Joachim Wagner. 2019. “APE through Neural and Statistical MT with Augmented Data - ADAPT/DCU Submission to the WMT 2019 APE Shared Task.” In Proceedings of ACL 2019 - WMT Shared Task on Automatic Post-Editing. Firenze, Italy.

Shterionov, Dimitar, Félix do Carmo, Joss Moorkens, Eric Paquin, Dag Schmidtke, Declan Groves, and Andy Way. 2019. “When Less Is More in Neural Quality Estimation of Machine Translation - an Industry Case.” In Proceedings of the MT Summit XVII. Dublin.

do Carmo, Félix. 2019. “Edit distances do not describe editing, but they can be useful for translation process research.” In Carl, M. and Hansen-Schirra, S. (eds.). Proceedings of the 2nd MEMENTO workshop on Modelling Parameters of Cognitive Effort in Translation Production. Dublin, Ireland. pp. 1–2. (Abstract).

do Carmo, Félix. 2018. “Does Machine Translation Really Produce Translations?” In Proceedings of the 21st Annual Conference of the European Association for Machine Translation - Translator’s Track, edited by Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Miquel Esplà-Gomis, Maja Popović, Célia Rico Pérez, André Martins, Joachim Van den Bogaert, and Mikel L. Forcada. Alicante, Spain: EAMT. p. 323. (Abstract).

do Carmo, Félix, Luís Trigo, and Belinda Maia. 2016. “From CATs to KATs.” In Proceedings of the 38th Conference Translating and the Computer. London, UK: Editions Tradulex, Geneva. pp. 149–158.

do Carmo, Félix, and Belinda Maia. 2016. “A Description of Post-Editing, from Translation Studies to Machine Learning.” In Tradumàtica Research Group (eds.). Translators and Machine Translation: Book of Presentations. Barcelona, Spain. pp. 126–152.

Sarah Herbert, Félix do Carmo, Joanna Gough. Results from project JAS (Job Allocation System). University of Surrey

This dataset includes the data collected as part of project JAS, a study on the effects of automating a job allocation system in a translation services company. The dataset includes counts of answers to a questionnaire completed by 38 participants. Answers to closed questions are classified according to closed classes, and answers to open questions according to thematic codes. See the readme file for more details, and read the article "From responsibilities to responsibility: a study of the effects of translation workflow automation", due to be published in JoSTrans, Issue 40 (July 2023).

Félix do Carmo, Joss Moorkens (2022) Translation's new high-tech clothes, In: The Human Translator in the 2020s. Routledge

Hadeel Saadany, Constantin Orasan, Rocio Caro Quintana, Felix do Carmo, Leonardo Zilio. Challenges in Translation of Emotions in Multilingual User-Generated Content: Twitter as a Case Study, In: arXiv (Cornell University)

Linguistik International 2020

Although emotions are universal concepts, transferring the different shades of emotion from one language to another may not always be straightforward for human translators, let alone for machine translation systems. Moreover, cognitive states are established by verbal explanations of experience, which is shaped by both the verbal and cultural contexts. There are a number of verbal contexts where the expression of emotions constitutes the pivotal component of the message. This is particularly true for User-Generated Content (UGC), which can be in the form of a review of a product or a service, a tweet, or a social media post. Recently, it has become common practice for multilingual websites such as Twitter to provide an automatic translation of UGC to reach out to their linguistically diverse users. In such scenarios, the process of translating the user's emotion is entirely automatic, with no human intervention for either post-editing or accuracy checking. In this research, we assess whether automatic translation tools can be a successful real-life utility in transferring emotion in user-generated multilingual data such as tweets. We show that there are linguistic phenomena specific to Twitter data that pose a challenge in the translation of emotions in different languages. We summarise these challenges in a list of linguistic features and show how frequent these features are in different language pairs. We also assess the capacity of commonly used methods for evaluating the performance of an MT system with respect to the preservation of emotion in the source text.

Shenbin Qian, Constantin Orasan, Diptesh Kanojia, Hadeel Saadany, Felix do Carmo (2022) SURREY-CTS-NLP at WASSA2022: An Experiment of Discourse and Sentiment Analysis for the Prediction of Empathy, Distress and Emotion, In: Proceedings of the 12th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis, pp. 271-275. Association for Computational Linguistics (ACL)

This paper summarises the submissions our team, SURREY-CTS-NLP, has made for the WASSA 2022 Shared Task for the prediction of empathy, distress and emotion. In this work, we tested different learning strategies, like ensemble learning and multi-task learning, as well as several large language models, but our primary focus was on analysing and extracting emotion-intensive features from both the essays in the training data and the news articles, to better predict empathy and distress scores from the perspective of discourse and sentiment analysis. We propose several text feature extraction schemes to compensate for the small number of training examples available for fine-tuning pretrained language models, including methods based on Rhetorical Structure Theory (RST) parsing, cosine similarity and sentiment score. Our best submissions achieve an average Pearson correlation score of 0.518 for the empathy prediction task and an F1 score of 0.571 for the emotion prediction task, indicating that using these schemes to extract emotion-intensive information can help improve model performance.

Félix do Carmo, Joss Moorkens, Felix Emanuel Martins Do Carmo (2020) Differentiating Editing, Post-Editing and Revision, In: Routledge eBooks, pp. 35-49. Informa

Dimitar Shterionov, Felix do Carmo, Joss Moorkens, Murhaf Hossari, Joachim Wagner, Eric Paquin, Dag Schmidtke, Declan Groves, Andy Way (2020) A roadmap to neural automatic post-editing: an empirical approach, In: Machine Translation 34(2-3), pp. 67-96. Springer Nature

In a translation workflow, machine translation (MT) is almost always followed by a human post-editing step, where the raw MT output is corrected to meet required quality standards. To reduce the number of errors human translators need to correct, automatic post-editing (APE) methods have been developed and deployed in such workflows. With the advances in deep learning, neural APE (NPE) systems have outranked more traditional, statistical, ones. However, the plethora of options, variables and settings, as well as the relation between NPE performance and train/test data makes it difficult to select the most suitable approach for a given use case. In this article, we systematically analyse these different parameters with respect to NPE performance. We build an NPE "roadmap" to trace the different decision points and train a set of systems selecting different options through the roadmap. We also propose a novel approach for APE with data augmentation. We then analyse the performance of 15 of these systems and identify the best ones. In fact, the best systems are the ones that follow the newly-proposed method. The work presented in this article follows from a collaborative project between Microsoft and the ADAPT centre. The data provided by Microsoft originates from phrase-based statistical MT (PBSMT) systems employed in production. All tested NPE systems significantly increase the translation quality, proving the effectiveness of neural post-editing in the context of a commercial translation workflow that leverages PBSMT.

Félix do Carmo, Dimitar Shterionov, Joss Moorkens, Joachim Wagner, Murhaf Hossari, Eric Paquin, Dag Schmidtke, Declan Groves, Andy Way (2021) A review of the state-of-the-art in automatic post-editing, In: Machine Translation 35(2), pp. 101-143. Springer Netherlands

This article presents a review of the evolution of automatic post-editing, a term that describes methods to improve the output of machine translation systems, based on knowledge extracted from datasets that include post-edited content. The article describes the specificity of automatic post-editing in comparison with other tasks in machine translation, and it discusses how it may function as a complement to them. Particular detail is given in the article to the five-year period that covers the shared tasks presented in WMT conferences (2015–2019). In this period, discussion of automatic post-editing evolved from the definition of its main parameters to an announced demise, associated with the difficulties in improving output obtained by neural methods, which was then followed by renewed interest. The article debates the role and relevance of automatic post-editing, both as an academic endeavour and as a useful application in commercial workflows.

Felix do Carmo (2020) 'Time is money' and the value of translation, In: Translation Spaces 9(1), pp. 35-57. John Benjamins Publishing Co

This article uses a multi-faceted approach to discuss the relation between time, money and different perspectives that help define the value of professional translation. It challenges the narratives created by the translation industry on post-editing as a revision of pre-translated content, confronting them with the detailed description of the task in industry standards and with the reality of translators' work. The article also addresses the different roles that time plays as an instrument of analysis and evaluation of translation, and as a fundamental factor in the definition of labour relations in the translation market. The main claim of the article is that translation is increasingly specialised high-value work, requiring translators that are able to make complex and efficient decisions, especially when they are expected to work under time restrictions, with the support of content that has been previously processed by machine translation.

Dorothy Kenny, Joss Moorkens, Felix do Carmo (2020) Fair MT: Towards ethical, sustainable machine translation (Introduction), In: Translation Spaces 9(1), pp. 1-11. John Benjamins Publishing Co

Pilar Sanchez-Gijon, Bartolome Mesa-Lao, Olga Torres-Hostench, Felix Emanuel Martins Do Carmo (2015) Conducting research in translation technologies. Peter Lang

The literature on translation and technology has generally taken two forms: general overviews, in which the tools are described, and functional descriptions of how such tools and technologies are implemented in specific projects, often with a view to improving the quality of translator training. There has been far less development of the deeper implications of technology in its cultural, ethical, political and social dimensions. In an attempt to address this imbalance, the present volume offers a collection of articles, written by leading experts in the field, that explore some of the current communicational and informational trends that are defining our contemporary world and impinging on the translation profession. The contributions have been divided into three main areas in which translation and technology come together: (1) social spheres, (2) education and training and (3) research. This volume represents a bold attempt at contextualizing translation technologies and their applications within a broader cultural landscape and encourages intellectual reflection on the crucial role played by technology in the translation profession.

Maarit Koponen, Brian Mossop, Isabelle S Robert, Giovanna Scocchera, Felix Emanuel Martins Do Carmo (2020) Translation revision and post-editing: industry practices and cognitive processes. Routledge

Translation Revision and Post-editing looks at the apparently dissolving boundary between correcting translations generated by human brains and those generated by machines. It presents new research on post-editing and revision in government and corporate translation departments, translation agencies, the literary publishing sector and the volunteer sector, as well as on training in both types of translation checking work. This collection includes empirical studies based on surveys, interviews and keystroke logging, as well as more theoretical contributions questioning such traditional distinctions as translating versus editing. The chapters discuss revision and post-editing involving eight languages: Afrikaans, Catalan, Dutch, English, Finnish, French, German and Spanish. Among the topics covered are translator/reviser relations and revising/post-editing by non-professionals. The book is key reading for researchers, instructors and advanced students in Translation Studies as well as for professional translators with a special interest in checking translations.

The amazing capacities of machine translation are supported by very rigorous and powerful research. However, science is also discourse, and sometimes scientific discourse creates myths, beliefs that are based on how terms and concepts may be used in scientific publications with no proper debate or understanding. In this lecture, I will present a critical view of three of the most influential papers from machine translation research, not criticising their scientific validity, but highlighting how their use of terms and concepts helped create myths around the power of machine translation. My perspective is that translation is much more complex than what common discourses about machine translation convey, and that we are losing sight of that complexity when we focus on the scientific achievements. My objective is to contribute to real convergence between machine translation research and translation studies by presenting a view that aims at solving current limitations of discussions about translation. I believe that real convergence can only be fruitful if translation studies contributes to the debate, bringing with it the power of a rich legacy of theories and practices that help us all understand the complexity of translation.

Additional publications