Deep Learning for Audio-Visual Scene Analysis

A fully funded PhD studentship is available for UK and international applicants in the area of Deep Learning for Audio-Visual Scene Analysis.

Start date

1 April 2025

Duration

3 years

Application deadline

Funding information

  • Full UK/EU/International tuition fees are covered for 3 years
  • Stipend of £19,237 p.a. (2024/25 rate) for 3 years initially, with a possible extension of up to 6 months. The stipend will increase each year in line with the UK Research and Innovation (UKRI) rate
  • International students are also welcome to apply
  • For exceptional international candidates, there is the possibility of obtaining a scholarship to cover overseas fees.

About

The University of Surrey is offering a fully funded PhD studentship on the topic of Deep Learning for Audio-Visual Scene Analysis, with industrial partner Bang & Olufsen. This project aims to develop new deep learning methods for audio-visual scene analysis in a smart home environment. This will involve the use of heterogeneous sensors, e.g. microphones and cameras, for analysing the various sources and events present in the acoustic environment. Tasks to be considered include audio-visual source separation, localization/tracking, and audio-visual event detection/recognition.

The successful candidate will be supervised by Professor Wenwu Wang and Professor Philip Jackson in the Centre for Vision, Speech and Signal Processing (CVSSP) and the Surrey Institute for People-Centred Artificial Intelligence at the University of Surrey. The PhD student will be based at CVSSP and will benefit from the resources of CVSSP and the Surrey Institute for People-Centred AI, as well as potential secondment opportunities at Bang & Olufsen.

An earlier start in January 2025 is available for UK students only.

Eligibility criteria

Open to any UK or international candidates.

You will need to meet the minimum entry requirements for our Vision, Speech and Signal Processing PhD programme.

All applicants should have (or expect to obtain) a first-class degree in a numerate discipline (mathematics, science or engineering) or MSc with Distinction (or 70% average) and a strong interest in pursuing research in this field. Additional experience which is relevant to the area of research is also advantageous.

English language requirements

IELTS Academic 6.5 or above (or equivalent), with at least 6.0 in each individual category.

How to apply

Applications should be submitted via our Vision, Speech and Signal Processing PhD programme page. In place of a research proposal, you should upload a document stating the title of the project that you wish to apply for and the name of the relevant supervisor.

Studentship FAQs

Read our studentship FAQs to find out more about applying and funding.

Contact details

Wenwu Wang
06 BB 01
Telephone: +44 (0)1483 686039
E-mail: W.Wang@surrey.ac.uk

Philip Jackson
07 BB 01
Telephone: +44 (0)1483 686044
E-mail: p.jackson@surrey.ac.uk
