PhD Vision, Speech and Signal Processing
At the Centre for Vision, Speech and Signal Processing (CVSSP), we’re developing exciting and ground-breaking technologies. These include facial recognition for security and medical imaging, and 3D spatial audio and 3D reconstruction from video for visual-effects production in films, games and virtual reality.
Why choose this programme?
Our research is creating machines that can see, hear and understand the world around them.
CVSSP is one of the largest audio and vision research groups in Europe, and we’re internationally recognised for our pioneering research and novel technologies in audio-visual machine perception. We bring together a unique combination of cutting-edge sound and vision expertise with more than 170 researchers, and we currently hold more than £30 million in research grants.
Our research aims to advance the state of the art in audio-visual signal processing, computer vision and machine learning, with a focus on image, video and audio applications. We have expertise and activities in computer vision, digital signal processing, machine learning and artificial intelligence, computer graphics and human-computer interaction, and data science for media, medical image analysis and multimedia communication.
We often collaborate with industry, and our research has led to frequent technology transfer and exploitation. Previous research projects have resulted in award-winning spin-out companies in biometrics, communications, medical technologies and the creative industries.
In the Research Excellence Framework (REF) 2021, the University of Surrey was ranked 15th in the UK for research power for engineering and in the top 20 in the UK for the overall quality of research outputs (research papers and other published works).
Studentships
Deep Learning for Audio-Visual Scene Analysis
- Full UK/EU/international tuition fees are covered for 3 years.
- Stipend of £19,237 per annum (2024/25) for 3 years initially, which can be extended for up to 6 months. The stipend will increase each year in line with the UK Research and Innovation (UKRI) rate.
- International students are also welcome to apply.
- For exceptional international candidates, there is the possibility of obtaining a scholarship to cover overseas fees.
What you will study
Our PhD programme takes a maximum of four years of full-time study to complete. After 12 months, you’ll write a confirmation report, which is assessed by two independent examiners. You’ll then submit a written PhD thesis after a minimum of three years of full-time study.
You’ll be allocated two Surrey-based academic supervisors, in addition to any external collaborative supervisors. Your principal supervisor will be an expert in your area of research and will monitor your research progress on a regular basis. Your supervisors will help you define the initial objective and scope of your research, and refine these as your project evolves. They’ll direct you to resources and advise you on how to complete your PhD and your thesis. We often appoint external collaborative supervisors to contribute specific expertise or to allow access to external resources or organisations.
You’ll also be assigned to a research group that includes a team of academics, postdoctoral researchers, guest scientists and fellows. Much of our research at CVSSP is interdisciplinary and you’ll have the opportunity to collaborate with scientists at universities, research establishments and industries around the world. We encourage active interaction with your peers and our researchers and academics, and we offer a friendly environment that nurtures openness and collaboration. You’ll be encouraged to present your research at renowned national and international conferences to gain experience and establish networks with leading researchers.
Our monthly seminars are open to all postgraduate researchers and we host leading experts from other institutions within the UK and from overseas, who give talks to members of CVSSP in specific areas of research. In addition, our postgraduate researchers attend regular internal seminars, where you’ll be able to present your individual research or practise presentations you’d like to give at conferences or events.
The University also holds an annual postgraduate researcher conference on campus, where you’ll be able to showcase your work and network with other researchers and academics.
You may be eligible to apply for membership with the British Machine Vision Association, the Audio Engineering Society, the British Computer Society and the Institution of Engineering and Technology. You may also apply for chartered engineer status with the Engineering Council UK and with the Institute of Electrical and Electronics Engineers.
Assessment
Your final assessment will be based on the presentation of your research in a written thesis, which will be discussed in a viva examination with at least two examiners. You have the option of preparing your thesis as a monograph (one large volume in chapter form) or in publication format (including chapters written for publication), subject to the approval of your supervisors.
Location
Stag Hill is the University's main campus and where the majority of our courses are taught.
Research themes
Our research activities within CVSSP are grouped into the following areas:
- Creative vision and sound focuses on machine perception, spatial audio and machine audition, specialising in 4D immersive VR content production for film and games, including performance capture, animation, visual action recognition and audio-visual scene understanding
- Healthcare focuses on medical imaging technologies and works closely with leading healthcare institutions
- Robotics focuses on autonomous systems, covering a broad range of technologies relating to visual interaction with computers, including sign language recognition and autonomous vehicles
- Security focuses on biometric and security-related technologies, specialising in facial biometrics and lip tracking
- Data focuses on understanding and preservation including visual recognition, distributed ledger technologies and the understanding of AI systems
- Distributed ledger technologies (DLT) focuses on alternative uses for DLT, including safe online identity, healthcare and secure digital archives.
Our six research themes cover a range of topics, including, but not limited to, the following:
- Computer vision
- Machine learning
- Robotics and autonomous systems
- 3D and 4D video
- 3D spatial audio
- Biometrics
- Blind source separation
- Coding and transmission
- Facial analysis
- Human motion analysis
- Interfaces/visual interaction
- Media adaptation
- Media networking
- Medical image acquisition
- Medical image analysis
- Quality of experience
- Sign and gesture analysis
- Speech and audio processing
- Surveillance
- Video and audio retrieval.
See a full list of all our academic staff within the Centre for Vision, Speech and Signal Processing.
Research support
The professional development of postgraduate researchers is supported by the Doctoral College, which provides training in essential skills through its Researcher Development Programme of workshops, mentoring and coaching. A dedicated postgraduate careers and employability team will help you prepare for a successful career after the completion of your PhD.
Facilities
We host a number of cutting-edge laboratory facilities in CVSSP to support the exciting research being carried out by all its members and associates.
4D computer vision
The Audio-Visual Lab hosts a state-of-the-art capture studio with unique multiple UltraHD cameras supporting research in real-time audio-visual processing and visualisation. We collaborate with companies specialising in film, TV, games, and virtual and augmented reality.
Spatial audio and machine audition
The Audio Lab facilities include a purpose-built audio booth and the Surrey Sound Sphere, which is the centrepiece of our cutting-edge audio and acoustics research. It includes 64 Genelec speakers with audio interfaces, 48 configurable channels, a pre-amplified microphone array and an acoustically isolated audio booth.
The Surrey Sound Sphere has supported research on personal sound zones, human sound localisation and object-based 3D spatial audio as part of the S3A research collaboration. This is supported by the Engineering and Physical Sciences Research Council (EPSRC).
Biometrics and face recognition
We have state-of-the-art equipment supporting research into 3D face recognition, including facial feature types, emotion recognition and face models for biometrics. This was part of the FACER2VM EPSRC Programme Grant.
Robotics
Our Robot Lab supports research into autonomous systems, collaborative mapping, autonomous navigation and robotic machine learning, and links closely with our expertise in computer vision, artificial intelligence and machine learning. Facilities include a Baxter robot and various mobile robot platforms.
UK qualifications
Applicants are expected to hold a first or upper second-class (2:1) UK degree in a relevant discipline (or equivalent overseas qualification), or a lower second-class (2:2) UK degree plus a good UK master's degree, normally with distinction (or equivalent overseas qualification).
English language requirements
IELTS Academic: 6.5 or above (or equivalent) with 6.0 in each individual category.
These are the English language qualifications and levels that we can accept.
If you do not currently meet the level required for your programme, we offer intensive pre-sessional English language courses, designed to take you to the level of English ability and skill required for your studies here.
Selection process
Selection is based on applicants:
- Meeting the expected entry requirements
- Being shortlisted through the application screening process
- Completing a successful interview
- Providing suitable references.
Fees per year
Explore UKCISA’s website for more information if you are unsure whether you are a UK or overseas student. View the list of fees for all postgraduate research courses.
October 2025 - Full-time
- UK: To be confirmed
- Overseas: To be confirmed
October 2025 - Part-time
- UK: To be confirmed
- Overseas: To be confirmed
July 2025 - Full-time
- UK: £4,786
- Overseas: £26,200
July 2025 - Part-time
- UK: £2,393
- Overseas: £13,100
January 2026 - Part-time
- UK: To be confirmed
- Overseas: To be confirmed
January 2026 - Full-time
- UK: To be confirmed
- Overseas: To be confirmed
April 2025 - Full-time
- UK: £4,786
- Overseas: £26,200
April 2025 - Part-time
- UK: £2,393
- Overseas: £13,100
- Annual fees will increase by 4% for each year of study, rounded up to the nearest £100 (subject to legal requirements).
- Any start date other than September will attract a pro-rata fee for that year of entry (75 per cent for January, 50 per cent for April and 25 per cent for July).
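As a rough illustration only, the two fee rules in the notes above can be sketched in Python. This is not an official calculator: the function names `next_year_fee` and `first_year_fee` are hypothetical, and the sketch assumes that the 4% increase is applied to the previous year's fee before rounding up, that the pro-rata share is taken from the full annual fee, and that start months not named in the notes pay the full fee. Always check the published fee list for the amount you will actually be charged.

```python
from fractions import Fraction

# Pro-rata shares quoted in the fee notes, keyed by start month.
PRO_RATA = {
    "January": Fraction(75, 100),
    "April": Fraction(50, 100),
    "July": Fraction(25, 100),
}


def next_year_fee(fee_gbp: int) -> int:
    """Apply a 4% increase, then round up to the nearest £100 (assumed order)."""
    increased_x100 = fee_gbp * 104            # fee * 1.04, scaled by 100 to stay in integers
    hundreds = -(-increased_x100 // 10_000)   # ceiling division: number of £100 units
    return hundreds * 100


def first_year_fee(annual_fee_gbp: int, start_month: str) -> Fraction:
    """Pro-rata fee for the year of entry.

    Assumption: months not listed in PRO_RATA pay the full annual fee.
    """
    return annual_fee_gbp * PRO_RATA.get(start_month, Fraction(1))
```

Under these assumptions, `next_year_fee(4786)` gives 5000 and `next_year_fee(26200)` gives 27300. Integer and `Fraction` arithmetic are used so that the round-up step is not affected by floating-point error.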
Additional costs
There are additional costs that you can expect to incur when studying at Surrey.
Funding
A Postgraduate Doctoral Loan can help with course fees and living costs while you study a postgraduate doctoral course.
Studentships
Browse our frequently updated list of funded studentships open for applications.
Application process
Applicants are advised to contact potential supervisors before they submit an application via the website. Please refer to section two of our application guidance.
After registration
Students are initially registered for a PhD with probationary status and, subject to satisfactory progress, subsequently confirmed as having PhD status.
Need more information?
Contact our Admissions team or talk to a current University of Surrey student online.
Code of practice for research degrees
Surrey’s postgraduate research code of practice sets out the University's policy and procedural framework relating to research degrees. The code defines a set of standard procedures and specific responsibilities covering the academic supervision, administration and assessment of research degrees for all faculties within the University.
Download the code of practice for research degrees (PDF).
Terms and conditions
When you accept an offer to study at the University of Surrey, you are agreeing to follow our policies and procedures, student regulations, and terms and conditions.
We provide these terms and conditions in two stages:
- First when we make an offer.
- Second when students accept their offer and register to study with us (registration terms and conditions will vary depending on your course and academic year).
View our generic registration terms and conditions (PDF) for the 2023/24 academic year, as a guide to what to expect.
Disclaimer
This online prospectus has been published in advance of the academic year to which it applies.
Whilst we have done everything possible to ensure this information is accurate, some changes may happen between publishing and the start of the course.
It is important to check this website for any updates before you apply for a course with us. Read our full disclaimer.