Surrey AI research achieves world-leading technology for visual recognition of people

A team from the University of Surrey has created a unique and lightweight deep neural network that could prove to be the new standard in artificial intelligence (AI) applied to video surveillance.


AI is increasingly being used to help human operators handle massive amounts of images from CCTV and other security sources. Person re-identification (ReID) is a task in which an AI system recognises images of the same person taken by different cameras or on different occasions. This helps to track suspects across a CCTV network covering a large public space, such as an underground network. ReID is challenging for machines because they must recognise the same person under different lighting, in different poses and despite changes in appearance, such as different clothes.
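As a rough illustration of how ReID is commonly framed, the sketch below treats it as a ranking problem: each image is reduced to a feature vector, and images from other cameras are ordered by their similarity to a query image. The three-dimensional vectors, the camera and person labels, and the choice of cosine similarity are all assumptions made for this example; they are not taken from the OSNet paper, and a real system would obtain its feature vectors from a trained network.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_gallery(query, gallery):
    """Return gallery ids ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query, feat), gid) for gid, feat in gallery.items()]
    scored.sort(reverse=True)
    return [gid for _, gid in scored]


# Hypothetical feature vectors; a real model would produce much longer ones.
query = [0.9, 0.1, 0.3]  # person seen on camera 1
gallery = {
    "cam2_person_a": [0.8, 0.2, 0.4],
    "cam2_person_b": [0.1, 0.9, 0.2],
    "cam3_person_a": [0.85, 0.15, 0.25],
}

ranking = rank_gallery(query, gallery)
print(ranking)
```

In this toy data the two "person_a" images rank above "person_b" because their vectors point in nearly the same direction as the query.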

In a paper to be presented at this year’s International Conference on Computer Vision in Seoul, South Korea, the most prestigious conference in visual AI, experts from Surrey’s Centre for Vision, Speech and Signal Processing (CVSSP) detail how they have developed a unique system called OSNet that has outperformed many popular identification systems already in use.

The CVSSP team has shown that OSNet is able to drill down into information from a variety of spatial scales to help accurately make a re-identification - from the smallest details such as the logo on a t-shirt to other, larger factors such as the type of coat worn by the suspect.
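To give a feel for what "spatial scale" means in a convolutional network, the sketch below computes how large a patch of the input image a feature can see after a number of stacked 3x3 convolutions with stride 1. This is a generic property of convolutional layers, shown here only as an illustration; the function name and the layer counts are assumptions for the example, not a description of OSNet's actual design.

```python
def receptive_field(num_layers, kernel_size=3):
    """Width in pixels of the input patch seen after stacking stride-1 convolutions."""
    rf = 1
    for _ in range(num_layers):
        # Each stride-1 layer widens the visible patch by (kernel_size - 1) pixels.
        rf += kernel_size - 1
    return rf


# Shallow stacks see small details; deeper stacks see larger regions.
scales = {n: receptive_field(n) for n in range(1, 5)}
print(scales)
```

A network that combines features from several such depths can, in principle, attend to a small logo and a whole coat at the same time.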

Incredibly, OSNet needs only 2.2 million parameters, a very small number in the context of deep neural network models, to outperform many of its competitors built on the popular ResNet50 architecture, which uses 24 million parameters. This suggests that OSNet could become the standard in visual recognition technology. Such a small parameter count means that the model can be deployed ‘on the edge’: the heavy computational lifting can be carried out on the camera itself rather than in a remote data centre, saving the bandwidth otherwise needed to transmit large quantities of video data from cameras to data servers.
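The parameter counts quoted above translate into storage roughly as follows. This back-of-the-envelope sketch assumes each parameter is stored as a 32-bit (4-byte) floating-point number, which is a common default but is not stated in the source; it counts the weights only and ignores the memory needed for intermediate results while the model runs.

```python
BYTES_PER_FLOAT32 = 4  # assumption: weights stored as 32-bit floats


def model_size_mb(num_params, bytes_per_param=BYTES_PER_FLOAT32):
    """Approximate weight storage in megabytes (1 MB = 1,000,000 bytes)."""
    return num_params * bytes_per_param / 1e6


osnet_params = 2_200_000      # figure quoted in the article
resnet50_params = 24_000_000  # figure quoted in the article

osnet_mb = model_size_mb(osnet_params)
resnet50_mb = model_size_mb(resnet50_params)
ratio = resnet50_params / osnet_params

print(f"OSNet weights:    ~{osnet_mb:.1f} MB")
print(f"ResNet50 weights: ~{resnet50_mb:.1f} MB")
print(f"ResNet50 has ~{ratio:.1f}x as many parameters")
```

Under that assumption the OSNet weights occupy under 10 MB, around a tenth of the ResNet50 figure, which is part of what makes running the model on a camera plausible.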

Tao Xiang, Distinguished Professor of Computer Vision and Machine Learning at CVSSP, said: “With OSNet, we set out to develop a tool that can overcome many of the person re-identification issues that other set-ups face – but the results far exceeded our expectations. The ReID accuracy achieved by OSNet has clearly surpassed that of human operators.

“OSNet not only shows that it’s capable of outperforming its counterparts on many re-identification problems, but the results are such that we believe it could be used as a stand-alone visual recognition technology in its own right.”

Professor Adrian Hilton, Director of CVSSP, said: “This is a considerable achievement of Prof Xiang and his team in achieving world-leading re-identification technology. Their work on OSNet has the potential to be ground-breaking and could help shape the visual recognition field for years to come. This is a great example of AI and Machine Perception for the benefit of society providing enabling technology for safer public spaces.”
