Dr Guoyang Xie
Publications
The problem of how to assess cross-modality medical image synthesis has been largely unexplored. The most used measures like PSNR and SSIM focus on analyzing the structural features but neglect the crucial lesion location and fundamental k-space speciality of medical images. To overcome this problem, we propose a new metric K-CROSS to spur progress on this challenging problem. Specifically, K-CROSS uses a pre-trained multi-modality segmentation network to predict the lesion location, together with a tumor encoder for representing features, such as texture details and brightness intensities. To further reflect the frequency-specific information from the magnetic resonance imaging principles, both k-space features and vision features are obtained and employed in our comprehensive encoders with a frequency reconstruction penalty. The structure-shared encoders are designed and constrained with a similarity loss to capture the intrinsic common structural information for both modalities. As a consequence, the features learned from lesion regions, k-space, and anatomical structures are all captured, which serve as our quality evaluators. We evaluate the performance by constructing a large-scale cross-modality neuroimaging perceptual similarity (NIRPS) dataset with 6,000 radiologist judgments. Extensive experiments demonstrate that the proposed method outperforms other metrics, especially in comparison with the radiologists on NIRPS.
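The claim that PSNR neglects lesion location can be illustrated with a small sketch. It is not part of K-CROSS; the 4x4 "scan", the intensity values, and the helper names `psnr` and `with_error` are hypothetical and chosen only for illustration. The same intensity error receives the same PSNR whether it falls inside the bright "lesion" or in the background.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio between two equally sized 2D images (lists of rows)."""
    squared_errors = [
        (r - t) ** 2
        for ref_row, test_row in zip(reference, test)
        for r, t in zip(ref_row, test_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def with_error(image, row, col, delta):
    """Return a copy of the image with one pixel shifted by delta."""
    copy = [list(r) for r in image]
    copy[row][col] += delta
    return copy


# Hypothetical scan: background intensity 50, a 2x2 "lesion" of intensity 200.
reference = [
    [50, 50, 50, 50],
    [50, 200, 200, 50],
    [50, 200, 200, 50],
    [50, 50, 50, 50],
]

error_in_lesion = with_error(reference, 1, 1, -40)      # error inside the lesion
error_in_background = with_error(reference, 0, 0, -40)  # same error, background pixel

psnr_lesion = psnr(reference, error_in_lesion)
psnr_background = psnr(reference, error_in_background)
print(psnr_lesion == psnr_background)
```

Because PSNR depends only on the mean squared error over all pixels, the two corrupted images are scored identically, which is the kind of location blindness a lesion-aware metric is meant to address.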
The widely employed tiny neural networks (TNNs) in mobile devices are vulnerable to adversarial attacks. However, more advanced research on the robustness of TNNs is highly in demand. This work focuses on improving the robustness of TNNs without sacrificing the model’s accuracy. To find the optimal trade-off networks in terms of adversarial accuracy, clean accuracy, and model size, we present TAM-NAS, a tiny adversarial multi-objective one-shot network architecture search method. First, we build a novel search space composed of new tiny blocks and channels to establish a balance between model size and adversarial performance. Then, we demonstrate how the supernet facilitates the acquisition of the optimal subnet under white-box adversarial attacks, given that the supernet significantly impacts the subnet’s performance. Concretely, we investigate a new adversarial training paradigm by evaluating the adversarial transferability, the width of the supernet, and the distinction between training subnets from scratch and fine-tuning. Finally, we undertake a statistical analysis of the layer-wise combination of specific blocks and channels on the first non-dominated front, which can be used as a guideline for designing TNNs.
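The "first non-dominated front" mentioned above can be sketched as a plain Pareto filter. This is a minimal illustration, not the TAM-NAS search procedure; the subnet names and objective values are hypothetical, and all three objectives (adversarial error, clean error, model size) are treated as quantities to minimise.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def first_front(candidates):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, objectives in candidates.items()
        if not any(
            dominates(other, objectives)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical subnets: (adversarial error, clean error, model size in M parameters).
candidates = {
    "subnet_a": (0.55, 0.10, 1.2),
    "subnet_b": (0.50, 0.12, 1.5),
    "subnet_c": (0.60, 0.12, 1.6),  # worse than subnet_a in every objective
    "subnet_d": (0.70, 0.08, 0.8),
}

front = first_front(candidates)
print(front)
```

Here `subnet_c` is dropped because `subnet_a` is better on all three objectives, while the remaining subnets each win on at least one objective and therefore stay on the front.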
•FedMed-GAN, to the best of our knowledge, is the first work to establish a new benchmark for federated cross-modality brain image synthesis, which greatly facilitates the development of medical GAN with differential privacy guarantees.
•We provide comprehensive explanations for treating mode collapse and performance drop compared to centralized training.
•The proposed work simulates as many proportions of unpaired and paired data as possible for each client, with various data distributions for all clients. The performance of FedMed-GAN remains stable when facing long-tail data distributions.
Utilizing multi-modal neuroimaging data is proven to be effective in investigating human cognitive activities and certain pathologies. However, it is not practical to obtain the full set of paired neuroimaging data centrally since the collection faces several constraints, e.g., high examination cost, long acquisition time, and image corruption. In addition, these data are dispersed across different medical institutions and thus cannot be aggregated for centralized training because of privacy issues. There is a clear need to launch federated learning and facilitate the integration of dispersed data from different institutions. In this paper, we propose a new benchmark for federated domain translation on unsupervised brain image synthesis (FedMed-GAN) to bridge the gap between federated learning and medical GAN. FedMed-GAN mitigates mode collapse without sacrificing the performance of generators, and is applicable to different proportions of unpaired and paired data with variation adaptation properties. We treat the gradient penalties using the federated averaging algorithm and then leverage differentially private gradient descent to regularize the training dynamics. A comprehensive evaluation is provided for comparing FedMed-GAN and other centralized methods, demonstrating that the proposed algorithm outperforms the state-of-the-art.
Our code is available at: https://github.com/M-3LAB/FedMed-GAN.
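The two generic building blocks named in the FedMed-GAN abstract, federated averaging and differentially private gradient descent, can be sketched as follows. This is not taken from the FedMed-GAN repository: the function names, the flat-list parameter representation, and the noise scale are assumptions, and the noise is not calibrated to any privacy budget.

```python
import math
import random


def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighting each client by its sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(weights[i] * size for weights, size in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


def clip_and_noise(gradient, clip_norm, noise_std, rng):
    """Clip a gradient to an L2 norm bound, then add Gaussian noise (DP-SGD style step)."""
    norm = math.sqrt(sum(g * g for g in gradient))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale + rng.gauss(0.0, noise_std) for g in gradient]


# Two hypothetical clients holding 1 and 3 samples respectively.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
print(global_weights)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
private_gradient = clip_and_noise([3.0, 4.0], clip_norm=1.0, noise_std=0.1, rng=rng)
print(private_gradient)
```

Weighting by sample count lets larger clients contribute more to the global model, while clipping bounds how much any single gradient can influence an update before noise is added.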
Multi-modality imaging improves disease diagnosis and reveals distinct deviations in tissues with anatomical properties. The existence of completely aligned and paired multi-modality neuroimaging data has proved its effectiveness in brain research. However, collecting fully aligned and paired data is expensive or even impractical, since it faces many difficulties, including high cost, long acquisition time, image corruption, and privacy issues. An alternative solution is to explore unsupervised or weakly supervised learning methods to synthesize the absent neuroimaging data. In this paper, we provide a comprehensive review of cross-modality synthesis for neuroimages, from the perspectives of weakly supervised and unsupervised settings, loss functions, evaluation metrics, imaging modalities, datasets, and downstream applications based on synthesis. We begin by highlighting several open challenges for cross-modality neuroimage synthesis. Then, we discuss representative architectures of cross-modality synthesis methods under different levels of supervision. This is followed by a stepwise in-depth analysis to evaluate how cross-modality neuroimage synthesis improves the performance of its downstream tasks. Finally, we summarize the existing research findings and point out future research directions. All resources are available at https://github.com/M-3LAB/awesome-multimodal-brain-image-systhesis.
Data augmentation is a promising technique for unsupervised anomaly detection in industrial applications, where the availability of positive samples is often limited due to factors such as commercial competition and sample collection difficulties. In this paper, we study how to effectively select and apply data augmentation methods for unsupervised anomaly detection. The impact of various data augmentation methods on different anomaly detection algorithms is systematically investigated through experiments. The experimental results show that the performance of different industrial image anomaly detection (IAD) algorithms is not significantly affected by the specific data augmentation method employed, and that combining multiple data augmentation methods does not necessarily yield further improvements in the accuracy of anomaly detection, although it can achieve excellent results on specific methods. These findings provide useful guidance on selecting appropriate data augmentation methods for different requirements in IAD.
Image anomaly detection (IAD) is an emerging and vital computer vision task in industrial manufacturing (IM). Many advanced algorithms have been published recently, but their performance varies greatly. We realize that the lack of realistic IM settings most probably hinders the development and usage of these methods in real-world applications. As far as we know, IAD methods have not been evaluated systematically, which makes it difficult for researchers to analyze them because they are designed for different or special cases. To solve this problem, we first propose a uniform IM setting to assess how well these algorithms perform, which covers several aspects, i.e., various levels of supervision (unsupervised vs. semi-supervised), few-shot learning, continual learning, noisy labels, memory usage, and inference speed. Moreover, we build a comprehensive image anomaly detection benchmark (IM-IAD) that includes 16 algorithms on 7 mainstream datasets with uniform settings. Our extensive experiments (17,017 in total) provide in-depth insights for IAD algorithm redesign or selection under the IM setting. Finally, the proposed benchmark IM-IAD highlights challenges as well as directions for future work. To foster reproducibility and accessibility, the source code of IM-IAD is available at https://github.com/M-3LAB/IM-IAD.
In the area of few-shot anomaly detection (FSAD), efficient visual features play an essential role in methods based on a memory bank M. However, these methods do not account for the relationship between a visual feature and its rotated counterpart, which drastically limits anomaly detection performance. To push the limits, we reveal that the rotation-invariant feature property has a significant impact on industrial FSAD. Specifically, we utilize graph representation in FSAD and provide a novel visual isometric invariant feature (VIIF) as the anomaly measurement feature. As a result, VIIF can robustly improve the anomaly discriminating ability and can further reduce the size of redundant features stored in M by a large amount. Besides, we provide a novel model, GraphCore, built on VIIFs, that can quickly implement unsupervised FSAD training and improve the performance of anomaly detection. A comprehensive evaluation is provided for comparing GraphCore and other SOTA anomaly detection models under our proposed few-shot anomaly detection setting, which shows that GraphCore can increase average AUC by 5.8%, 4.1%, 3.4%, and 1.6% on MVTec AD and by 25.5%, 22.0%, 16.9%, and 14.1% on MPDD for 1, 2, 4, and 8-shot cases, respectively.
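The general idea behind memory-bank-based anomaly detection can be sketched as nearest-neighbour scoring. This is a generic illustration, not GraphCore or VIIF: the 2D feature vectors, the bank contents, and the function name `anomaly_score` are hypothetical. A test feature is scored by its distance to the closest feature stored in the memory bank M, so features unlike anything seen in normal samples receive high scores.

```python
import math


def anomaly_score(feature, memory_bank):
    """Distance from a test feature to its nearest neighbour in the memory bank."""
    return min(math.dist(feature, stored) for stored in memory_bank)


# Hypothetical memory bank M built from features of normal (defect-free) samples.
memory_bank = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

normal_score = anomaly_score((0.1, 0.0), memory_bank)     # close to a stored feature
anomalous_score = anomaly_score((3.0, 4.0), memory_bank)  # far from every stored feature

print(normal_score < anomalous_score)
```

In a scheme like this, the size of the bank directly drives memory use and lookup cost, which is why reducing redundant stored features matters in the few-shot setting.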
The recent rapid development of deep learning has marked a milestone in industrial image anomaly detection (IAD). In this paper, we provide a comprehensive review of deep learning-based image anomaly detection techniques, from the perspectives of neural network architectures, levels of supervision, loss functions, metrics, and datasets. In addition, we extract the promising setting from industrial manufacturing and review the current IAD approaches under our proposed setting. Moreover, we highlight several open challenges for image anomaly detection. The merits and downsides of representative network architectures under varying supervision are discussed. Finally, we summarize the research findings and point out future research directions. More resources are available at https://github.com/M-3LAB/awesome-industrial-anomaly-detection.
The recent rapid development of deep learning has marked a milestone in industrial image anomaly detection (IAD). In this paper, we provide a comprehensive review of deep learning-based image anomaly detection techniques, from the perspectives of neural network architectures, levels of supervision, loss functions, metrics, and datasets. In addition, we extract a new setting from industrial manufacturing and review the current IAD approaches under our proposed new setting. Moreover, we highlight several open challenges for image anomaly detection. The merits and downsides of representative network architectures under varying supervision are discussed. Finally, we summarize the research findings and point out future research directions. More resources are available at https://github.com/M-3LAB/awesome-industrial-anomaly-detection.