
Anna Worthy


Publications

Anna Worthy, Alistair Mackenzie, Nadia Smith, Caroline Shenton-Taylor (2024) Creation of simulated mammography data to supplement machine learning training datasets

Artificial intelligence has proven useful in the diagnosis of breast cancer from screening mammograms. This paper reports a computational method for simulating training data compatible with OMI-DB, one of the largest databases of mammography images for medical research worldwide. Mammography systems differ in their inherent image quality, which affects the noise and sharpness properties of an image. The simulation alters an image so that it appears to have been taken on a different detector, at a different dose or with a different image quality. Building on previous work, Python code has been developed to isolate the electronic, quantum and structural noise coefficients associated with digital mammography detectors. A fit between noise power spectra and air kerma is used to find the noise coefficients, which are then combined with a random phase contribution to create noise images. These noise images are combined with flat-field signals to form simulated images. To simulate an image at one dose from an image acquired at another, a dose factor is introduced to scale the noise contributions. Simulating a mammogram so that it appears to have been taken under different conditions allows a more general training dataset to be created with minimal loss of biological information and without the ethical concerns of taking multiple images of a breast. A tailored dataset could be generated to facilitate an assessment of the performance of artificial intelligence tools for breast cancer detection or breast density calculations.
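The noise steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes a commonly used quadratic model, NPS(K) = e + q·K + s·K², for the noise power spectrum at air kerma K, with e, q and s standing for the electronic, quantum and structural coefficients. It also assumes linearised pixel values, and it obtains the random phase by filtering white Gaussian noise rather than by whatever method the paper uses. All function names, parameters and numbers below are hypothetical.

```python
import numpy as np


def fit_noise_coefficients(air_kerma, nps_values):
    """Fit the assumed model NPS(K) = e + q*K + s*K**2 at one spatial frequency."""
    # np.polyfit returns coefficients from the highest power down: [s, q, e].
    s, q, e = np.polyfit(air_kerma, nps_values, 2)
    return e, q, s


def added_noise_nps(e, q, s, air_kerma, dose_factor):
    """NPS of the noise to add after scaling an image's signal by dose_factor.

    Scaling the image multiplies its existing NPS by dose_factor**2. The noise
    to add is the model NPS at the new air kerma minus that scaled NPS. The
    structural term cancels, so s is unused (kept for a uniform signature).
    Only meaningful for dose_factor <= 1, where the result is non-negative.
    """
    f = dose_factor
    return e * (1.0 - f**2) + q * air_kerma * f * (1.0 - f)


def noise_image_from_nps(nps_2d, pixel_pitch, rng):
    """Create a real noise image whose power spectrum follows nps_2d.

    nps_2d is laid out in FFT order (as given by np.fft.fftfreq) and must be
    non-negative and symmetric about zero frequency.
    """
    ny, nx = nps_2d.shape
    # The Fourier transform of white Gaussian noise has uniformly random phase.
    white = rng.standard_normal((ny, nx))
    spectrum = np.fft.fft2(white)
    # Unit-variance white noise has NPS = pixel_pitch**2 under the convention
    # NPS = (dx*dy / (nx*ny)) * |FFT|**2, so rescale amplitudes to the target.
    gain = np.sqrt(nps_2d / pixel_pitch**2)
    return np.fft.ifft2(spectrum * gain).real


# Illustrative usage with synthetic numbers (not measured detector data).
rng = np.random.default_rng(0)
kerma = np.array([10.0, 20.0, 40.0, 80.0, 160.0])
synthetic_nps = 2.0 + 0.5 * kerma + 0.001 * kerma**2
e, q, s = fit_noise_coefficients(kerma, synthetic_nps)

# Simulate a half-dose flat-field image from one acquired at air kerma 80.
extra = added_noise_nps(e, q, s, air_kerma=80.0, dose_factor=0.5)
flat_nps = np.full((64, 64), extra)  # frequency-independent for simplicity
flat_field = np.full((64, 64), 1000.0)
simulated = 0.5 * flat_field + noise_image_from_nps(flat_nps, 0.1, rng)
```

In practice the fit would be repeated at each spatial frequency, so that e, q and s become frequency-dependent arrays rather than single numbers. The flat spectrum used here only keeps the example short.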