Hidden bias in cancer datasets may skew AI diagnoses, says Engineering prof

While artificial intelligence (AI) tools have the potential to drive innovation in medical image processing, Brock University researcher Shahryar Rahnamayan says biases in machine learning models can compromise the reliability of AI-driven diagnostic tools.

“Failure to address bias in medical AI undermines clinical trust and can hinder adoption of potentially life-saving technologies,” he says.

The Professor and Chair of Engineering at Brock is part of a four-person research team that examined potential biases in The Cancer Genome Atlas (TCGA), a U.S.-based digital repository used for the training and validation of complex machine learning models. Their findings were published last month in Scientific Reports.

AI bias occurs when a model’s training data or algorithm contains systematic distortions, leading to inaccurate or even harmful results.

Sampling bias, for example, happens when the training data doesn’t fully reflect the populations the model will serve. Batch effect bias, meanwhile, arises from variations in how health-care institutions collect and process samples, including the use of different materials, equipment and techniques.

Rahnamayan and his colleagues found that AI models being trained with TCGA data could be biased because of how tissue sample images in the U.S.-based cancer repository are created.

Collected and submitted by hospitals and other medical institutions, these images of cancer cells support accurate disease diagnosis and prognosis.

When gathering biopsy samples to be digitized, hospitals may follow fairly standardized protocols, but slight variations in materials, equipment and processes, such as different settings or types of dyes used to stain samples, leave traces embedded in the resulting images.

That information creates a “signature” that is invisible to the human eye but is picked up by AI models, says Rahnamayan.

Rather than only learning the structural features of cancer, he says, models could also unintentionally learn to recognize characteristics associated with the hospitals that provided the sample images.

“This is problematic because the model learns features unrelated to the cancer itself, leading to reduced accuracy when evaluated on external datasets,” he says. “It’s like a doctor saying, ‘I can recognize cancer in St. Michael’s Hospital, but I’m bad at doing that in General Hospital.’”

Machine learning models are often trained on data from a limited number of hospitals yet are expected to perform reliably across thousands of medical institutions, so Rahnamayan says it is critical to ensure strong generalization.
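A standard way to check this kind of generalization, not drawn from the study itself, is to hold out entire hospitals during evaluation so a model cannot lean on site-specific signatures at test time. The short Python sketch below illustrates the idea; the feature, label and hospital arrays are hypothetical placeholders.

```python
# Minimal sketch (not from the paper): checking whether a model generalizes
# across hospitals by splitting data on the submitting site rather than randomly.
# "features", "labels" and "hospital_ids" are hypothetical placeholder arrays.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
features = rng.normal(size=(600, 32))          # stand-in for image-derived features
labels = rng.integers(0, 2, size=600)          # stand-in for tumour vs. normal labels
hospital_ids = rng.integers(0, 6, size=600)    # which institution supplied each sample

# Each fold trains on some hospitals and tests on ones the model has never seen,
# so site-specific "signatures" cannot help it at test time.
scores = []
for train_idx, test_idx in GroupKFold(n_splits=6).split(features, labels, groups=hospital_ids):
    model = RandomForestClassifier(random_state=0).fit(features[train_idx], labels[train_idx])
    scores.append(accuracy_score(labels[test_idx], model.predict(features[test_idx])))

print(f"Mean held-out-hospital accuracy: {np.mean(scores):.3f}")
```

A large gap between randomly split accuracy and held-out-hospital accuracy is one sign that a model has latched onto institutional signatures rather than the disease itself.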

The research team, which also included scholars from Ontario Tech University and Wilfrid Laurier University, hypothesized that the digitization process might play a significant role in the signature patterns learned by AI models.

To test this hypothesis, they developed a mathematical framework that converts colour images into greyscale, enabling the team to analyze how colour influences the models’ behaviour.
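The sketch below illustrates the general idea of such a conversion, not the team’s actual framework: a colour tile is collapsed to a single luminance channel so stain-colour differences between institutions are discarded before a model ever sees the image. The synthetic tile and the luminance weighting shown are assumptions made for illustration.

```python
# Minimal sketch (an assumption, not the authors' exact framework): collapsing
# colour histology tiles to greyscale so a model cannot rely on stain-colour
# differences between hospitals.
import numpy as np
from PIL import Image

def to_greyscale(img: Image.Image) -> np.ndarray:
    """Collapse an RGB tile to a single luminance channel scaled to [0, 1]."""
    rgb = np.asarray(img.convert("RGB"), dtype=np.float32)
    # Standard luminance weighting; colour information (e.g. staining dyes) is discarded.
    grey = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    return grey / 255.0

# Stand-in for a real digitized slide tile.
fake_tile = Image.fromarray(
    np.random.default_rng(0).integers(150, 255, size=(64, 64, 3), dtype=np.uint8)
)
grey_tile = to_greyscale(fake_tile)
print(grey_tile.shape, grey_tile.min(), grey_tile.max())
```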

“This study demonstrates that greyscale images carry less bias than color images, leading to improved generalization of the deep neural network model,” says Rahnamayan. “In our future research, we aim to identify and mitigate various biases in medical applications to enhance the accuracy and generalization of deep neural networks.”

