Reaction to "A study shows that an AI-based tool can determine a woman's risk of developing breast cancer in the next four years"

Ignacio Miranda Gómez

Head of the Breast Imaging Unit at the International Breast Cancer Center (IBCC) and at the Teknon Medical Center in Barcelona.

This is a population-based cohort study with a very large training cohort (~397,000 women) and an independent validation cohort (~96,000 women). External validation was subsequently performed on a different population in Sweden (~4,500 women).

The training cohort is enormous, which favours stable estimates, facilitates robust population calibration, and reduces the risk of overfitting. Overfitting occurs when a model memorises the training data too closely, including its anomalies, instead of learning general patterns. An overfitted model works extremely well with the data it already knows, but fails to generalise or make accurate predictions with new data.

All of the above is a great strength compared to many AI studies with small samples.
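The overfitting behaviour described above can be illustrated with a toy example. The sketch below is not the study's model or data: it uses synthetic points with deliberately noisy labels, a hypothetical "memoriser" (a 1-nearest-neighbour rule that copies the label of the closest training point) and a simple threshold rule standing in for a model that has learned the general pattern.

```python
import random


def make_data(n, rng):
    """Synthetic data: label is 1 when x > 0.5, then flipped with 20% probability (noise)."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = 1 if x > 0.5 else 0
        if rng.random() < 0.2:  # label noise, i.e. the "anomalies"
            label = 1 - label
        data.append((x, label))
    return data


def predict_memoriser(train, x):
    """1-nearest-neighbour: return the label of the closest training point."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


def predict_simple(x):
    """General rule that ignores the noise in individual points."""
    return 1 if x > 0.5 else 0


def accuracy(predict, data):
    """Fraction of points whose predicted label matches the recorded label."""
    return sum(predict(x) == y for x, y in data) / len(data)


rng = random.Random(0)  # fixed seed so the run is reproducible
train = make_data(200, rng)
test = make_data(2000, rng)

memoriser_train = accuracy(lambda x: predict_memoriser(train, x), train)
memoriser_test = accuracy(lambda x: predict_memoriser(train, x), test)
simple_test = accuracy(predict_simple, test)

print(f"memoriser on training data: {memoriser_train:.2f}")
print(f"memoriser on new data:      {memoriser_test:.2f}")
print(f"simple rule on new data:    {simple_test:.2f}")
```

The memoriser reproduces its training labels perfectly, noise included, and loses accuracy on new data, whereas the simple rule holds up. A larger training set makes this kind of memorisation harder to mistake for real signal, which is the point made about the cohort size.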

The study starts with an AI detection algorithm (BRAIx AI Reader) and transforms it into a population-calibrated risk score (BRAIx risk score) with two fundamental objectives:

Estimation of cancer risk at the current screening.
Prediction of cancer risk over the following four years.

The study does not use AI solely to detect cancer, but rather converts it into a tool for population risk stratification. This has important practical implications for the implementation of personalised screening, the rational use of higher-cost tests such as contrast mammography or magnetic resonance imaging, and a potential reduction in overdiagnosis.

I believe that the press release conveys the key points of the study. One clarification is needed: when a risk score of 2% is compared with the risk of women carrying BRCA1/2 mutations, the comparison refers specifically to the risk within those four years, considered on its own. It does not necessarily equate to the lifetime or cumulative risk of carriers of genetic mutations.
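The difference between a risk over a fixed window and a cumulative risk can be shown with simple arithmetic. The sketch below assumes, unrealistically and only for illustration, that the same independent risk applies in every consecutive four-year window; the function name and the ten-window horizon are hypothetical and do not come from the study.

```python
def cumulative_risk(interval_risk: float, n_intervals: int) -> float:
    """Probability of at least one event across n consecutive intervals.

    Assumes the same, independent risk in every interval, which is a
    simplification: real risk changes with age and other factors.
    """
    if not 0.0 <= interval_risk <= 1.0:
        raise ValueError("interval_risk must be between 0 and 1")
    if n_intervals < 0:
        raise ValueError("n_intervals must be non-negative")
    return 1.0 - (1.0 - interval_risk) ** n_intervals


four_year_risk = 0.02  # the 2% four-year score mentioned in the text

one_window = cumulative_risk(four_year_risk, 1)    # a single four-year window
ten_windows = cumulative_risk(four_year_risk, 10)  # ten windows under the constant-risk assumption

print(f"risk over one four-year window:  {one_window:.3f}")
print(f"risk over ten four-year windows: {ten_windows:.3f}")
```

Under this simplification the same 2% figure accumulates to roughly 18% over ten windows, which is why a four-year risk and a lifetime or cumulative risk cannot be compared directly.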

The study is of good methodological quality. It uses a very large training cohort (~397,000 women), an independent validation cohort in Australia (~96,000) and external validation in Sweden (~4,500), which reduces the risk of overfitting and improves generalisation. The statistical analyses are rigorous (univariate and multivariate models, AUC, odds ratios, threshold analyses, and evaluation of confounding factors). The conclusions (that the BRAIx score accurately predicts current cancer and four-year risk and outperforms traditional factors such as breast density or family history) are supported by statistically significant and consistent data in both the Australian and Swedish cohorts.
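For readers unfamiliar with two of the metrics mentioned above, the following sketch shows how an AUC and an odds ratio are computed. The scores and counts are invented toy values, not data from the study, and the function names are illustrative.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case has a higher score
    than a randomly chosen control (ties count as half).
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds of disease in the exposed group divided by odds in the unexposed group."""
    return (exposed_cases * unexposed_controls) / (exposed_controls * unexposed_cases)


# Toy risk scores: women who developed cancer (cases) and women who did not (controls).
toy_cases = [0.9, 0.8, 0.4]
toy_controls = [0.7, 0.3, 0.2, 0.1]
toy_auc = auc(toy_cases, toy_controls)

# Toy 2x2 table: "exposed" could mean, for example, a score above some threshold.
toy_or = odds_ratio(exposed_cases=30, exposed_controls=70,
                    unexposed_cases=10, unexposed_controls=90)

print(f"AUC: {toy_auc:.3f}")
print(f"Odds ratio: {toy_or:.2f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation; an odds ratio above 1 indicates higher odds of disease in the exposed group.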

This study represents an advance over previous evidence. Previous research had explored AI algorithms for detecting cancer in images, but had limitations such as combining current detection with future prediction without adjusting for factors such as age, family history, or breast density. BRAIx not only detects cancer, but also transforms the score of a detection algorithm into a medium-term risk score (up to four years), something that previous studies had not demonstrated so convincingly. Furthermore, it outperforms other predictors such as polygenic scores and breast density classification in terms of predictive ability.

The authors considered several confounding factors:

They adjusted their models for multiple variables such as age, breast density, family history, and country of birth.
They analysed the independent effect of each factor using multivariate models and evaluated changes in the coefficients to inform causal inference (that is, to assess the cause-effect relationships between variables).
They verified that the BRAIx score absorbs much of the predictive information from traditional factors such as breast density or family history.

Limitations:

Follow-up limited to four years: it is not known how the score behaves over longer periods.
Possible bias due to loss to follow-up: women who do not return to the programme or who are diagnosed outside the system could bias the results.
Generalisation: although there is external validation in Sweden, more diverse populations (other countries, ethnicities, health systems) need to be studied.
No assessment of real clinical impact: the study does not measure whether using this score reduces mortality, improves medical decisions or changes biopsy or intervention rates.

The potential implications are very relevant, although not yet immediate.

The BRAIx score could allow for personalised screening instead of the current one-size-fits-all approach:

High-risk women could receive stricter follow-up or additional tests (MRI, contrast mammography).
Low-risk women could space out their check-ups.

This could translate into:

Better early detection.
Fewer false positives and false negatives.
Optimisation of healthcare resources.
Potential cost savings.
Less anxiety for patients.

However, before implementing it in general clinical practice, prospective studies are needed to assess the real impact on relevant outcomes (such as mortality, quality of life and cost-effectiveness).
