Uncertainty Quantification in Medicine Science: The Next Big Step

Hammouri, Ziad Akram Ali; Mier, Pablo Rodríguez; Félix, Paulo; Mansournia, Mohammad Ali; Huelin, Fernando; Casals, Martí; Matabuena, Marcos

doi:10.1016/j.arbres.2023.07.018

Archivos de Bronconeumología

ISSN: 0300-2896

Archivos de Bronconeumologia is an international journal that publishes original studies whose content is based upon results of research initiatives dealing with several aspects of respiratory medicine including epidemiology, respiratory physiology, pathophysiology of respiratory diseases, clinical management, thoracic surgery, pediatric lung diseases, respiratory critical care, respiratory allergy and translational research. Other types of articles such as editorials, reviews, and different types of letters are also published in the journal. Additionally, the journal expresses the voice of the following scientific societies: the Spanish Respiratory Society of Pneumology and Thoracic Surgery (SEPAR; https://www.separ.es/), the Latin American Thoracic Society (ALAT; https://alatorax.org/), and the Iberian American Association of Thoracic Surgery (AIACT; http://www.aiatorax.com/).

It is a monthly journal in which all manuscripts are sent to peer-review and handled by the editor or an associate editor from the team and the final decision is made on the basis of the comments from the expert reviewers and the editors. The journal is published solely in English. All the published data is composed of novel manuscripts not previously published in any other journal and not being in consideration for publication in any other journal..

The journal is indexed at Science Citation Index Expanded, Medline/Pubmed, Embase and SCOPUS. Access to any published article is possible through the journal's web page as well as from Pubmed, ScienceDirect, and other international databases. Furthermore, the journal is also present in X, Facebook and Linkedin. Manuscripts can be submitted electronically using the following web site: https://www.editorialmanager.com/ARBR/.

Indexed in:

Medline, Science Citation Index Expanded (SCIE)

Current technological progress opens the door to measuring ever more biological processes and identifying bio-mechanism patterns at an unprecedented level of resolution. Faced with the ensuing avalanche of information, the promises of digital, precision and personalized medicine are increasingly fulfilled,1 and their success largely depends on the intensive use of data-driven approaches, particularly those stemming from the confluence of statistics and machine learning research. However, nowadays the application of statistics and machine learning in many modeling tasks, from specific risk injury prediction to general physiology,2 remains limited and constitutes a topic of great controversy that has divided experts about their ability to obtain useful results.3–7

A critical point which has gone largely unnoticed in the medicine field is the need to quantify the predictive limits of the models through uncertainty analysis. Practitioners and researchers tend to focus on performing inferential analysis to measure the uncertainty regarding the conditional mean (the expected value for a particular characteristic given that a certain set of conditions is known to occur), which presents important limited practical interpretations as the sample size grows.8 However, they usually ignore the conditional distribution, which is more informative in practice and helps to establish the reliability of the predictive results obtained from different sets of values for model inputs.

In our opinion, it would be advisable to provide, as the output of any predictive model, not only the point estimates, for example, of the conditional mean, but also a prediction band estimated by using the heteroscedastic signal of the predictive model. This prediction band is built as an estimate of the interval in which a future observation will fall, with a certain probability, given what has already been observed. We exemplify it from the results of our previous study on the submaximal prediction of maximal oxygen uptake in a general population.9 Following previous functional regression models,10 this time we quantify uncertainty using the conformal inference framework,11,12 which ensures to construct valid prediction bands with respect to the coverage error.13

Fig. 1 (left) shows the prediction intervals for these individuals. In general, we observe a significant heterogeneity, indicating that the signal noise is heteroscedastic, and the uncertainty is high in the prediction results. This phenomenon happened despite the global predictive ability of the reported models, being reasonable in terms of the coefficient of determination (R2=0.80). In a more specific analysis, we try to predict with generalized additive models (GAM) the factors on which the radius of the prediction intervals depends. Let us recall that GAM is a generalized linear model in which the response variable depends linearly on unknown smooth functions of some predictor variables.14 After testing the linearity assumption, age is shown to exhibit a linear dependence, and with the increase of one year of age, the radius increases by 0.05, meaning that 40-year-old individuals versus 20-year-old individuals increase their radius in one point. In terms of peak oxygen (Fig. 1 (center)), we observe that when peak oxygen increases the uncertainty decreases, and a similar behavior can be observed when the weight increases (Fig. 1 (right)) in the range of 50–70kg. Importantly, the R2 of this model is equal to 0.16. According to this regression model, the uncertainty is heterogeneous and varies according to the level and physical conditions of the individuals. Ultimately, in those individuals with high-uncertainty values, the model's usefulness may be questioned.

Fig. 1.

(Left) Prediction interval of maximum peak oxygen with conformal inference techniques in all subjects. (Center) Additive effect of a GAM concerning oxygen consumption in the prediction of the radius of the prediction interval. (Right) Additive effect of a GAM concerning weight in the prediction of the radius of the prediction interval. Age, maximum peak oxygen and weight are statistically significant variables in the GAM model. Sex is not a statistically significant variable. The R-squared of the model is equal to 0.16. We use a linear link and a gaussian distribution as random error.

On the other hand, a closer inspection of the level of uncertainty in the prediction results can provide a crucial information for clinical management. As we have highlighted in a previous application in modeling five-year glucose changes in the general population, a phenotypically characterization of those subpopulations for which the model provides an unreliable prediction may be used to guide a specific approach, built on new assumptions, measurement procedures and interventions.1,2,9 Particularly, we showed that long-term changes in glycated hemoglobin cannot be adequately predicted for individuals with elevated fast plasma glucose (FPG) levels, and the same holds true for individuals with FPG levels in the normoglycemic range and overweight. For these specific subpopulations we can then suggest a more personalized follow-up.

In conclusion, while the current era of statistics and machine learning holds great promise and excitement, it faces significant challenges in various scientific domains, particularly in the biomedical field, due to the inherent high variability in predictive tasks. However, it is crucial not to succumb to pessimism. We advocate for a stronger emphasis on uncertainty quantification as a crucial driver for improving predictive models.

A comprehensive evaluation of uncertainty levels in predictions and the subsequent characterization of the reliability conditions of these models should serve as a valuable guide to identify additional requirements for measurement, particularly in the context of personalized follow-up where the risk of diseases and uncertainty are both high. Recently, there has been ongoing debate surrounding the communication of uncertainty and its varying impacts on individuals and different formats.15

Ultimately, the next significant stride in predictive models within medicine relies not solely on a simplistic competition based on accuracy measurements,16 but rather on the potential insights that lie within the realm of uncertainty. The COVID-19 pandemic confirms this fact, highlighting the limitations of simplistic utilization of systematic statistical models,17,18 and presenting opportunities for uncertainty quantification in different contexts. For example, in the case of seroprevalence estimations in Spain, a new stochastic conformal simulation model provides a more comprehensive understanding of pandemic control.19 Exciting new knowledge awaits discovery within this uncertain landscape, and the revolution of this statistical modeling is already underway.

Authors’ contributions

MM and MC drafted the article. All authors revised the article in depth and approved the submission to Medical Education.

Ethics approval

No human subjects were included for the purpose of the present work; therefore, no ethics approval was needed.

Funding

None.

Conflict of interests

None declared.

References

[1]

M. Matabuena, F.J. Salgado, J.J. Nieto-Fontarigo, M.J. Álvarez-Puebla, E. Arismendi, P. Barranco, et al.

Identification of asthma phenotypes in the Spanish MEGA cohort study using cluster analysis.

Arch Bronconeumol, 59 (2023), pp. 223-231

http://dx.doi.org/10.1016/j.arbres.2023.01.007 | Medline

[2]

M. Matabuena, P.R. Hayes, L. Puente-Maestu.

Prediction of maximal oxygen uptake from submaximal exercise testing in chronic respiratory patients. New perspectives.

Arch Bronconeumol, 55 (2019), pp. 507-508

http://dx.doi.org/10.1016/j.arbres.2018.12.008 | Medline

[3]

N.F. Bittencourt, W.H. Meeuwisse, L.D. Mendonça, et al.

Complex systems approach for sports injuries: moving from risk factor identification to injury pattern recognition—narrative review and new concept.

Br J Sports Med, 50 (2016), pp. 1309-1314

http://dx.doi.org/10.1136/bjsports-2015-095850 | Medline

[4]

G.S. Bullock, T. Hughes, J.C. Sergeant, et al.

Methods matter: clinical prediction models will benefit sports medicine practice, but only if they are properly developed and validated.

Br J Sports Med, 55 (2021), pp. 1319-1321

http://dx.doi.org/10.1136/bjsports-2021-104329 | Medline

[5]

G.S. Bullock, T. Hughes, A.H. Arundale, et al.

Black box prediction methods in sports medicine deserve a red card for reckless practice: a change of tactics is needed to advance athlete care.

Sports Med, 52 (2022), pp. 1729-1735

http://dx.doi.org/10.1007/s40279-022-01655-6 | Medline

[6]

A. McCall, M. Fanchini, A.J. Coutts.

Prediction: the modern-day sport-science and sports-medicine “quest for the holy grail”.

Int J Sports Physiol Perform, 12 (2017), pp. 704-706

[7]

R.O. Nielsen, I. Shrier, M. Casals, et al.

Statement on methods in sport injury research from the 1st methods matter meeting, Copenhagen, 2019.

Br J Sports Med, 54 (2020), pp. 941

http://dx.doi.org/10.1136/bjsports-2019-101323 | Medline

[8]

N. Altman, M. Krzywinski.

Predicting with confidence and tolerance.

Nat Methods, 15 (2018), pp. 843-845

http://dx.doi.org/10.1038/s41592-018-0196-7 | Medline

[9]

M. Matabuena, P. Félix, C. García-Meixide, et al.

Kernel machine learning methods to handle missing responses with complex predictors. Application in modelling five-year glucose changes using distributional representations.

Comput Methods Programs Biomed, 221 (2022), pp. 106905

http://dx.doi.org/10.1016/j.cmpb.2022.106905 | Medline

[10]

M. Matabuena, J.C. Vidal, P.R. Hayes, et al.

A 6-minute sub-maximal run test to predict VO2 max.

J Sports Sci, 36 (2018), pp. 2531-2536

http://dx.doi.org/10.1080/02640414.2018.1468149 | Medline

[11]

V. Vovk, A. Gammerman, G. Shafer.

Algorithmic learning in a random world.

Springer, (2005),

[12]