Accommodating population differences in the validation of risk prediction models
Seminar by Ruth Pfeiffer
Validation of risk prediction models in independent data provides a rigorous assessment of model performance. However, several differences between the populations that gave rise to the training and the validation data can lead to seemingly poor performance of a risk model. We formalize notions of “similarity” between the training and validation data and define reproducibility and transportability. We address the impact of differences in predictor distributions and in outcome verification on model calibration, accuracy, and discrimination.
When individual level data from both the training and validation data sets are available, we propose and study weighted versions of the validation metrics that adjust for differences in the predictor distributions and in outcome verification to provide a more comprehensive assessment of model performance.
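To make the idea of weighted validation metrics concrete, here is a stylized sketch (not the speaker's actual estimators): given outcomes, predicted risks, and weights in the validation data, one can compute a weighted expected/observed (E/O) calibration ratio and a weighted Brier score. The weights would in practice be estimated density ratios of the predictors (training vs. validation population); here they are simply taken as given, and all data are synthetic.

```python
import random

def weighted_calibration(y, p, w):
    """Weighted expected/observed (E/O) ratio and weighted Brier score.

    y : 0/1 outcomes in the validation data
    p : model-predicted risks
    w : weights, e.g. estimated density ratios of the predictors
        (training vs. validation population) -- assumed given here
    """
    sw = sum(w)
    expected = sum(wi * pi for wi, pi in zip(w, p)) / sw
    observed = sum(wi * yi for wi, yi in zip(w, y)) / sw
    brier = sum(wi * (yi - pi) ** 2 for wi, yi, pi in zip(w, y, p)) / sw
    return expected / observed, brier

# Stylized validation sample: outcomes drawn from the predicted risks,
# so the model is well calibrated and E/O should be close to 1.
random.seed(7)
p = [random.uniform(0.05, 0.4) for _ in range(1000)]
y = [1 if random.random() < pi else 0 for pi in p]
w = [1.0] * len(p)  # unit weights reproduce the unweighted metrics
eo, brier = weighted_calibration(y, p, w)
```

With non-unit weights, the same function re-expresses calibration as it would look under the training population's predictor distribution, which is the intuition behind adjusting validation metrics for covariate shift.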
We give conditions on the model and the training and validation populations that ensure a model's reproducibility or transportability and show how to check them. We discuss approaches to recalibrate a model. As an illustration, we develop and validate a prostate cancer risk model using data from two large North American prostate cancer prevention trials, the SELECT and PLCO trials.
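One standard recalibration approach (a hypothetical illustration, not necessarily the method discussed in the talk) is logistic recalibration: regress the validation outcomes on the logit of the predicted risks, so the fitted intercept measures calibration-in-the-large and the slope measures over- or under-dispersion of the risks in the new population. A minimal pure-Python Newton-Raphson fit on simulated data:

```python
import math
import random

def logit(p):
    return math.log(p / (1 - p))

def expit(z):
    return 1 / (1 + math.exp(-z))

def recalibrate(y, p, iters=25):
    """Fit y ~ expit(a + b * logit(p)) by Newton-Raphson.

    a != 0 flags a calibration-in-the-large problem; b != 1 flags
    risks that are too extreme (b < 1) or too shrunken (b > 1)
    in the validation population.
    """
    x = [logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        # Gradient and Hessian of the Bernoulli log-likelihood.
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            mu = expit(a + b * xi)
            r = yi - mu
            v = mu * (1 - mu)
            g0 += r
            g1 += r * xi
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (-h01 * g0 + h00 * g1) / det
    return a, b

# Simulated validation data where the model overstates risk:
# true risks equal expit(-0.4 + logit(p_model)).
random.seed(11)
p_model = [random.uniform(0.05, 0.5) for _ in range(4000)]
y = [1 if random.random() < expit(-0.4 + logit(pi)) else 0
     for pi in p_model]
a_hat, b_hat = recalibrate(y, p_model)
```

The fitted intercept should recover the simulated offset of about -0.4 with a slope near 1; applying `expit(a_hat + b_hat * logit(p))` then yields recalibrated risks for the new population.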
Speaker
Ruth Pfeiffer, Ph.D., Biostatistics Branch, National Cancer Institute, NIH, HHS, Bethesda, MD 20892-7244, USA
Host
Sander Roberti