Towards safe and reliable deep learning for lung nodule malignancy estimation using out-of-distribution detection

D. Peeters, K. Venkadesh, R. Dinnessen, Z. Saghir, E. Scholten, R. Vliegenthart, M. Prokop and C. Jacobs

Computers in Biology and Medicine 2025;186:109633.

Artificial intelligence (AI) models may fail or suffer from reduced performance when applied to unseen data that differs from the training data distribution, a phenomenon referred to as dataset shift. Automatic detection of out-of-distribution (OOD) data contributes to safe and reliable clinical implementation of AI models. In this study, we propose using a recognized OOD detection method based on the Mahalanobis distance (MD) and compare its performance to widely known classical methods. The MD measures how far the features of an unseen sample lie from the distribution of development-sample features extracted from intermediate model layers. We integrate our proposed method into an existing deep learning (DL) model for lung nodule malignancy risk estimation on chest CT and validate it across four dataset shifts known to reduce AI model performance. The results show that our proposed method outperforms the classical methods and can effectively detect near- and far-OOD samples across all datasets with different data distribution shifts. Additionally, we demonstrate that our proposed method can seamlessly incorporate additional in-distribution (ID) data while maintaining the ability to accurately differentiate the remaining OOD cases. Lastly, we searched the OOD dataset for the optimal OOD threshold up to which the performance of the DL model stays reliable; however, no decline in DL performance was observed as the OOD score increased.
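
To make the MD-based scoring concrete, the following is a minimal sketch in Python with NumPy, not the implementation used in the study. It assumes a single Gaussian fitted to feature vectors from one layer, uses synthetic random features in place of real intermediate-layer activations, and picks the 95th percentile of ID scores as an illustrative threshold. The function names `fit_gaussian` and `mahalanobis_score`, the ridge term, and the percentile are all hypothetical choices made for this example.

```python
import numpy as np


def fit_gaussian(features):
    """Estimate the mean and inverse covariance of ID feature vectors.

    features: array of shape (n_samples, n_features).
    """
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    # Assumption: a small ridge term keeps the covariance invertible.
    cov = cov + 1e-6 * np.eye(cov.shape[0])
    return mean, np.linalg.inv(cov)


def mahalanobis_score(x, mean, cov_inv):
    """Return the Mahalanobis distance of each row of x to the fitted Gaussian."""
    diff = np.atleast_2d(x) - mean
    # Row-wise quadratic form diff^T * cov_inv * diff.
    squared = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(squared)


# Synthetic stand-in for development (ID) features; real features would
# come from intermediate layers of the trained model.
rng = np.random.default_rng(0)
id_features = rng.normal(0.0, 1.0, size=(500, 8))

mean, cov_inv = fit_gaussian(id_features)
id_scores = mahalanobis_score(id_features, mean, cov_inv)

# Illustrative threshold: flag samples scoring above 95% of ID samples.
threshold = np.percentile(id_scores, 95)

# A sample far from the ID distribution in every feature dimension.
far_sample = np.full((1, 8), 6.0)
far_score = mahalanobis_score(far_sample, mean, cov_inv)
is_ood = far_score > threshold
```

In this sketch a higher score means the sample lies further from the development distribution. Incorporating additional ID data would amount to refitting the mean and covariance on the enlarged feature set.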