Annotation-efficient cancer detection with report-guided lesion annotation for deep learning-based prostate cancer detection in bpMRI

J.S. Bosma, A. Saha, M. Hosseinzadeh, I. Slootweg, M. de Rooij and H. Huisman

arXiv:2112.05151 2021.

DOI arXiv Cited by ~8

Deep learning-based diagnostic performance increases with more annotated data, but large-scale manual annotations are expensive and labour-intensive. Experts evaluate diagnostic images during clinical routine, and write their findings in reports. Leveraging unlabelled exams paired with clinical reports could overcome the manual labelling bottleneck. We hypothesise that detection models can be trained semi-supervised with automatic annotations generated using model predictions, guided by sparse information from clinical reports. To demonstrate efficacy, we train clinically significant prostate cancer (csPCa) segmentation models, where automatic annotations are guided by the number of clinically significant findings in the radiology reports. We included 7,756 prostate MRI examinations, of which 3,050 were manually annotated. We evaluated prostate cancer detection performance on 300 exams from an external centre with histopathology-confirmed ground truth. Semi-supervised training improved patient-based diagnostic area under the receiver operating characteristic curve from $87.2 \pm 0.8\%$ to $89.4 \pm 1.0\%$ ($P<10^-4$) and improved lesion-based sensitivity at one false positive per case from $76.4 \pm 3.8\%$ to $83.6 \pm 2.3\%$ ($P<10^-4$). Semi-supervised training was 14$\times$ more annotation-efficient for case-based performance and 6$\times$ more annotation-efficient for lesion-based performance. This improved performance demonstrates the feasibility of our training procedure. Source code is publicly available at github.com/DIAGNijmegen/Report-Guided-Annotation. Best csPCa detection algorithm is available at grand-challenge.org/algorithms/bpmri-cspca-detection-report-guided-annotations/.