Identifying pathology in medical imaging data is a crucial step for patient diagnosis, treatment and prognosis. Deep learning, particularly convolutional neural networks, has led to breakthroughs in computer-aided diagnosis and detection. Nonetheless, these methods depend heavily on large numbers of training samples, which are often unavailable in the medical imaging field. Moreover, while state-of-the-art supervised segmentation methods rely on precise voxel-wise annotations, manual lesion delineation in medical images is an extremely laborious and time-consuming task. Recent advances in generative adversarial networks (GANs) show promising results in generating realistic data samples for augmenting datasets for downstream tasks; however, the quality of samples generated by GANs also depends on the variability and size of the training set, particularly for large images. Unlike the majority of recent GAN methods, which focus on generating either unlabeled samples or data restricted to particular classes, we propose a framework for controllable pathological image synthesis. Our approach is inspired by CycleGAN: instead of generating images from random noise, we perform cycle-consistent image-to-image translation between two domains, healthy and pathological. Guided by a semantic map, an adversarially trained generator synthesizes pathology on a healthy image at the specified location. We demonstrate our approach in two distinct applications: a public dataset for brain tumor segmentation (BraTS2018) and an institutional dataset of cerebral microbleeds in traumatic brain injury patients. We subsequently use synthetic images generated with our method as data augmentation for the detection of cerebral microbleeds. Enriching the training dataset with these synthetic images has the potential to increase the sensitivity of a system for detecting cerebral microbleeds in traumatic brain injury.
The model trained only on real samples achieves an average sensitivity of 88% at 20 false positives per patient; after the training set is augmented with synthetic samples, the model achieves an average sensitivity of 92% at the same rate of false positives per patient.
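To make the reported operating point concrete, the following is a minimal sketch of how a sensitivity at a fixed number of false positives per patient could be computed from scored detections. It is an illustration under stated assumptions, not the evaluation protocol of this work: the function name `sensitivity_at_fp_rate`, its parameters, the pooling of detections across patients, and the assumption that each true-positive detection matches a distinct lesion are all hypothetical.

```python
from itertools import groupby
from typing import List, Tuple


def sensitivity_at_fp_rate(
    detections: List[Tuple[float, bool]],
    n_true_lesions: int,
    n_patients: int,
    max_fp_per_patient: float,
) -> float:
    """Highest sensitivity reachable while the mean number of false
    positives per patient stays at or below max_fp_per_patient.

    detections: (confidence score, is_true_positive) pairs pooled over
    all patients. Assumption: each true positive hits a distinct lesion.
    """
    if n_true_lesions <= 0 or n_patients <= 0:
        raise ValueError("n_true_lesions and n_patients must be positive")

    fp_budget = max_fp_per_patient * n_patients
    # Sweep the score threshold from strictest to most permissive.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    tp = fp = 0
    best = 0.0
    # Detections sharing a score are accepted or rejected together,
    # because a single threshold cannot separate them.
    for _, group in groupby(ranked, key=lambda d: d[0]):
        items = list(group)
        tp_new = sum(1 for _, is_tp in items if is_tp)
        fp_new = len(items) - tp_new
        if fp + fp_new > fp_budget:
            break
        tp += tp_new
        fp += fp_new
        best = tp / n_true_lesions
    return best


# Toy example: one patient, four true lesions, budget of one false positive.
toy = [(0.9, True), (0.8, False), (0.7, True),
       (0.6, False), (0.5, False), (0.4, True)]
print(sensitivity_at_fp_rate(toy, n_true_lesions=4, n_patients=1,
                             max_fp_per_patient=1))  # prints 0.5
```

Under this sketch, comparing a model trained on real samples with one trained on the augmented set amounts to calling the function on each model's detections with the same `max_fp_per_patient` value (20 in the results above).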
Brain MRI synthesis via pathology factorization and adversarial cycle-consistent learning for data augmentation
K. Faryna
Master's thesis, 2020.