Proceedings Article | 13 March 2019
Eric Wu, Lubomir Hadjiiski, Ravi Samala, Heang-Ping Chan, Kenny Cha, Caleb Richter, Richard Cohan, Elaine Caoili, Chintana Paramagul, Ajjai Alva, Alon Weizer
KEYWORDS: Bladder cancer, Computed tomography, Cancer, Image segmentation, Convolution, Bladder, Oncology, Performance modeling, Tumor growth modeling, Toxicity
We compared the performance of different deep learning convolutional neural network (DL-CNN) models for bladder cancer treatment response assessment using transfer learning, varying both which DL-CNN layers were frozen and the DL-CNN structure. Pre- and post-treatment CT scans of 123 patients (129 cancers, 158 pre- and post-treatment cancer pairs) undergoing chemotherapy were collected. After chemotherapy, 33% of patients had stage T0 cancer (complete response). Regions of interest (ROIs) from the pre- and post-treatment scans were extracted from the segmented lesions and combined into hybrid pre-post image pairs. The dataset was split into training (94 pairs, 6209 hybrid ROIs), validation (10 pairs), and test (54 pairs) sets. The DL-CNN consists of 2 convolution layers (C1, C2), 2 locally connected layers (L1, L2), and 1 fully connected layer, and was implemented in TensorFlow. The DL-CNN was trained to classify bladder cancers as fully responding (stage T0) or not fully responding to chemotherapy based on the hybrid ROIs. Two blinded radiologists estimated the likelihood of each lesion being stage T0 post-treatment by reading the pairs of pre- and post-treatment CT volumes. For T0 prediction, the base DL-CNN structure with randomly initialized weights achieved a test AUC of 0.73. The base DL-CNN structure with pre-trained weights from transfer learning (no frozen layers) achieved a test AUC of 0.79. The test AUCs for 3 modified DL-CNN structures (different C1 and C2 max pooling filter sizes, strides, and padding, all with transfer learning) were 0.72, 0.86, and 0.69, respectively. For the base DL-CNN with (C1) frozen, (C1, C2) frozen, and (C1, C2, L3) frozen during transfer learning, the test AUCs were 0.81, 0.78, and 0.71, respectively. The radiologists' AUCs were 0.76 and 0.77. The DL-CNN performed better with pre-trained weights than with randomly initialized weights.
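As an illustration of the AUC figure of merit reported above, the following is a minimal Python sketch that computes an empirical AUC from per-case likelihood scores using the Mann-Whitney formulation. It is not the authors' evaluation code; the function name `auc_from_scores` and the toy labels and scores are hypothetical and are not taken from the study.

```python
def auc_from_scores(labels, scores):
    """Empirical AUC: the fraction of (positive, negative) pairs in which
    the positive case receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: label 1 = stage T0 (complete response), 0 = otherwise.
# The scores stand in for a classifier's or a reader's likelihood estimates.
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8]
toy_auc = auc_from_scores(toy_labels, toy_scores)

# Sanity check on the split sizes stated in the abstract.
train_pairs, val_pairs, test_pairs = 94, 10, 54
total_pairs = train_pairs + val_pairs + test_pairs
```

In this toy example, 8 of the 9 positive-negative pairs are ranked correctly. The split sizes sum to the 158 cancer pairs stated in the abstract.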