Proceedings Article | 16 March 2020
KEYWORDS: Data modeling, Lung cancer, Tumor growth modeling, Cancer, Tumors, Lung, Principal component analysis, Machine learning, Oncology, Computed tomography
Lung cancer is a fatal disease, and non-small cell lung cancer (NSCLC) is its most prevalent type. A main goal of NSCLC research is to identify patients at high risk of recurrence after surgical resection so that specific and suitable treatments can be found for them. Classification of cancer by anatomic disease extent, that is, by tumor size (T stage) and nodal involvement (N stage), is the determinant of appropriate treatment and prognosis most widely accepted among practicing clinicians. However, TN stage-based risk prediction can be inaccurate because there is moderate observer variability in reporting lesion size. Here, we propose a lung cancer recurrence prediction model that uses principal component analysis (PCA) and machine learning (ML) techniques and considers radiomic features together with clinical data, including the TN stage. After being filtered by a statistical model, the principal components, which include T- and N-stage data and handcrafted radiomic features from CT images, were applied to various ML models (random forests, support vector machines, naive Bayesian classifiers, and two boosting methods). We conducted this study for recurrence overall and for recurrence within two years of surgical resection, since more than 80% of recurrences occur within this time frame. In both cases, the experimental results showed that combining radiomic features with clinical data improves the prediction of lung cancer recurrence over models that use only TN-stage data, as measured by mean 5-fold cross-validation accuracy, the receiver operating characteristic (ROC) curve, the area under the ROC curve (AUC), and Kaplan-Meier curves. Finally, the model has been embedded in a website and is being prepared for medical device registration and approval by the Ministry of Food and Drug Safety (MFDS) in South Korea.
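The abstract reports mean 5-fold cross-validation accuracy and AUC as two of its evaluation measures. As a minimal sketch of how such figures are typically computed, and not the authors' implementation, the following Python uses only the standard library. The function names (`roc_auc`, `k_fold_indices`, `mean_cv_accuracy`) and the `fit`/`predict` callables are hypothetical, and the folds here are plain contiguous splits; the study does not state how its folds were drawn.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    (label 1, e.g. recurrence) scores higher than a randomly chosen
    negative case (label 0); ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) pairs for k contiguous folds of
    near-equal size (assumption: no shuffling or stratification)."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def mean_cv_accuracy(X, y, fit, predict, k=5):
    """Average the per-fold test accuracy over k folds.
    `fit(X_train, y_train)` returns a model and `predict(model, x)`
    returns a label; both are supplied by the caller (hypothetical)."""
    fold_accuracies = []
    for train, test in k_fold_indices(len(y), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct = [1.0 if predict(model, X[i]) == y[i] else 0.0 for i in test]
        fold_accuracies.append(mean(correct))
    return mean(fold_accuracies)
```

Any classifier can be plugged in through `fit` and `predict`, so the same loop could compare a model trained on TN-stage data alone with one trained on the combined feature set, which is the comparison the abstract describes.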