This paper presents an exploration of Federated Learning (FL) in medical imaging, focusing on Computational Pathology (CP) with Whole Slide Images (WSIs) for head and neck cancer. While previous FL approaches in healthcare have targeted radiology, genetics, and Electronic Health Records (EHRs), our research addresses the understudied area of CP datasets. Our aim is to develop robust AI models for CP datasets without sacrificing data privacy and security. To this end, we demonstrate the use of FL on a CP dataset of papillary thyroid carcinoma, focusing on the rare and aggressive variant called Tall Cell Morphology (TCM); patients with TCM require more aggressive treatment and rigorous follow-up due to increased recurrence rates. In this work, we perform a simulated FL training experiment by dividing a dataset into three virtual “clients”. We locally train a Convolutional Neural Network (CNN) to classify patches of tissue labelled from the local WSI dataset as “tall” (expressing TCM) or “non-tall”. Models are then aggregated and convergence is ensured through the Federated Averaging (FedAvg) algorithm. The decentralized approach of FL creates a secure and privacy-preserving collaborative training environment, keeping individual client data local through horizontal data partitioning. This enables collective training of deep learning models on distributed data, benefiting from a diverse and rich dataset while safeguarding patient privacy. We compare the efficacy of the FL-trained models to a centralized model (trained using all “client” data together) using accuracy, sensitivity, specificity, and F1 score. Our findings indicate that the simulated FL models perform on par with or better than centralized learning, achieving accuracy scores between 75% and 87%, while centralized learning attains an accuracy of 82%.
This novel approach holds promise for revolutionizing computational pathology and contributing to more effective medical decision-making.
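The FedAvg aggregation step described above can be sketched as follows. This is an illustrative, hypothetical implementation (function and variable names are ours, not from the paper): each simulated client trains its CNN locally, and the server averages the per-layer parameters, weighted by each client's local dataset size.

```python
# Hedged sketch of FedAvg aggregation: the server combines locally trained
# client models into one global model by weighted parameter averaging.
import numpy as np

def fedavg(client_weights, client_sizes):
    """Average per-layer parameter arrays, weighted by local dataset size.

    client_weights: list (one entry per client) of lists of np.ndarray layers.
    client_sizes:   number of local training samples per client.
    """
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    averaged = []
    for layer in range(n_layers):
        # Weighted sum of this layer's parameters across all clients.
        layer_avg = sum(w[layer] * (n / total)
                        for w, n in zip(client_weights, client_sizes))
        averaged.append(layer_avg)
    return averaged

# Toy example: three "clients" with a one-layer model each.
clients = [[np.array([1.0, 2.0])],
           [np.array([3.0, 4.0])],
           [np.array([5.0, 6.0])]]
sizes = [100, 100, 200]
global_model = fedavg(clients, sizes)
print(global_model[0])  # [3.5 4.5]
```

In the simulated setting, each round would consist of local training on every “client”, this weighted average on the server, and a broadcast of the averaged model back to the clients for the next round.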
Obtaining true ground truth annotations is difficult for many computational pathology problems. Ground truth labels in the field include bounding boxes, text labels, binary class labels, and full tissue maps. A compounding issue arises when multiple pathologists label the same image and disagree with one another. In this work, we investigate multiply re-annotated tumor maps for squamous cell carcinoma, and whether different annotation fusion methods affect tumor segmentation. We find that tumor label maps with an average annotation similarity of 0.759 do not produce a significant quantitative difference in tumor segmentation.
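To illustrate the kind of annotation fusion and agreement measurement discussed above (the specific fusion methods compared in the paper are not detailed here), a minimal sketch with two common fusion strategies, majority vote and union, plus the Dice similarity coefficient often used to quantify annotator agreement:

```python
# Hedged illustration: fusing binary tumor masks from multiple annotators and
# measuring pairwise agreement. The paper's actual fusion methods may differ.
import numpy as np

def dice(a, b):
    """Dice similarity between two binary masks."""
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum())

def fuse_majority(masks):
    """A pixel is tumor if more than half of the annotators marked it."""
    stack = np.stack(masks)
    return (stack.sum(axis=0) > len(masks) / 2).astype(np.uint8)

def fuse_union(masks):
    """A pixel is tumor if any annotator marked it."""
    return np.any(np.stack(masks), axis=0).astype(np.uint8)

# Toy 1-D "masks" from three annotators.
m = [np.array([1, 1, 0, 0]),
     np.array([1, 0, 1, 0]),
     np.array([1, 0, 0, 0])]
print(fuse_majority(m))   # [1 0 0 0]
print(fuse_union(m))      # [1 1 1 0]
print(dice(m[0], m[1]))   # 0.5
```

Comparing segmentation models trained on differently fused maps is then a matter of running the downstream pipeline once per fusion strategy.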
Active Learning (AL) is an artificial intelligence (AI) training paradigm that improves training efficiency in cases where labeled training data are hard to obtain. In AL, unlabeled samples are selected for annotation using a bootstrap classifier to identify samples whose informational content is not represented in the current training set. Given a small number of samples, this optimizes training by focusing annotation on “informative” samples. For computational pathology, identifying the most informative samples is non-trivial, particularly for segmentation. In this work, we develop a feature-driven approach to identifying informative samples. We use a feature extraction pipeline operating on segmentation results to find “outlier” samples that are likely incorrectly segmented. This process allows us to automatically flag samples for re-annotation based on the architecture of the segmentation output, in contrast to less robust confidence-based approaches. We apply this process to the problem of segmenting oral cavity cancer (OCC) H&E-stained whole-slide images (WSIs), where the architecture of OCC tumor growth is an aggressive pathological indicator. Improving segmentation requires costly annotation of WSIs; thus, we seek to employ an AL approach to improve annotation efficiency. Our results show that, while outlier features alone are not sufficient to flag samples for re-annotation, we can identify some WSIs which fail segmentation.
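A minimal sketch of feature-driven outlier flagging in this spirit, assuming a per-slide feature matrix already extracted from segmentation results (the paper's actual feature pipeline and decision rule are not specified here). It uses a modified z-score based on the median and MAD, one common robust choice for outlier detection:

```python
# Hedged sketch (names hypothetical): flag likely-missegmented slides as
# outliers in segmentation-feature space using a robust modified z-score.
import numpy as np

def flag_outliers(features, threshold=3.5):
    """features: (n_slides, n_features) array. Returns flagged slide indices.

    The modified z-score uses median and MAD, so it is less distorted by the
    very outliers it is trying to detect than a mean/std z-score would be.
    """
    median = np.median(features, axis=0)
    mad = np.median(np.abs(features - median), axis=0)
    mad = np.where(mad == 0, 1e-9, mad)        # guard against divide-by-zero
    z = 0.6745 * (features - median) / mad     # modified z-score
    # Flag a slide if any of its features exceeds the threshold.
    return np.where(np.any(np.abs(z) > threshold, axis=1))[0]

# Toy example: 5 slides x 2 features (e.g. tumor-area fraction, region count);
# slide 4 has extreme values and should be flagged for re-annotation.
X = np.array([[0.30, 12], [0.32, 11], [0.29, 13], [0.31, 12], [0.95, 40]])
print(flag_outliers(X))  # [4]
```

Flagged slides would then be routed back to a pathologist for re-annotation, closing the active learning loop.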
In our previous work, we demonstrated that it is possible to use a small bootstrap set of fully annotated regions of interest (ROIs) to generate segmentation results at the WSI scale. In this work, pathologists were asked to edit the previously generated annotations on 150 WSIs, focusing only on the tumor class. Of these re-annotated WSIs, 21 were then sampled and used to train a new version of the classifier. Segmentation results were then generated for the remainder of the images. This work demonstrates an improvement in segmentation of the tumor class.
Utilizing Artificial Intelligence (AI)-generated tissue maps for outcome prediction would help reduce the exhaustive workload on pathologists, but how quantitatively analogous these maps are to pathologist-labeled maps must be studied. We were also interested in understanding how the “satellite tumor” definition in tissue label maps affects the features extracted. Motivated by these ideas, this work examines the impact on extracted feature values when an automatic relabeling is applied to both hand-annotated and AI tumor maps. This is a first step towards investigating whether AI maps can be relied upon for recurrence-risk prediction in early-stage oral cavity cancer patients.
Deep learning for digital pathology is a challenging problem. Small patient datasets limit the generalizability of trained deep learning models, while the large size of whole slide images (WSIs) represents a bottleneck for training. Additionally, annotations are difficult to obtain at scale due to image size and the volume of samples needed for accurate and generalizable training. We have investigated the use of Active Learning (AL) to alleviate this burden; AL is a training approach where a small subset of samples is used to create a bootstrap classifier, which in turn selects new samples for annotation to maximize the performance gain from each additional training sample. In our previous work, we found AL to be more efficient than the more common Random Learning (RL) approach in terms of segmentation performance per training sample. In the current work, we extend our investigation of AL by using our region-of-interest (ROI)-trained classifier to perform WSI-level segmentation of multiple classes. We compare the results of AL- to RL-based training, and generate inference results for a dataset of 75 WSIs spanning 61 patients. After four rounds of training, AL yielded a validation loss 0.566 lower, as well as Dice coefficients an average of 0.022 higher for classes present in images, on the holdout testing set. This work demonstrates the generalizability of AL from patch-based segmentation to WSI-based segmentation, and provides a path forward for rapid development of complex digital pathology datasets in deep learning.
Recently, in the field of digital pathology, there have been promising advances with regard to deep learning for pathological images. These methods are often considered “black boxes”, where tracing inputs to outputs and diagnosing errors is a difficult task. This is important because neural networks are fragile, and dataset variation, which in digital pathology is attributed to biological variance, can cause low accuracy. In deep learning, this is typically addressed by adding data to the training set. However, training data are costly and time-consuming to create and may not address all variation seen in these images. Digitized histology carries a great deal of variation across many dimensions (color/stain variation, lighting intensity, presentation of a disease, etc.), and some of these “low-level” image variations may cause a deep network to break due to its fragility. In this work, we use a unique dataset – cases of serially registered H&E tissue samples from oral cavity cancer (OCC) patients – to explore the errors of a classifier trained to identify and segment different tissue types. Registered serial sections allow us to eliminate variability due to biological structure, focus on image variability including staining and lighting, and try to identify sources of error that may cause deep learning to fail. We find that perceptually insignificant changes in an image (minor lighting and color shifts) can result in extremely poor classification performance, even when the training process tries to prevent overfitting. This suggests that great care must be taken to augment and normalize datasets to prevent errors.
Patients diagnosed with early-stage (Stage I/II) Oral Cavity Cancer (OCC) are typically treated with surgery alone. Unfortunately, 25-37% of early-stage OCC patients experience loco-regional tumor recurrence after surgery. Currently, pathologists use the Histologic Risk Model (HRM), a clinically validated risk assessment tool, to determine patient prognosis. In this study, we perform image registration on two cases of serially sectioned blocks of Hematoxylin and Eosin (H&E)-stained OCC tissue sections. The goal of this work is to create an optimized registration procedure to reconstruct 3D tissue models, which can provide a pathologist with a realistic representation of the tissue architecture before surgical resection. Our project aims to extend the HRM to enhance prediction performance for patients at high risk of disease progression using computational pathology tools. Previous literature has explored image registration of histological slides and reconstruction of 3D models using similar processes. Our work is unique in that we investigate in depth the parameter space of an image registration algorithm to establish a registration procedure for any serial histological section. Each parameter was sequentially perturbed to determine the best parameter set for registration, as evaluated through mutual information.
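The mutual information (MI) metric used above to score each perturbed parameter set can be sketched as follows; this is a generic histogram-based MI estimate (the actual registration toolkit and its parameters are not specified here):

```python
# Hedged sketch: histogram-based mutual information between two grayscale
# images, as a registration-quality score (higher = better alignment).
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """Estimate MI between two equal-shape grayscale images."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()          # joint intensity distribution
    px = pxy.sum(axis=1)               # marginal of image A
    py = pxy.sum(axis=0)               # marginal of image B
    nz = pxy > 0                       # avoid log(0)
    return np.sum(pxy[nz] * np.log(pxy[nz] / (px[:, None] * py[None, :])[nz]))

# An image compared with itself maximizes MI; a shuffled copy scores lower.
rng = np.random.default_rng(0)
img = rng.random((64, 64))
shuffled = rng.permutation(img.ravel()).reshape(img.shape)
assert mutual_information(img, img) > mutual_information(img, shuffled)
```

In a parameter sweep like the one described, each candidate parameter set would be applied to register a slide pair, and the set yielding the highest MI between the registered images would be retained.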