Towards an informed, intuitive, and generalized artificial intelligence model for the early-stage diagnosis of colorectal cancer (CRC), we present a generative model-based technique that improves the training and generalization performance of machine learning classification algorithms. This approach addresses the difficulty of acquiring sizable, well-balanced datasets in the clinical domain. Our methodology trains generative models on already available medical data so that they learn its latent representations, and then generates new synthetic samples for use in downstream tasks. We train dedicated UNet2D-based Denoising Diffusion Probabilistic Models (DDPMs) on our custom dataset, which consists of textural images captured by our novel Vision-based Tactile Sensor (VS-TS), called Hysense, and use these models to generate synthetic images for each potential class. To study the effectiveness of synthetic images during training, we compare the performance of multiple classification models, ranging from simple to state-of-the-art approaches, and evaluate them solely on real images. For our dataset specifically, we also extend the dedicated UNet2D DDPMs to generate synthetic images not only for each class but also for other features that may be present in an image, such as whole or partial contact of the sensor with polyp phantoms. Our experimental analyses demonstrate that enriching existing datasets with synthetic images from generative models improves classification performance and reduces model bias.
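As an illustration of the data protocol described above (synthetic images enrich the training set, while evaluation uses real images only), the following is a minimal sketch rather than our actual pipeline. The function name `balance_with_synthetic`, the class labels `class_a` and `class_b`, the sample identifiers, and the per-class target are all hypothetical placeholders; in practice the synthetic pool would hold images sampled from the per-class DDPMs.

```python
import random
from collections import Counter, defaultdict


def balance_with_synthetic(real_train, synthetic, target_per_class, seed=0):
    """Top up each class with synthetic samples until it reaches
    target_per_class items, or until the synthetic pool for that class
    is exhausted. Inputs are lists of (item, label) pairs; the result
    is a shuffled list of (item, label, source) triples.
    """
    rng = random.Random(seed)

    # Group the synthetic samples by class label.
    pool = defaultdict(list)
    for item, label in synthetic:
        pool[label].append(item)

    # Every real training sample is kept.
    counts = Counter(label for _, label in real_train)
    mixed = [(item, label, "real") for item, label in real_train]

    # Add only as many synthetic samples as each class is missing.
    for label in sorted(set(counts) | set(pool)):
        need = max(0, target_per_class - counts[label])
        chosen = rng.sample(pool[label], min(need, len(pool[label])))
        mixed.extend((item, label, "synthetic") for item in chosen)

    rng.shuffle(mixed)
    return mixed


# Hypothetical, imbalanced real training split and a synthetic pool.
real_train = [("r0", "class_a"), ("r1", "class_a"), ("r2", "class_a"),
              ("r3", "class_b")]
synthetic = [("s0", "class_b"), ("s1", "class_b"), ("s2", "class_b"),
             ("s3", "class_a")]

# The held-out test split contains real samples only and is never mixed.
real_test = [("t0", "class_a"), ("t1", "class_b")]

train_set = balance_with_synthetic(real_train, synthetic, target_per_class=3)
```

Keeping the `source` tag on each training sample makes it straightforward to verify that no synthetic image leaks into evaluation and to compare classifiers trained with and without the synthetic portion.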