In general, deep networks are biased by the truth data provided to them during training. Many recent studies focus on understanding and avoiding biases in deep networks so that they can be corrected in future predictions. In particular, as deep networks are deployed more widely, it is important to explore these biases to understand where predictions can fail. One potential source of bias is the truth data provided to the network. For example, if a training set consists only of white males, predictive performance is likely to be better on a testing set of white males than on a testing set of African-American females. The U-Net architecture is a deep network that has seen widespread use in recent years, particularly for medical image segmentation tasks. The network is trained using a binary mask delineating the object to be segmented, which is typically produced using manual or semi-automated methods. Because a manual or semi-automated method can yield biased truth, the purpose of our study is to evaluate the impact of varying truth data, as provided by two different observers, on U-Net segmentation performance. Additionally, a common problem in medical imaging research is a lack of data, which forces many studies to be performed with insufficient datasets. However, the U-Net has been shown to achieve sufficient segmentation performance with small training sets; we therefore also investigate the impact of training set size on U-Net performance for a simple segmentation task in low-dose thoracic CT scans. This analysis also serves to support that the results of the observer variability portion of this study are not caused by a lack of sufficient training data.