The interpretability of an image indicates its potential information value. Historically, the National Imagery Interpretability Rating Scale (NIIRS) has been the standard for quantifying the interpretability of an image. With the growing reliance on machine learning (ML) for image analysis, however, NIIRS fails to capture the image quality attributes relevant to ML, and empirical studies have demonstrated that the relationship between NIIRS and ML performance is weak at best. In this study, we explore several image characteristics through the relationship between the training data and the test data using two standard ML frameworks: TensorFlow and Detectron2. We employ quantitative measures of color diversity, edge density, and image texture to characterize the training and test sets. A series of experiments demonstrates the utility of these measures. The results suggest that each of the proposed measures quantifies an aspect of image difficulty for ML: performance is generally better for test sets with lower color diversity, edge density, and texture. In addition, the experiments suggest that training on higher-complexity imagery yields more resilient models. Future studies will assess the relationships among these image features and explore methods for extending them.
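The abstract does not give the exact formulas used for the three characteristics, but simple per-image proxies can be computed with standard tools. The sketch below, assuming OpenCV and NumPy and a hypothetical input file "example.jpg", illustrates one plausible way to score color diversity (histogram entropy), edge density (fraction of Canny edge pixels), and texture (Laplacian variance); it is an illustration, not the authors' implementation.

import cv2
import numpy as np

def color_diversity(img_bgr, bins=32):
    """Entropy of a coarse 3-D color histogram; higher means more diverse colors."""
    hist = cv2.calcHist([img_bgr], [0, 1, 2], None,
                        [bins, bins, bins], [0, 256] * 3)
    p = hist.ravel() / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def edge_density(img_bgr, low=100, high=200):
    """Fraction of pixels marked as edges by the Canny detector."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, low, high)
    return float((edges > 0).mean())

def texture_strength(img_bgr):
    """Variance of the Laplacian response; a simple proxy for local texture."""
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    return float(cv2.Laplacian(gray, cv2.CV_64F).var())

if __name__ == "__main__":
    img = cv2.imread("example.jpg")  # hypothetical test image
    print(color_diversity(img), edge_density(img), texture_strength(img))

Averaging such scores over a training or test set gives one way to compare the relative complexity of the two sets, in the spirit of the comparisons described above.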