Gunshot recordings have potential for both tactical detection and forensic evaluation, particularly for ascertaining the type of firearm and ammunition used. Perhaps the most significant challenge to such analysis is the effect of recording conditions on the audio signature of the recorded data. In this paper we present a first study of an exemplar embedding approach for automatically detecting and classifying firearm type across different recording conditions. We demonstrate that a small number of exemplars can span the space of gunshot audio signatures, and that this optimal set can be obtained using a wrapper function. Projecting a given gunshot onto the subspace spanned by the exemplar set yields a distance-based feature vector that enables comparisons across recording conditions. We also investigate the use of a hierarchy of gunshot classifications that improves finer-level classification by pruning out labels that are inconsistent with the higher-level type. The embedding-based approach can thus be used both by itself and as a pruning stage for other search techniques.
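As a rough illustration of these two ideas, the sketch below represents a gunshot by its distances to a fixed set of exemplar signatures and then uses a coarse handgun/rifle decision to prune inconsistent fine-grained labels. The feature representation, distance measure, and label names are hypothetical placeholders rather than the paper's actual choices.

```python
import numpy as np

# Hypothetical exemplar set: each exemplar is represented by a fixed-length
# spectral signature; the features actually used in the paper may differ.
EXEMPLARS = {name: np.random.randn(64) for name in ("ex_a", "ex_b", "ex_c", "ex_d")}

def embed(signature, exemplars=EXEMPLARS):
    """Represent a gunshot by its distances to every exemplar.

    The hypothesis is that these relative distances change less across
    recording conditions than the raw signal itself does.
    """
    return np.array([np.linalg.norm(signature - ex) for ex in exemplars.values()])

# Hierarchical pruning: a reliable coarse decision (handgun vs. rifle)
# removes fine-grained weapon models that are inconsistent with it.
FINE_TO_COARSE = {"9mm_pistol": "handgun", "45acp_pistol": "handgun",
                  "223_rifle": "rifle", "308_rifle": "rifle"}

def prune_candidates(candidate_models, coarse_label):
    return [m for m in candidate_models if FINE_TO_COARSE[m] == coarse_label]
```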
Our dataset includes 20 different gun types captured under a number of different conditions; this data acts as our original exemplar set. The dataset also includes 12 gun types, each with multiple shots, recorded under the same conditions as the exemplar set; this second set provides our training and testing sets. We show that the exemplar space can be reduced from 20 to only 4 distinct gunshots without significantly limiting the ability of the embedding approach to discriminate gunshots in the training and testing sets. The basic hypothesis of the embedding approach is that the relationship between the set of exemplars and the space of gunshots, including the training/testing set, is robust to a change in recording conditions or environment. That is, the embedding distances between a particular gunshot and the exemplars tend to remain the same as the environment changes. The implications of this are two-fold. First, unlike other dimensionality reduction approaches, we have access to particular instances of entities (the exemplars), which act as bridges connecting different recording conditions. Second, because the embedding distances are invariant across recording conditions, the embedded vector can be used as a measure of similarity between gunshots recorded under different conditions.
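The reduction of the exemplar set can be viewed as a wrapper-style subset search, in which candidate subsets are scored by the classification accuracy obtained from the resulting embeddings. The greedy forward search below is only one possible sketch of such a wrapper; `score_fn` is a hypothetical evaluation routine, not the procedure used in the paper.

```python
def wrapper_select(exemplars, score_fn, k=4):
    """Greedy forward selection of k exemplars (a wrapper-style search).

    score_fn(subset) is a placeholder: it should embed the training shots
    against `subset`, train and evaluate a classifier on the resulting
    feature vectors, and return the validation accuracy.
    """
    selected, remaining = [], list(exemplars)
    while len(selected) < k and remaining:
        best = max(remaining, key=lambda e: score_fn(selected + [e]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

A greedy forward search is only one of several subset-search strategies a wrapper can employ; exhaustive or backward-elimination searches trade additional computation for potentially better subsets.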
Unlike other dimensionality reduction approaches, our approach generates descriptions that are always in terms of the same exemplars. In approaches such as PCA, the data-driven nature of the projection makes it difficult, if not impossible, to establish a correspondence between the dimensions of one space and those of another.
We show that gunshot classification across different recording conditions can be performed with reasonable accuracy (60-72%) at the finer level (gunshot to weapon model) and with high accuracy (95-100%) at the higher level of abstraction (gunshot to `handgun' or `rifle'). We also investigate the use of simulated recording conditions and artificial noise to quantitatively evaluate the performance of our approach.
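One plausible way to realize such simulated conditions is to convolve a clean recording with a room impulse response and add noise at a controlled signal-to-noise ratio, as in the sketch below; the specific impulse responses and noise levels used in the evaluation are not reproduced here, so the function is illustrative only.

```python
import numpy as np

def simulate_condition(shot, room_ir, snr_db):
    """Convolve a clean shot with a room impulse response and add white
    noise at the requested SNR.  Both `room_ir` and `snr_db` are
    illustrative stand-ins, not the settings used in the actual evaluation.
    """
    reverberant = np.convolve(shot, room_ir)[: len(shot)]
    signal_power = np.mean(reverberant ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = np.sqrt(noise_power) * np.random.randn(len(reverberant))
    return reverberant + noise
```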