Building on deep learning theory and big-data technology, this article constructs a model for analysing massive amounts of audio data in order to provide better services. First, spectrograms and waveforms are visualised for an initial analysis of the audio features. Next, the MFCC and Chroma features of the audio are extracted, and an MLP model is built and trained separately on each feature set. To make the audio recognition technique more efficient, this paper also applies non-negative matrix factorisation (NMF) to enhance the audio data, making the differences between audio samples more pronounced; the MLP model built on the reconstructed audio data ultimately reaches an accuracy of 89.12%.
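The pipeline the abstract describes (extract non-negative audio features, factor them with NMF, classify the resulting activations with an MLP) can be sketched as follows. This is a minimal illustration, not the paper's implementation: in practice the feature matrix would come from MFCC/Chroma extraction (e.g. `librosa.feature.mfcc`) on real recordings, whereas here two synthetic non-negative classes stand in so the example is self-contained; the layer sizes and component count are arbitrary assumptions.

```python
import numpy as np
from sklearn.decomposition import NMF
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for per-clip MFCC feature vectors: two synthetic non-negative
# classes whose mean spectral profiles differ (hypothetical data, used only
# so the sketch runs without audio files).
n_per_class, n_features = 100, 40
class_a = rng.gamma(2.0, 1.0, size=(n_per_class, n_features))
class_b = rng.gamma(2.0, 1.0, size=(n_per_class, n_features)) \
          + np.linspace(0.0, 2.0, n_features)
X = np.vstack([class_a, class_b])
y = np.array([0] * n_per_class + [1] * n_per_class)

# NMF enhancement step: factor the non-negative feature matrix X ~ W H and
# keep the activation matrix W as the reconstructed, lower-dimensional
# representation fed to the classifier.
nmf = NMF(n_components=10, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(X)

X_tr, X_te, y_tr, y_te = train_test_split(
    W, y, test_size=0.3, random_state=0, stratify=y)

# MLP classifier trained on the NMF activations.
clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000,
                    random_state=0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"test accuracy: {acc:.3f}")
```

On real audio, the same structure applies: stack the clips' MFCC (or Chroma) vectors into `X`, choose `n_components` to trade reconstruction quality against dimensionality, and train one MLP per feature type as the abstract describes.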