4D time-frequency representation for binaural speech signal processing

Raed Mikhael; Harold H. Szu

doi:10.1117/12.668690

17 April 2006

Proceedings Volume 6247, Independent Component Analyses, Wavelets, Unsupervised Smart Sensors, and Neural Networks IV; 62470V (2006) https://doi.org/10.1117/12.668690
Event: Defense and Security Symposium, 2006, Orlando (Kissimmee), Florida, United States

Abstract

Hearing is the ability to detect and process auditory information that is produced by the vibrating cilia of the hair cells residing in the organ of Corti of each ear and carried to the auditory cortex of the brain via the auditory nerve. The primary and secondary auditory cortices interact with one another to distinguish and correlate the received information by resolving the varying spectrum of arriving frequencies. Binaural hearing is nature's way of employing the power inherent in working in pairs to process information, enhance sound perception, and reduce undesired noise. One ear might play a prominent role in sound recognition, while the other reinforces the mutual information the two ears perceive. Developing binaural hearing aid devices can be crucial in emulating the working powers of two ears and may be a step closer to significantly alleviating hearing loss of the inner ear. This can be accomplished by combining current speech research with existing technologies such as RF communication between PDAs and Bluetooth devices. The Ear Level Instrument (ELI), developed by Micro-tech Hearing Instruments and Starkey Laboratories, is a good example of digital bi-directional signal communication between a PDA/mobile phone and a Bluetooth device. The agreement and disagreement of the auditory information arriving at the Bluetooth device can be classified as sound and noise, respectively. By finding common features of the arriving sound using a four-coordinate system for sound analysis (a four-dimensional time-frequency representation), noise can be greatly reduced and hearing aids made more efficient. Techniques developed by Szu within Artificial Neural Networks (ANN), Blind Source Separation (BSS), the Adaptive Wavelet Transform (AWT), and Independent Component Analysis (ICA) hold many possibilities for improving the acoustic segmentation of phonemes, all of which are discussed in this paper. The transmitted and perceived acoustic speech signal will improve, as the binaural hearing aid will emulate two ears in sound localization, speech understanding in noisy environments, and loudness differentiation.
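The idea of classifying binaural agreement as sound and disagreement as noise can be illustrated with a small time-frequency sketch. The code below is not the authors' four-dimensional representation, which the abstract does not specify. It assumes a plain short-time Fourier transform, a hypothetical per-bin similarity score, and an arbitrary threshold of 0.8, and it ignores interaural time and level differences. All function names and parameters are illustrative.

```python
import cmath
import math
import random


def stft(signal, frame_len=32, hop=16):
    """Short-time Fourier transform: Hann-windowed frames, naive DFT per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        seg = [signal[start + n] * window[n] for n in range(frame_len)]
        frames.append([
            sum(seg[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len))
            for k in range(frame_len // 2 + 1)
        ])
    return frames  # frames[t][k]: complex value at time frame t, frequency bin k


def agreement(a, b):
    """Assumed similarity score for two complex bins: 1 if identical, -1 if opposite."""
    power = abs(a) ** 2 + abs(b) ** 2
    if power == 0.0:
        return 0.0
    return 2.0 * (a * b.conjugate()).real / power


def agreement_mask(left, right, threshold=0.8):
    """Return mask[t][k] = 1 where the two ear signals agree, else 0."""
    spec_left, spec_right = stft(left), stft(right)
    return [[1 if agreement(a, b) >= threshold else 0 for a, b in zip(fl, fr)]
            for fl, fr in zip(spec_left, spec_right)]


# Demo: a tone common to both ears plus independent noise at each ear.
random.seed(0)
n_samples, tone_bin, frame_len = 512, 4, 32
tone = [math.sin(2 * math.pi * tone_bin * n / frame_len) for n in range(n_samples)]
left = [s + random.uniform(-0.3, 0.3) for s in tone]
right = [s + random.uniform(-0.3, 0.3) for s in tone]

mask = agreement_mask(left, right)
tone_rate = sum(frame[tone_bin] for frame in mask) / len(mask)
noise_rate = sum(sum(frame[8:16]) for frame in mask) / (8 * len(mask))
print(f"agreement rate at tone bin: {tone_rate:.2f}, noise-only bins: {noise_rate:.2f}")
```

Bins carrying the shared tone score close to 1 and are kept, while bins holding only the independent noise mostly fall below the threshold. A practical device would need a measure that tolerates the time and level differences between the two ears.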

Citation

Raed Mikhael and Harold H. Szu "4D time-frequency representation for binaural speech signal processing", Proc. SPIE 6247, Independent Component Analyses, Wavelets, Unsupervised Smart Sensors, and Neural Networks IV, 62470V (17 April 2006); https://doi.org/10.1117/12.668690

PROCEEDINGS
11 PAGES

KEYWORDS

Ear

Signal processing

Time-frequency analysis

Wavelets

Independent component analysis

Personal digital assistants

Acoustics