An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition

Nasersharif, B.; Naderi, N.

doi:10.22068/IJEEE.17.2.1563

Volume 17, Issue 2 (June 2021) IJEEE 2021, 17(2): 1563-1563

20.1001.1.17352827.2021.17.2.3.0

Nasersharif B, Naderi N. An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition. IJEEE 2021; 17 (2) :1563-1563
URL: http://ijeee.iust.ac.ir/article-1-1563-en.html

Abstract:

Convolutional Neural Networks (CNNs) have shown strong performance in speech recognition systems, both for feature extraction and for acoustic modeling. CNNs have also been used for robust speech recognition, with competitive results reported. A Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck features extracted by CBNs contain discriminative and rich context information. In this paper, we discuss these bottleneck features from an information theory viewpoint and use them as robust features for noisy speech recognition. In the proposed method, the CBN inputs are the logarithm of Mel filter bank energies (LMFBs) of noisy speech in a number of neighboring frames, and its outputs are the corresponding phone labels. For such a system, we show that the mutual information between the bottleneck layer and the labels is higher than the mutual information between the noisy input features and the labels. Thus, the bottleneck features are a denoised, compressed form of the input features and are more representative than the input features for discriminating phone classes. Experimental results on the Aurora2 database show that bottleneck features extracted by the CBN outperform some conventional speech features as well as robust features extracted by a CNN.
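The abstract describes the network input as features taken from a number of neighboring frames. A minimal sketch of such frame splicing is given here for illustration only: the function name splice_frames, the symmetric context width, and the edge handling (repeating the first and last frame) are assumptions, not details stated on this page.

```python
def splice_frames(frames, context):
    """Concatenate each frame with `context` neighbors on each side.

    `frames` is a list of per-frame feature vectors (lists of floats).
    Edges are padded by repeating the first/last frame (an assumption).
    """
    if context < 0:
        raise ValueError("context must be non-negative")
    n = len(frames)
    spliced = []
    for t in range(n):
        window = []
        for k in range(t - context, t + context + 1):
            idx = min(max(k, 0), n - 1)  # clamp index at utterance edges
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


# Toy example: three frames with one feature each, one neighbor per side.
toy_frames = [[1.0], [2.0], [3.0]]
toy_spliced = splice_frames(toy_frames, 1)
```

Each spliced vector has (2 * context + 1) times the per-frame dimension, so the toy frames above become three-element vectors.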

Keywords: Bottleneck Features, Convolutional Neural Network, Convolutive Bottleneck Network, Mutual Information, Robust Speech Recognition

We use convolutional bottleneck features for robust speech recognition.
We analyze the mutual information between the input and each layer, and show that the bottleneck layer carries high information about the network input.
We show that the training performance of the convolutional bottleneck network is related to the mutual information between the bottleneck features and the class labels.
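The mutual information between a representation and the class labels can be illustrated with a simple plug-in estimate over discrete values. This is only a sketch: the estimator used in the paper is not stated on this page, the function name mutual_information is hypothetical, and continuous features such as bottleneck activations would first have to be quantized into bins (an assumption).

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log2(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Toy data: one feature that determines the label, one that is independent.
labels = [0, 0, 1, 1]
informative_feature = [0, 0, 1, 1]
uninformative_feature = [0, 1, 0, 1]

mi_informative = mutual_information(informative_feature, labels)
mi_uninformative = mutual_information(uninformative_feature, labels)
```

On this toy data, the feature that determines the label yields 1 bit and the independent feature yields 0 bits. Plug-in estimates of this kind are biased for small sample sizes, so comparisons between layers should use the same binning and the same amount of data.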

Type of Study: Research Paper | Subject: Image Processing
Received: 2019/08/12 | Revised: 2020/10/10 | Accepted: 2020/10/17

Rights and permissions
	This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

© 2022 by the authors. Licensee IUST, Tehran, Iran. This is an open access journal distributed under the terms and conditions of the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

Iranian Journal of Electrical and Electronic Engineering

Iran University of Science and Technology
