dc.contributor.advisor | Akhand, Prof. Dr. Muhammad Aminul Haque |
dc.contributor.author | Shahriyar, Shaikh Akib |
dc.date.accessioned | 2019-05-15T06:12:34Z |
dc.date.available | 2019-05-15T06:12:34Z |
dc.date.copyright | 2019 |
dc.date.issued | 2019-05 |
dc.identifier.other | ID 1707556 |
dc.identifier.uri | http://hdl.handle.net/20.500.12228/516 |
dc.description | This thesis is submitted to the Department of Computer Science and Engineering, Khulna University of Engineering & Technology in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Engineering, May 2019. | en_US
dc.description | Cataloged from PDF Version of Thesis. |
dc.description | Includes bibliographical references (pages 31-35). |
dc.description.abstract | Speech signals are more complex in nature than other forms of communication media such as text or images. Different forms of noise (e.g., additive noise, channel noise, babble noise) interfere with speech signals and drastically degrade speech quality. Enhancing speech is therefore a daunting task, since multiple forms of noise must be handled while denoising a signal. Various analog noise-elimination models have been studied over the years for this purpose. Researchers have also applied machine learning techniques (e.g., artificial neural networks) to enhance speech signals. In this study, a speech enhancement system based on a Convolutional Denoising Autoencoder (CDAE) is investigated. A convolutional neural network (CNN) is a special kind of deep neural network suited to 2D structured input (e.g., images), and a CDAE is a CNN-based denoising autoencoder. The CDAE takes advantage of the 2D structure of the features extracted from speech signals and also captures the local temporal relationships among those features. In the proposed system, the CDAE is trained with features from noisy speech signals as input and the corresponding clean speech features as the desired output. The proposed CDAE-based method has been tested on a benchmark dataset, the Speech Command Dataset, and attained 80% similarity between the denoised speech and the actual clean speech. The proposed system achieved a perceptual evaluation of speech quality (PESQ) score of 2.43, outperforming other related existing methods. | en_US
dc.description.statementofresponsibility | Shaikh Akib Shahriyar |
dc.format.extent | 35 pages |
dc.language.iso | en_US | en_US
dc.publisher | Khulna University of Engineering & Technology (KUET), Khulna, Bangladesh | en_US
dc.rights | Khulna University of Engineering & Technology (KUET) thesis/dissertation/internship reports are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. |
dc.subject | Speech Enhancement | en_US
dc.subject | Speech Cleaning | en_US
dc.subject | Convolutional Denoising Autoencoder (CDAE) | en_US
dc.subject | Mel Frequency Cepstral Coefficients (MFCC) | en_US
dc.subject | Clean Audio |
dc.title | Speech Enhancement Using Convolutional Denoising Auto-Encoder | en_US
dc.type | Thesis | en_US
dc.description.degree | Master of Science in Computer Science and Engineering |
dc.contributor.department | Department of Computer Science and Engineering |
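The abstract above describes training a convolutional denoising autoencoder on 2D speech feature maps, with noisy-speech features as input and the corresponding clean-speech features as the desired output. As an illustration only, the following is a minimal sketch of that kind of model in Python with TensorFlow/Keras; the input shape (frames x MFCC coefficients), layer sizes, training settings, and placeholder data are assumptions made for the sketch, not the configuration reported in the thesis.

    # Minimal sketch of a convolutional denoising autoencoder (CDAE) for
    # speech feature maps. Shapes, layer sizes, and training settings are
    # illustrative assumptions, not the thesis configuration.
    import numpy as np
    from tensorflow.keras import layers, models

    FRAMES, COEFFS = 96, 40  # hypothetical feature-map size (time frames x MFCC bins)

    def build_cdae():
        inp = layers.Input(shape=(FRAMES, COEFFS, 1))
        # Encoder: convolutions + pooling compress the noisy feature map.
        x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(inp)
        x = layers.MaxPooling2D((2, 2), padding="same")(x)
        x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(x)
        x = layers.MaxPooling2D((2, 2), padding="same")(x)
        # Decoder: upsampling + convolutions reconstruct a clean feature map.
        x = layers.Conv2D(16, (3, 3), activation="relu", padding="same")(x)
        x = layers.UpSampling2D((2, 2))(x)
        x = layers.Conv2D(32, (3, 3), activation="relu", padding="same")(x)
        x = layers.UpSampling2D((2, 2))(x)
        out = layers.Conv2D(1, (3, 3), activation="linear", padding="same")(x)
        model = models.Model(inp, out)
        model.compile(optimizer="adam", loss="mse")
        return model

    # Training pairs: noisy features as input, clean features as target,
    # mirroring the noisy-in / clean-out setup described in the abstract.
    noisy = np.random.rand(8, FRAMES, COEFFS, 1).astype("float32")  # placeholder data
    clean = np.random.rand(8, FRAMES, COEFFS, 1).astype("float32")  # placeholder data

    cdae = build_cdae()
    cdae.fit(noisy, clean, epochs=2, batch_size=4)
    denoised = cdae.predict(noisy)

In this sketch, the "same" padding and the matched pooling/upsampling stages keep the reconstructed feature map the same size as the input, so the mean-squared-error loss can compare it directly against the clean target features.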