Full metadata record
DC Field | Value | Language
dc.contributor.author | 盧采威 | en_US
dc.contributor.author | Lu, Tsai-Wei | en_US
dc.contributor.author | 簡仁宗 | en_US
dc.contributor.author | Chien, Jen-Tzung | en_US
dc.date.accessioned | 2014-12-12T02:42:45Z | -
dc.date.available | 2014-12-12T02:42:45Z | -
dc.date.issued | 2013 | en_US
dc.identifier.uri | http://140.113.39.130/cdrfb3/record/nctu/#GT070160277 | en_US
dc.identifier.uri | http://hdl.handle.net/11536/75203 | -
dc.description.abstract | Deep learning has been shown to achieve very high classification accuracy across a range of classification benchmarks, and deep neural networks have become a popular research topic in speech recognition. This thesis develops a novel regularization for recurrent neural networks and builds deep acoustic models for noisy speech recognition systems. Our method introduces Tikhonov regularization into the pre-training of deep neural networks. The idea is to make the system more robust by compensating for the effect of variations in the input speech data on the neural network. In particular, during pre-training with the restricted Boltzmann machine, we perform feature learning and deep acoustic model training, and use Tikhonov regularization to build invariance properties into the model. Within the restricted Boltzmann machine, we further combine a regularization based on weight decay; this combined regularization mechanism effectively increases the mixing rate of the alternating Gibbs Markov chain and brings contrastive divergence training closer to maximum likelihood learning. In addition, we extend the backpropagation through time (BPTT) algorithm to train both the recurrent weights and the weights between the hidden layer and the recurrent layer of the recurrent neural network. In the experimental evaluation, we implement the proposed algorithms with the Kaldi deep neural network speech recognition toolkit; results on the Resource Management and Aurora4 speech corpora show that hybrid regularization and BPTT indeed improve the robustness and recognition accuracy of deep neural network acoustic models. | zh_TW
dc.description.abstract | Deep learning has been widely demonstrated to achieve high performance in many classification tasks, and deep neural networks are now a major trend in automatic speech recognition. In this thesis, we address model regularization in deep recurrent neural networks and develop deep acoustic models for speech recognition in noisy environments. Our idea is to compensate for the variations of input speech data in the restricted Boltzmann machine (RBM), which is applied as a pre-training stage for feature learning and acoustic modeling. We implement Tikhonov regularization in the pre-training procedure and build invariance properties into the acoustic neural network model. A regularization based on weight decay is further combined with Tikhonov regularization to increase the mixing rate of the alternating Gibbs Markov chain, so that contrastive divergence training tends to approximate maximum likelihood learning. In addition, the backpropagation through time (BPTT) algorithm is developed for modified truncated minibatch training of recurrent neural networks; it is applied not only to the recurrent weights but also to the weights between the previous layer and the recurrent layer. In the experiments, we implement the proposed methods using the open-source Kaldi toolkit. The experimental results on the Resource Management (RM) and Aurora4 speech corpora show that hybrid regularization and BPTT training do improve the performance of deep neural network acoustic models for robust speech recognition. | en_US
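The two techniques named in the abstract can be sketched concretely. Below is a minimal NumPy sketch of one CD-1 update for a Bernoulli RBM with the hybrid regularization described above: L2 weight decay combined with a Tikhonov-style penalty, here instantiated as a Jacobian (contractive) penalty on the sensitivity of the hidden code to the input. The function name cd1_step, the hyperparameters lr, weight_decay, and lam_tik, and this particular penalty form are illustrative assumptions; the thesis's exact formulation is not given in this record.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(X, W, b_v, b_h, lr=0.01, weight_decay=1e-4, lam_tik=1e-3, rng=None):
    """One CD-1 update for a Bernoulli RBM with hybrid regularization:
    weight decay plus a Jacobian (Tikhonov-style) penalty.  Shapes:
    X (B, V), W (V, H), b_v (V,), b_h (H,)."""
    if rng is None:
        rng = np.random.default_rng(0)
    B = X.shape[0]
    # Positive phase: hidden probabilities and a binary sample.
    h0 = sigmoid(X @ W + b_h)                              # (B, H)
    h0_s = (rng.random(h0.shape) < h0).astype(X.dtype)
    # Negative phase: one alternating Gibbs step.
    v1 = sigmoid(h0_s @ W.T + b_v)                         # (B, V)
    h1 = sigmoid(v1 @ W + b_h)
    g_cd = (X.T @ h0 - v1.T @ h1) / B                      # CD-1 gradient, (V, H)
    # Tikhonov/contractive penalty: Omega = sum_j a_j^2 * ||W[:, j]||^2,
    # where a_j = h_j (1 - h_j) is the sigmoid derivative.
    a = h0 * (1.0 - h0)                                    # (B, H)
    col_norms = (W ** 2).sum(axis=0)                       # (H,)
    g_tik = 2.0 * W * (a ** 2).mean(axis=0)                # dOmega/dW, direct term
    g_tik += X.T @ (2.0 * a ** 2 * (1.0 - 2.0 * h0) * col_norms) / B  # gating term
    # Ascend the CD-1 estimate of the log-likelihood gradient, descend both penalties.
    W += lr * (g_cd - weight_decay * W - lam_tik * g_tik)
    b_v += lr * (X - v1).mean(axis=0)
    b_h += lr * (h0 - h1).mean(axis=0)
    return W, b_v, b_h
```

Similarly, a minimal truncated-BPTT window for a tanh recurrent layer, accumulating gradients for both the recurrent weights W_rec and the input-to-recurrent weights W_in, as the abstract emphasizes. The squared-error loss on the hidden state itself and the name bptt_window are likewise assumptions made only to keep the sketch self-contained.

```python
import numpy as np

def bptt_window(xs, ys, h_prev, W_in, W_rec, b):
    """One truncated-BPTT window: forward through len(xs) steps, then
    backpropagate to get gradients for W_in, W_rec, and b.  Shapes:
    each x (n_in,), each y and h (n_h,), W_in (n_h, n_in), W_rec (n_h, n_h)."""
    hs = [h_prev]
    for x in xs:                                   # forward pass over the window
        hs.append(np.tanh(W_in @ x + W_rec @ hs[-1] + b))
    gW_in = np.zeros_like(W_in)
    gW_rec = np.zeros_like(W_rec)
    gb = np.zeros_like(b)
    d_next = np.zeros_like(b)                      # dL/da_{t+1}; zero past the window
    for t in reversed(range(len(xs))):             # backward pass
        dh = (hs[t + 1] - ys[t]) + W_rec.T @ d_next
        da = dh * (1.0 - hs[t + 1] ** 2)           # tanh'(a) = 1 - h^2
        gW_in += np.outer(da, xs[t])               # input-to-recurrent weights
        gW_rec += np.outer(da, hs[t])              # hs[t] is h_{t-1} for step t
        gb += da
        d_next = da
    return gW_in, gW_rec, gb, hs[-1]               # carry hs[-1] into the next window
```

Truncating the backward pass at the window boundary bounds the cost of each update to one minibatch window, which is what makes minibatch training of recurrent networks tractable.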
dc.language.iso | en_US | en_US
dc.subject | model regularization | zh_TW
dc.subject | deep learning | zh_TW
dc.subject | recurrent neural network | zh_TW
dc.subject | acoustic model | zh_TW
dc.subject | speech recognition | zh_TW
dc.subject | Tikhonov regularization | en_US
dc.subject | deep learning | en_US
dc.subject | recurrent neural network | en_US
dc.subject | acoustic model | en_US
dc.subject | speech recognition | en_US
dc.title | Regularization of deep recurrent neural networks and acoustic model construction | zh_TW
dc.title | Tikhonov regularization for deep recurrent neural network acoustic modeling | en_US
dc.type | Thesis | en_US
dc.contributor.department | Institute of Communications Engineering (電信工程研究所) | zh_TW
Appears in Collections: Theses