Title: | Subspace Decomposition of Perceptual Representations for Speech Enhancement (在感知讯号上使用子空间分析之语音增强技术) |
Authors: | Hsiao, Jen-Po (萧任伯); Chi, Tai-Shih (冀泰石); Institute of Communications Engineering |
Keywords: | speech enhancement; perceptual; subspace; speech recognition; ACC; HTK |
Issue Date: | 2009 |
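Abstract: | Early speech signal processing treated the time domain and the frequency domain separately. In recent years, the development of auditory models has confirmed that human hearing processes sound in the time and frequency dimensions simultaneously; thanks to this higher-dimensional analysis, human hearing is more robust than any existing algorithm. This thesis uses the auditory perception model developed by the Neural Systems Laboratory (NSL) at the University of Maryland, which simulates the path a signal takes from the ear up to the auditory nerves of the midbrain. In the model's time-frequency analysis stage, the most salient speech regions are isolated first, and subspace analysis is then applied to further suppress the residual noise. Finally, the speech feature parameters extracted by the auditory model (Auditory Spectrogram Coefficients) are used for continuous-digit speech recognition with the Hidden Markov Model Toolkit (HTK), and the improvement in recognition rate confirms the robustness of the algorithm. Conventional speech enhancement techniques were developed separately in the time domain and in the frequency domain. In recent years, with the introduction of auditory models, enhancement techniques have been developed in the joint spectro-temporal domain, incorporating hearing perception to improve their robustness. In this thesis, we use the auditory model introduced by the Neural Systems Laboratory (NSL), University of Maryland, which simulates hearing physiology from the cochlea to the cortex. First, the speech regions of the spectrogram are selected in the cortical domain. Second, a subspace algorithm is applied to filter out the noise that remains within those speech regions. Finally, Auditory Cepstrum Coefficients (ACC) are extracted for an HTK recognition task. The HTK evaluations demonstrate the robustness of the proposed algorithm. |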
URI: | http://140.113.39.130/cdrfb3/record/nctu/#GT079613550 http://hdl.handle.net/11536/41986 |
Appears in Collections: | Thesis |