Statistical Parametric Speech Synthesis Based on Gaussian Process Regression

Tomoki Koriyama; Takashi Nose; Takao Kobayashi

doi:http://dx.doi.org/10.1109/JSTSP.2013.2283461

Publication Information

Title

Japanese:
English:	Statistical Parametric Speech Synthesis Based on Gaussian Process Regression

Authors

Japanese:	郡山知樹, 能勢隆, 小林隆夫.
English:	Tomoki Koriyama, Takashi Nose, Takao Kobayashi.

Language

English

Journal/Book Title

Japanese:
English:	IEEE Journal of Selected Topics in Signal Processing

Volume, Issue, Pages

Vol. 8 No. 2 pp. 173-183

Publication Date

April 2014

Official Link

http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6609068

DOI

http://dx.doi.org/10.1109/JSTSP.2013.2283461

Abstract

This paper proposes a statistical parametric speech synthesis technique based on Gaussian process regression (GPR). The GPR model is designed for directly predicting frame-level acoustic features from corresponding information on frame context that is obtained from linguistic information. The frame context includes the relative position of the current frame within the phone and articulatory information and is used as the explanatory variable in GPR. Here, we introduce cluster-based sparse Gaussian processes (GPs), i.e., local GPs and partially independent conditional (PIC) approximation, to reduce the computational cost. The experimental results for both isolated phone synthesis and full-sentence continuous speech synthesis revealed that the proposed GPR-based technique without dynamic features slightly outperformed the conventional hidden Markov model (HMM)-based speech synthesis using minimum generation error training with dynamic features.
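
As a rough illustration of the regression step the abstract describes, the following is a minimal sketch of exact Gaussian process regression in NumPy. It is not the authors' implementation: the squared-exponential kernel, the hyperparameter values, the use of a single explanatory variable (relative position of the frame within the phone), and the synthetic target values are all assumptions made for this example. The cluster-based local GP and PIC approximations mentioned in the abstract are not shown.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between the rows of a and b (assumed kernel)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return variance * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gpr_predict(x_train, y_train, x_test, noise_var=1e-2):
    """Posterior mean and latent variance of a zero-mean GP at x_test."""
    k_tt = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train)
    # A Cholesky factorization avoids forming an explicit matrix inverse.
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    var = rbf_kernel(x_test, x_test).diagonal() - np.sum(v**2, axis=0)
    return mean, var


# Hypothetical toy data: one frame-context variable (relative position in the
# phone, scaled to [0, 1]) and one synthetic acoustic feature per frame.
rng = np.random.default_rng(0)
position = rng.uniform(0.0, 1.0, size=(40, 1))
target = np.sin(2.0 * np.pi * position[:, 0]) + 0.05 * rng.standard_normal(40)

# Predict the feature at five evenly spaced frame positions.
query = np.linspace(0.0, 1.0, 5)[:, None]
mean, var = gpr_predict(position, target, query)
```

Exact inference of this kind requires factorizing an N-by-N kernel matrix over all N training frames, which costs on the order of N cubed operations. That cost is the reason the abstract gives for introducing sparse approximations.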
