On the Quality Improvement of Voice Conversion Systems Based on GMM Model

The Modares Journal of Electrical Engineering
Article 3, Volume 5, Issue 0, 2005, Pages 23-36
Authors
Mahdi Eslami; Abolghasem Sayadian
Amirkabir Univ. of Tech.
Abstract
In a voice conversion system, the speech signal of speaker A (the source speaker) is modified so that it sounds as if it had been spoken by speaker B (the target speaker). This process is sometimes called speaker conversion, since it changes the speaker identity. The converted signal should be of high quality and sound very natural. Three major approaches have been proposed to this end: VQ-based, LMR-based, and GMM-based voice conversion. Dynamic time warping (DTW) is the most popular way to align corresponding words in two sentences. In this paper, DTW is used to design the corresponding transfer function. To reduce the distance between the two speakers, DTW is applied to paired phonemes of the two speakers rather than to paired words or sentences, and a phoneme-dependent linear temporal transform is used to reduce the error. With further appropriate corrections in the training and design of the linear transforms, a high-quality voice conversion system is achieved.
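The alignment step mentioned in the abstract can be illustrated with a minimal sketch of textbook DTW between two short feature sequences, such as the frames of one phoneme from each speaker. The abstract does not give the authors' phoneme-level procedure, features, or linear temporal transform, so this is not their implementation: the function name `dtw_align`, the Euclidean frame distance, and the three-way step pattern are all assumptions made for illustration.

```python
import math


def dtw_align(source, target):
    """Align two frame sequences with classic DTW (illustrative sketch).

    source, target: lists of equal-dimension feature vectors (lists of floats).
    Returns (total_cost, path), where path is a list of (i, j) frame pairs.
    """
    n, m = len(source), len(target)
    inf = float("inf")

    # acc[i][j] = minimum accumulated cost of aligning the first i source
    # frames with the first j target frames.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: Euclidean distance between frames.
            d = math.dist(source[i - 1], target[j - 1])
            acc[i][j] = d + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])

    # Backtrack from the end to recover the warping path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (acc[i - 1][j - 1], i - 1, j - 1),  # diagonal
            (acc[i - 1][j], i - 1, j),          # advance source only
            (acc[i][j - 1], i, j - 1),          # advance target only
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return acc[n][m], path


# Hypothetical one-dimensional "frames": the target holds its middle frame
# one step longer than the source.
src = [[0.0], [1.0], [2.0]]
tgt = [[0.0], [1.0], [1.0], [2.0]]
cost, path = dtw_align(src, tgt)
print(cost, path)
```

The returned path pairs each source frame with one or more target frames. In a GMM-based system, such pairs would typically serve as the aligned training data for the spectral mapping. Restricting the alignment to phoneme-sized segments, as the abstract describes, keeps each warp short and local.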
Keywords
Voice Conversion; Speaker Transformation; Spectral Mapping; Gaussian Mixture Model