2024 47th International Conference on Telecommunications and Signal Processing (TSP)

Applying Phonological Feature Embeddings for Cross-Lingual Transfer in Text-to-Speech

This work builds on our previous research, in which we introduced phonological features as input to text-to-speech systems. Although the use of phonological features is not itself new to our research, this study focuses on a comprehensive analysis of the embeddings produced by the encoder model, which we believe offers novel insights into the model's ability to capture and generalize phonological patterns across languages. We conduct cross-lingual transfer experiments using both a resource-rich and a resource-constrained language to explore the model's transfer capabilities across different linguistic families. The embedding vectors produced by the encoder are analyzed using cluster maps, which visualize the hierarchical clusters obtained from a clustering procedure. This analysis reveals the model's learning patterns and shows how phonological features contribute to its ability to handle linguistic diversity and data scarcity.
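
As a rough illustration of the kind of hierarchical clustering that underlies a cluster map, the following minimal sketch groups a handful of embedding vectors bottom-up. The abstract does not state the distance metric, linkage criterion, or tooling used, so the Euclidean distance, average linkage, phone labels, and vector values here are all hypothetical assumptions made purely for illustration.

```python
import math
from itertools import combinations

# Hypothetical toy "embeddings" (phone label -> vector); not data from the paper.
embeddings = {
    "p": [1.0, 0.0, 0.0],
    "b": [0.9, 0.2, 0.0],
    "s": [0.8, 0.0, 0.3],
    "a": [0.0, 1.0, 0.9],
    "e": [0.0, 0.9, 1.0],
}


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def average_linkage(c1, c2):
    """Mean pairwise distance between the members of two clusters."""
    dists = [euclidean(embeddings[a], embeddings[b]) for a in c1 for b in c2]
    return sum(dists) / len(dists)


def agglomerate(labels):
    """Merge the two closest clusters repeatedly; return the merge history."""
    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(*pair),
        )
        clusters.remove(c1)
        clusters.remove(c2)
        clusters.append(c1 + c2)
        merges.append((c1, c2))
    return merges


merges = agglomerate(sorted(embeddings))
for left, right in merges:
    print(left, "+", right)
```

With these toy vectors, the two vowel-like entries merge first and the consonant-like entries form a separate branch before the final merge. The merge order is the information that a cluster map draws as a dendrogram alongside its heatmap.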

Johannes Louw
Council for Scientific and Industrial Research (CSIR)
South Africa

Zenghui Wang
University of South Africa
South Africa

