2024 47th International Conference on Telecommunications and Signal Processing (TSP)

Synthetic Speech Detection using Deep Neural Networks

In recent years, the proliferation of synthetic speech has raised concerns about its potential misuse for unethical activities, including voice impersonation and deepfake generation. Addressing this challenge requires robust methods for detecting synthetic speech, which often exhibits subtle but discernible differences from natural speech. In this paper, three approaches for synthetic speech detection are proposed: two based on deep neural networks (DNNs), namely multilayer perceptrons (MLPs) and convolutional neural networks (CNNs), and one based on an EfficientNetV2 model and transfer learning. The proposed system was trained on the Fake or Real (FoR) dataset, which comprises utterances generated by some of the latest speech synthesis algorithms, and is able to generalize well to unseen samples generated with algorithms not encountered during training, yielding a validation accuracy of 98.9% and a test accuracy of 83.9%.

Irina Mutica
National University of Science and Technology POLITEHNICA Bucharest
Romania

Serban Mihalache
National University of Science and Technology POLITEHNICA Bucharest
Romania

Dragos Burileanu
National University of Science and Technology POLITEHNICA Bucharest
Romania

 
