ECE443 Speech and Audio Processing

Scientific Responsible	Stamoulis Georgios, Professor E-mail: georges@uth.gr
Title	Hellenic Chips Competence Centre (HCCC)
Funding Agency	The HCCC is supported by the Chips JU and its members, and is co-funded by the European Union and the Greek Government through the “Competitiveness” programme
Budget	326.350,00
Duration	01/06/2025 – 31/05/2029

Scientific Responsible	Plessas Fotios, Professor E-mail: fplessas@uth.gr
Title	Analog Design, Testing, and Verification
Funding Agency	NanoZeta Technologies ltd.
Budget	271.400,00
Duration	26/01/2021 – 25/01/2028

Scientific Responsible	Korakis Athanasios, Professor E-mail: korakis@uth.gr
Title	DIGITAfrica: Towards a comprehensive pan-African research infrastructure in Digital Sciences
Funding Agency	European Union
Budget	123.125,00
Duration	16/12/2024 – 31/12/2027

Department of Electrical and Computer Engineering
Sekeri – Cheiden Str Pedion Areos, ECE Building 383 34 Volos – Greece
Tel.	+30 24210 74967, +30 24210 74934
e-mail	gece ΑΤ uth.gr
PGS Tel.	+30 24210 74933
PGS e-mail	pgsec ΑΤ uth.gr
URL	https://www.e-ce.uth.gr/contact-info/?lang=en

Subject Area	Signals, Communications, and Networking
Semester	Semester 7 – Fall
Type	Elective
Teaching Hours	4
ECTS	6
Prerequisites	ECE218 Signals and Systems
Recommended Courses	ECE334 Pattern Recognition
Course Site	https://eclass.uth.gr/courses/E-CE_U_318/
Course Director	Gerasimos Potamianos, Associate Professor E-mail: gpotamianos@uth.gr

Description
Learning Outcomes

The course covers basic concepts in speech and audio processing. Its main focus is human speech: its production, perception, representation, coding, synthesis, and recognition. It also covers the processing of audio signals, music signals in particular. In summary, the course covers the following topics:

Introduction to digital speech processing.
A brief review of fundamentals of digital signal processing.
Fundamentals of human speech production and sound propagation in the human vocal tract.
Hearing, auditory models, and speech perception.
Time-domain methods for speech processing.
Frequency domain representation.
Homomorphic speech processing and cepstrum.
Linear predictive analysis of speech signals.
Algorithms for estimating speech parameters.
Digital coding of speech signals.
Frequency domain coding of speech and audio.
Text-to-speech synthesis.
Automatic speech recognition using hidden Markov models.
Feature extraction and recognition of music signals.
Basic computational tools in Matlab corresponding to the above (including the MIR and OpenSMILE toolboxes).
Brief introduction to the hidden Markov model toolkit (HTK).
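
As a small illustration of the time-domain methods listed above, the following sketch computes two classic frame-level measures, short-time energy and zero-crossing rate. It is not taken from the course material: the function names, the frame length, and the synthetic two-tone test signal are all assumptions chosen for the example, and it uses plain Python rather than the Matlab tools used in the course.

```python
import math

def frames(signal, frame_len, hop):
    """Split a sequence into frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

# Hypothetical test signal sampled at 8 kHz: a loud 100 Hz tone
# followed by a quiet 2000 Hz tone.
fs = 8000
low = [math.sin(2 * math.pi * 100 * n / fs) for n in range(800)]
high = [0.1 * math.sin(2 * math.pi * 2000 * n / fs) for n in range(800)]
signal = low + high

# One (energy, zero-crossing rate) pair per non-overlapping 200-sample frame.
features = [(short_time_energy(f), zero_crossing_rate(f))
            for f in frames(signal, frame_len=200, hop=200)]

for energy, zcr in features:
    print(f"energy={energy:.4f}  zcr={zcr:.3f}")
```

In this toy signal the first four frames show high energy and a low zero-crossing rate, and the last four show the opposite. The same contrast underlies simple voiced/unvoiced decisions on real speech, where voiced segments tend to be louder and lower in frequency than unvoiced ones.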

This course introduces students to the basic concepts and algorithms of speech and audio processing. Its main focus is human speech, but it also covers more general audio signals, music in particular. The course provides numerous examples to familiarize students with this material, along with practical computational tools in the Matlab and HTK software frameworks that demonstrate it.

The course offers students further specialization, building on the digital signal processing and pattern recognition courses, and lets them study speech and audio signals in greater depth.

Students successfully completing this class will have mastered the main concepts, algorithms, and tools in the processing and recognition of speech and more general audio signals. For example, they will be able to:

Understand the process of human speech production and perception.
Extract appropriate features from speech signals in various domains and select the most suitable among them for the particular problem at hand.
Perform speech recognition and speech synthesis with basic algorithms.
Extract a variety of features from music signals.
Implement programs in Matlab / OpenSMILE / HTK to perform the aforementioned tasks.
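
To give a flavor of the feature extraction named in the outcomes above, here is a minimal sketch of linear predictive analysis using the autocorrelation method and the Levinson-Durbin recursion. It is an illustration under stated assumptions, not the course's own code: the helper names, the predictor order, and the synthetic all-pole test signal (coefficients 1.3 and -0.4) are hypothetical, and plain Python stands in for the Matlab and HTK tools used in the course.

```python
def impulse_response(coeffs, length):
    """Impulse response of x[n] = sum_k coeffs[k-1] * x[n-k] + delta[n]."""
    x = []
    for n in range(length):
        value = 1.0 if n == 0 else 0.0
        for k, c in enumerate(coeffs, start=1):
            if n - k >= 0:
                value += c * x[n - k]
        x.append(value)
    return x

def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations recursively.

    Returns (a, err), where the predictor is
    x_hat[n] = sum_{k=1..order} a[k-1] * x[n-k]
    and err is the remaining prediction error energy.
    """
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        # Reflection coefficient for model order i + 1.
        acc = r[i + 1] - sum(a[j] * r[i - j] for j in range(i))
        k = acc / err
        new_a = a[:]
        new_a[i] = k
        for j in range(i):
            new_a[j] = a[j] - k * a[i - 1 - j]
        a = new_a
        err *= 1.0 - k * k
    return a, err

# Hypothetical second-order all-pole system (poles at 0.8 and 0.5).
true_coeffs = [1.3, -0.4]
signal = impulse_response(true_coeffs, 200)

r = autocorrelation(signal, max_lag=2)
lpc, residual_energy = levinson_durbin(r, order=2)
print(lpc, residual_energy)
```

Because the test signal is the impulse response of an all-pole filter, the recovered predictor coefficients should closely match the ones used to generate it. On real speech the analysis would be applied to short windowed frames, and the match would only be approximate.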
