Gerasimos Potamianos, Associate Professor

| | |
|---|---|
| Research Area | Multimodal (audio-visual) speech processing, human-machine communication, ambient intelligence, automatic speech recognition, digital signal processing, computer vision applications |
| Diploma | Electrical Engineering, National Technical University of Athens (NTUA) |
| PhD | Electrical Engineering, Johns Hopkins University |
| Office | 420 |
| Office Hours | Wednesday 13:00–14:00 (or by appointment) |
| Tel. | +30 24210 74928 |
| E-mail | gpotamianos@e-ce.uth.gr |
| Links | Website · CV · Google Scholar |

Gerasimos Potamianos was born in Athens in 1965. In 1988 he received his Diploma in Electrical Engineering from the National Technical University of Athens. He then pursued graduate studies at Johns Hopkins University in Baltimore, USA, receiving his Master's degree in 1990 and his PhD in 1994. His doctoral dissertation focused on Markov models for image processing, under the supervision of Prof. John Goutsias. He subsequently worked as a post-doctoral researcher under Prof. Frederick Jelinek at the Center for Language and Speech Processing of Johns Hopkins University until 1996, studying language models based on decision trees. In 1996 he joined the Speech and Image Processing Services Research Department of AT&T Labs-Research in Murray Hill and Florham Park, New Jersey, where he worked on audio-visual speech recognition and synthesis. In 1999 he joined the Human Language Technologies Department at the IBM T.J. Watson Research Center in New York State, where he eventually became manager of the Applied Multimodal Dialog Systems group. There he continued his research on audio-visual speech, with more recent emphasis on multimodal signal processing in smart-space and ambient-intelligence environments, within the framework of the European research projects CHIL, DICIT, and NETCARITY. In October 2008 he was appointed tenured Researcher (Grade B) at the Institute of Computer Science of FORTH in Crete, and in April 2009 Research Director (Grade A Researcher) at the Institute of Informatics and Telecommunications of NCSR "Demokritos". Since February 2012 he has been an Associate Professor at the Department of Electrical and Computer Engineering (ECE) of the School of Engineering of the University of Thessaly in Volos.

His scientific activities include participation in the summer workshop of the Center for Language and Speech Processing of Johns Hopkins University in 2000; teaching at the summer schools of the ELSNET network in 2001, of NCSR "Demokritos" in 2009 and 2010, and of ISCA in 2014; a tutorial at the International Conference on Image Processing (ICIP) in 2003; invited talks at the AVSP 2003, VisHCI 2006, ASRU 2009, ACCVW 2016, and LISTEN 2018 conferences; panel participation at MMSP 2006; guest editorship of special issues of EURASIP JASP 2002, IEEE TASLP 2009, and IEEE TM 2011; and editorship of a three-volume ACM book series on multimodal / multisensory human-machine interfaces. He received best paper awards at the ICME 2005 and Interspeech 2007 conferences (the latter for work by his student). He has served as a member of the IEEE Speech and Language Technical Committee (2009–2011) and on the organizing committees of numerous conferences. He has also been awarded a Marie Curie International Reintegration Grant for the project AVISPIRE. Finally, he has participated extensively in European research projects, including CHIL, DICIT, NETCARITY, and INDIGO under FP6, and more recently SYNC3 (as project Technical Coordinator) and DIRHA under FP7 and BabyRobot under H2020, as well as the national projects e-Prevention and TeachBot. Most recently, he has served as principal investigator of the national (H.F.R.I.) project SL-ReDu.

His research interests lie in multimodal speech processing with applications to human-machine communication and ambient intelligence, with particular emphasis on audio-visual speech processing; automatic recognition of speech and of the acoustic environment; multimodal signal processing and fusion algorithms; and computer vision applications for detection of faces, human activity, and sign language. He has published 165 articles in these research areas, with more than 7,100 citations in the international literature (h-index 43 and i10-index 121) according to Google Scholar, and holds seven patents. He is a member of IEEE, EURASIP, ISCA, and the Technical Chamber of Greece.
[Last update: 04/04/2024]
2023
- K. Papadimitriou, G. Potamianos, G. Sapountzaki, T. Goulas, E. Efthimiou, S.-E. Fotinea, and P. Maragos, “Greek sign language recognition for an education platform,” Universal Access in the Information Society, 2023.
- K. Papadimitriou and G. Potamianos, “Multimodal locally enhanced Transformer for continuous sign language recognition,” Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 1513–1517, Dublin, Ireland, 2023.
- K. Papadimitriou and G. Potamianos, “Sign language recognition via deformable 3D convolutions and modulated graph convolutional networks,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 1–5, Rhodes, Greece, 2023.
- K. Papadimitriou, G. Sapountzaki, K. Vasilaki, E. Efthimiou, S.-E. Fotinea, and G. Potamianos, “SL-REDU GSL: A large Greek sign language recognition corpus,” Proc. Int. Conf. Acoust. Speech Signal Process. Works. (ICASSPW) – Int. Works. Sign Language Translation and Avatar Technology (SLTAT), pp. 1–5, Rhodes, Greece, 2023.
- G. Sapountzaki, E. Efthimiou, S.-E. Fotinea, K. Papadimitriou, and G. Potamianos, “Remote learning and assessment of Greek sign language in the undergraduate curriculum in COVID time,” Proc. Int. Conf. Education and New Learning Technologies (EDULEARN), pp. 5452–5459, Palma, Spain, 2023.
2022
- N. Efthymiou, P. P. Filntisis, P. Koutras, A. Tsiami, J. Hadfield, G. Potamianos, and P. Maragos, “ChildBot: Multi-robot perception and interaction with children,” Robotics and Autonomous Systems, vol. 150 (103975), 2022.
- M. Parelli, K. Papadimitriou, G. Potamianos, G. Pavlakos, and P. Maragos, “Spatio-temporal graph convolutional networks for continuous sign language recognition,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 8457–8461, Singapore, 2022.
- K. Papadimitriou, G. Potamianos, G. Sapountzaki, T. Goulas, E. Efthimiou, S.-E. Fotinea, and P. Maragos, “Greek sign language recognition for the SL-ReDu learning platform,” Proc. Int. Work. Sign Language Translation and Avatar Technology (SLTAT) – Satellite to LREC, pp. 79–84, Marseille, France, 2022.
- G. Sapountzaki, E. Efthimiou, S.-E. Fotinea, K. Papadimitriou, and G. Potamianos, “3D Greek sign language classifiers as a learning object in the SL-ReDu online education platform,” Proc. Int. Conf. Education and New Learning Technologies (EDULEARN), pp. 6146–6153, Palma, Spain, 2022.
2021
- S. Thermos, G. Potamianos, and P. Darras, “Joint object affordance reasoning and segmentation in RGB-D videos,” IEEE Access, vol. 9, pp. 89699–89713, 2021.
- N. Efthymiou, P. P. Filntisis, G. Potamianos, and P. Maragos, “Visual robotic perception system with incremental learning for child-robot interaction scenarios,” Technologies, vol. 9(4), no. 86, 2021.
- P. Giannoulis, G. Potamianos, and P. Maragos, “Overlapped sound event classification via multi-channel sound separation network,” Proc. Europ. Conf. Signal Process. (EUSIPCO), pp. 571–575, Dublin, Ireland, 2021.
- A. Koumparoulis, G. Potamianos, S. Thomas, and E. da Silva Morais, “Resource-efficient TDNN architectures for audio-visual speech recognition,” Proc. Europ. Conf. Signal Process. (EUSIPCO), pp. 506–510, Dublin, Ireland, 2021.
- P. P. Filntisis, N. Efthymiou, G. Potamianos, and P. Maragos, “An audiovisual child emotion recognition system for child-robot interaction applications,” Proc. Europ. Conf. Signal Process. (EUSIPCO), pp. 791–795, Dublin, Ireland, 2021.
- N. Efthymiou, P. P. Filntisis, G. Potamianos, and P. Maragos, “A robotic edutainment framework for designing child-robot interaction scenarios,” Proc. ACM Int. Conf. PErvasive Technologies Related to Assistive Environments (PETRA), pp. 160–166, Corfu, Greece, 2021.
- K. Papadimitriou, M. Parelli, G. Sapountzaki, G. Pavlakos, P. Maragos, and G. Potamianos, “Multimodal fusion and sequence learning for cued speech recognition from videos,” Universal Access in Human-Computer Interaction. Access to Media, Learning and Assistive Environments. UAHCI 2021 / HCII 2021, Part II, M. Antona and C. Stephanidis (Eds.), pp. 277–290, LNCS vol. 12769, Springer, Cham, 2021.
- E. Efthimiou, S.-E. Fotinea, C. Flouda, T. Goulas, G. Ametoglou, G. Sapountzaki, K. Papadimitriou, and G. Potamianos, “The SL-ReDu environment for self-monitoring and objective learner assessment in Greek sign language,” Universal Access in Human-Computer Interaction. Access to Media, Learning and Assistive Environments. UAHCI 2021 / HCII 2021, Part II, M. Antona and C. Stephanidis (Eds.), pp. 72–81, LNCS vol. 12769, Springer, Cham, 2021.
- G. Sapountzaki, E. Efthimiou, S.-E. Fotinea, K. Papadimitriou, and G. Potamianos, “Educational material organization in a platform for Greek sign language self monitoring and assessment,” Proc. Int. Conf. Education and New Learning Technologies (EDULEARN), pp. 3322–3331, Palma, Spain, 2021.
2020
- S. Thermos, G. T. Papadopoulos, P. Darras, and G. Potamianos, “Deep sensorimotor learning for RGB-D object recognition,” Computer Vision and Image Understanding, vol. 190, 2020.
- A. Koumparoulis, G. Potamianos, S. Thomas, and E. da Silva Morais, “Resource-adaptive deep learning for visual speech recognition,” Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 3510–3514, Shanghai, China, 2020.
- K. Papadimitriou and G. Potamianos, “Multimodal sign language recognition via temporal deformable convolutional sequence learning,” Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 2752–2756, Shanghai, China, 2020.
- K. Papadimitriou and G. Potamianos, “A fully convolutional sequence learning approach for cued speech recognition from videos,” Proc. Europ. Conf. Signal Process. (EUSIPCO), pp. 326–330, Amsterdam, The Netherlands, 2020.
- M. Parelli, K. Papadimitriou, G. Potamianos, G. Pavlakos, and P. Maragos, “Exploiting 3D hand pose estimation in deep learning-based sign language recognition from RGB videos,” Computer Vision – ECCV 2020 Workshops Proceedings, Part II, A. Bartoli and A. Fusiello (Eds.), pp. 249–263, LNCS/LNIP vol. 12536, Springer, 2020.
- P. P. Filntisis, N. Efthymiou, G. Potamianos, and P. Maragos, “Emotion understanding in videos through body, context, and visual-semantic embedding loss,” Computer Vision – ECCV 2020 Workshops Proceedings, Part I, A. Bartoli and A. Fusiello (Eds.), pp. 747–755, LNCS/LNIP vol. 12535, Springer, 2020.
- G. Potamianos, K. Papadimitriou, E. Efthimiou, S.-E. Fotinea, G. Sapountzaki, and P. Maragos, “SL-ReDu: Greek sign language recognition for educational applications. Project description and early results,” Proc. ACM Int. Conf. PErvasive Technologies Related to Assistive Environments (PETRA), article no. 59, pp. 1–6, Corfu, Greece, 2020.
- A. Koumparoulis, G. Potamianos, S. Thomas, and E. Morais, “Audio-assisted image inpainting for talking faces,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 7664–7668, Barcelona, Spain, 2020.
- S. Thermos, P. Darras, and G. Potamianos, “A deep learning approach to object affordance segmentation,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 2358–2362, Barcelona, Spain, 2020.
2019
- P. Giannoulis, G. Potamianos, and P. Maragos, “Room-localized speech activity detection in multi-microphone smart homes,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 2019, no. 15, pp. 1–23, 2019.
- P. P. Filntisis, N. Efthymiou, P. Koutras, G. Potamianos, and P. Maragos, “Fusing body posture with facial expressions for joint recognition of affect in child-robot interaction,” IEEE Robotics and Automation Letters, vol. 4, no. 4, pp. 4011–4018, 2019.
- S. Oviatt, B. Schuller, P. Cohen, D. Sonntag, G. Potamianos, and A. Kruger (Eds.), The Handbook of Multimodal-Multisensor Interfaces, Volume 3: Language Processing, Software and Commercialization, Emerging Directions, ACM Books / Morgan-Claypool Publishers, San Rafael, CA, 2019.
- S. P. Chytas and G. Potamianos, “Hierarchical detection of sound events and their localization using convolutional neural networks with adaptive thresholds,” Proc. Detection and Classification of Acoustic Scenes and Events Workshop (DCASE), pp. 50–54, New York, NY, 2019.
- A. Koumparoulis and G. Potamianos, “MobiLipNet: Resource-efficient deep learning based lipreading,” Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 2763–2767, Graz, Austria, 2019.
- K. Papadimitriou and G. Potamianos, “End-to-end convolutional sequence learning for ASL fingerspelling recognition,” Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 2315–2319, Graz, Austria, 2019.
- K. Papadimitriou and G. Potamianos, “Fingerspelled alphabet sign recognition in upper-body videos,” Proc. Europ. Conf. Signal Process. (EUSIPCO), La Coruna, Spain, 2019.
2018
- S. Oviatt, B. Schuller, P. Cohen, D. Sonntag, G. Potamianos, and A. Kruger (Eds.), The Handbook of Multimodal-Multisensor Interfaces, Volume 2: Signal Processing, Architectures, and Detection of Emotion and Cognition, ACM Books / Morgan-Claypool Publishers, San Rafael, CA, 2018.
- A. Koumparoulis and G. Potamianos, “Deep view2view mapping for view-invariant lipreading,” Proc. IEEE Spoken Language Technology Works. (SLT), pp. 588–594, Athens, Greece, 2018.
- K. Papadimitriou and G. Potamianos, “A hybrid approach to hand detection and type classification in upper-body videos,” Proc. Europ. Work. Visual Information Process. (EUVIP), Tampere, Finland, 2018.
- J. Hadfield, P. Koutras, N. Efthymiou, G. Potamianos, C. S. Tzafestas, and P. Maragos, “Object assembly guidance in child-robot interaction using RGB-D based 3D tracking,” Proc Int. Conf. Intell. Robots Systems (IROS), pp. 347–354, Madrid, Spain, 2018.
- P. Giannoulis, G. Potamianos, and P. Maragos, “Multi-channel non-negative matrix factorization for overlapped acoustic event detection,” Proc. Europ. Conf. Signal Process. (EUSIPCO), pp. 857–861, Rome, Italy, 2018.
- N. Efthymiou, P. Koutras, P. P. Filntisis, G. Potamianos, and P. Maragos, “Multi-view fusion for action recognition in child-robot interaction,” Proc. Int. Conf. Image Process. (ICIP), pp. 455–459, Athens, Greece, 2018.
- S. Thermos, G. T. Papadopoulos, P. Darras, and G. Potamianos, “Attention-enhanced sensorimotor object recognition,” Proc. Int. Conf. Image Process. (ICIP), pp. 336–340, Athens, Greece, 2018.
- A. Tsiami, P. Koutras, N. Efthymiou, P. P. Filntisis, G. Potamianos, and P. Maragos, “Multi3: Multi-sensory perception system for multi-modal child interaction with multiple robots,” Proc. IEEE Int. Conf. on Robotics and Automation (ICRA), Brisbane, Australia, 2018.
- A. Tsiami, P. P. Filntisis, N. Efthymiou, P. Koutras, G. Potamianos, and P. Maragos, “Far-field audio-visual scene perception of multi-party human-robot interaction for children and adults,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 6568–6572, Calgary, Canada, 2018.
2017
- I. Rodomagoulakis, A. Katsamanis, G. Potamianos, P. Giannoulis, A. Tsiami, and P. Maragos, “Room-localized spoken command recognition in multi-room, multi-microphone environments,” Computer Speech and Language, vol. 46, pp. 419–443, 2017.
- G. Potamianos, E. Marcheret, Y. Mroueh, V. Goel, A. Koumparoulis, A. Vartholomaios, and S. Thermos, “Audio and visual modality combination in speech processing applications,” The Handbook of Multimodal-Multisensor Interfaces, Volume 1: Foundations, User Modeling, and Common Modality Combinations, S. Oviatt, B. Schuller, P. Cohen, D. Sonntag, G. Potamianos, and A. Kruger (Eds.), Ch. 12, pp. 489–543, ACM Books / Morgan-Claypool Publishers, San Rafael, CA, 2017.
- S. Oviatt, B. Schuller, P. Cohen, D. Sonntag, G. Potamianos, and A. Kruger (Eds.), The Handbook of Multimodal-Multisensor Interfaces, Volume 1: Foundations, User Modeling, and Common Modality Combinations, ACM Books / Morgan-Claypool Publishers, San Rafael, CA, 2017.
- A. Koumparoulis, G. Potamianos, Y. Mroueh, and S. J. Rennie, “Exploring ROI size in deep learning based lipreading,” Proc. Int. Conf. on Auditory-Visual Speech Process. (AVSP), pp. 64–69, Stockholm, Sweden, 2017.
- P. Giannoulis, G. Potamianos, and P. Maragos, “On the joint use of NMF and classification for overlapping acoustic event detection,” Proc. Int. Works. on Computational Intelligence for Multimedia Understanding (IWCIM), Held in Conjunction with Europ. Conf. Signal Process. (EUSIPCO), Kos Island, Greece, 2017.
- S. Thermos, G. T. Papadopoulos, P. Darras, and G. Potamianos, “Deep affordance-grounded sensorimotor object recognition,” Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pp. 49–57, Honolulu, HI, 2017.
2016
- S. Thermos and G. Potamianos, “Audio-visual speech activity detection in a two-speaker scenario incorporating depth information from a profile or frontal view,” Proc. IEEE Spoken Language Technology Works. (SLT), pp. 579–584, San Diego, CA, 2016.
- P. Giannoulis, G. Potamianos, P. Maragos, and A. Katsamanis, “Improved dictionary selection and detection schemes in sparse-CNMF-based overlapping acoustic event detection,” Proc. Detection and Classification of Acoustic Scenes and Events Works. (DCASE), pp. 25–29, Budapest, Hungary, 2016.
- A. Tsiami, A. Katsamanis, I. Rodomagoulakis, G. Potamianos, and P. Maragos, “Sweet home listen: A distant speech recognition system for home automation control,” “Show and Tell” Session Demonstration, Int. Conf. Acoust. Speech Signal Process. (ICASSP), Shanghai, China, 2016.
2015
- E. Marcheret, G. Potamianos, J. Vopicka, and V. Goel, “Scattering vs. discrete cosine transform features in visual speech processing,” Proc. Int. Joint Conf. on Facial Analysis, Animation and Auditory-Visual Speech Process. (FAAVSP), pp. 175–180, Vienna, Austria, 2015.
- E. Marcheret, G. Potamianos, J. Vopicka, and V. Goel, “Detecting audio-visual synchrony using deep neural networks,” Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 548–552, Dresden, Germany, 2015.
- P. Giannoulis, A. Brutti, M. Matassoni, A. Abad, A. Katsamanis, M. Matos, G. Potamianos, and P. Maragos, “Multiroom speech activity detection using a distributed microphone network in domestic environments,” Proc. Europ. Conf. Signal Process. (EUSIPCO), pp. 1271–1275, Nice, France, 2015.
- Z. I. Skordilis, A. Tsiami, P. Maragos, G. Potamianos, L. Spelgatti, and R. Sannino, “Multichannel speech enhancement using MEMS microphones,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 2729–2733, Brisbane, Australia, 2015.
2014
- G. Floros, K. Kyritsis, and G. Potamianos, “Database and baseline system for detecting degraded traffic signs in urban environments,” Proc. Europ. Work. Visual Information Process. (EUVIP), Paris, France, 2014.
- A. Tsiami, I. Rodomagoulakis, P. Giannoulis, A. Katsamanis, G. Potamianos, and P. Maragos, “ATHENA: A Greek multisensory database for home automation control,” Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 1608–1612, Singapore, 2014.
- P. Giannoulis, G. Potamianos, A. Katsamanis, and P. Maragos, “Multi-microphone fusion for detection of speech and acoustic events in smart spaces,” Proc. Europ. Conf. Signal Process. (EUSIPCO), pp. 2375–2379, Lisbon, Portugal, 2014.
- A. Tsiami, A. Katsamanis, P. Maragos, and G. Potamianos, “Experiments in acoustic source localization using sparse arrays in adverse indoors environments,” Proc. Europ. Conf. Signal Process. (EUSIPCO), pp. 2390–2394, Lisbon, Portugal, 2014.
- P. Giannoulis, A. Tsiami, I. Rodomagoulakis, A. Katsamanis, G. Potamianos, and P. Maragos, “The Athena-RC system for speech activity detection and speaker localization in the DIRHA smart home,” Proc. Joint Work. Hands-Free Speech Communication and Microphone Arrays (HSCMA), pp. 167–171, Nancy, France, 2014.
- A. Katsamanis, I. Rodomagoulakis, G. Potamianos, P. Maragos, and A. Tsiami, “Robust far-field spoken command recognition for home automation combining adaptation and multichannel processing,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 5547–5551, Florence, Italy, 2014.
2013
- I. Rodomagoulakis, G. Potamianos, and P. Maragos, “Advances in large vocabulary continuous speech recognition in Greek: Modeling and nonlinear features,” Proc. Europ. Conf. Signal Process. (EUSIPCO), Marrakech, Morocco, 2013.
- I. Rodomagoulakis, P. Giannoulis, Z.-I. Skordilis, P. Maragos, and G. Potamianos, “Experiments on far-field multichannel speech processing in smart homes,” Proc. Int. Conf. Digital Signal Process. (DSP), pp. 1–6, Santorini, Greece, 2013.
- G. Galatas, G. Potamianos, and F. Makedon, “Robust multi-modal speech recognition in two languages utilizing video and distance information from the Kinect,” Proc. Int. Conf. Human-Computer Interaction (HCII), Las Vegas, NV, 2013.
2012
- G. Potamianos, C. Neti, J. Luettin, and I. Matthews, “Audio-visual automatic speech recognition: An overview,” Audio-Visual Speech Processing, E. Vatikiotis-Bateson, G. Bailly, and P. Perrier (Eds.), Ch. 9, Cambridge University Press, 2012.
- G. Galatas, G. Potamianos, and F. Makedon, “Audio-visual speech recognition incorporating facial depth information captured by the Kinect,” Proc. Europ. Conf. Signal Process. (EUSIPCO), pp. 2714–2717, Bucharest, Romania, 2012.
- G. Galatas, G. Potamianos, and F. Makedon, “Audio-visual speech recognition using depth information from the Kinect in noisy video conditions,” Proc. Int. Conf. Pervasive Technologies Related to Assistive Environments (PETRA), Crete, Greece, 2012.
- P. Giannoulis and G. Potamianos, “A hierarchical approach with feature selection for emotion recognition from speech,” Proc. Int. Conf. Language Resources and Evaluation (LREC), Istanbul, Turkey, 2012.
2011
- K. Kumar, G. Potamianos, J. Navratil, E. Marcheret, and V. Libal, “Audio-visual speech synchrony detection by a family of bimodal linear prediction models,” Multibiometrics for Human Identification, B. Bhanu and V. Govindaraju (Eds.), Ch. 2, pp. 31–50, Cambridge University Press, 2011.
- S.-H. G. Chan, J. Li, P. Frossard, and G. Potamianos, “Special section on interactive multimedia,” IEEE Transactions on Multimedia, vol. 13, no. 5, pp. 841–843, 2011.
- G. Galatas, G. Potamianos, D. Kosmopoulos, C. McMurrough, and F. Makedon, “Bilingual corpus for AVASR using multiple sensors and depth information,” Proc. Int. Conf. Auditory-Visual Speech Process. (AVSP), pp. 103–106, Volterra, Italy, 2011.
- N. Sarris, G. Potamianos, J.-M. Renders, C. Grover, E. Karstens, L. Kallipolitis, V. Tountopoulos, G. Petasis, A. Krithara, M. Galle, G. Jacquet, B. Alex, R. Tobin, and L. Bounegru, “A system for synergistically structuring news content from traditional media and the blogosphere,” Proc. E-Challenges Conference, Florence, Italy, 2011.
- G. Galatas, G. Potamianos, A. Papangelis, and F. Makedon, “Audio visual speech recognition in noisy visual environments,” Proc. Int. Conf. Pervasive Technologies Related to Assistive Environments (PETRA), Crete, Greece, 2011.
2010
- A. Waibel, R. Stiefelhagen, R. Carlson, J. Casas, J. Kleindienst, L. Lamel, O. Lanz, D. Mostefa, M. Omologo, F. Pianesi, L. Polymenakos, G. Potamianos, J. Soldatos, G. Sutschet, and J. Terken, “Computers in the human interaction loop,” Handbook of Ambient Intelligence and Smart Environments, H. Nakashima, H. Aghajan, and J.C. Augusto (Eds.), Part IX, pp. 1071–1116, Springer, 2010.
- L.-H. Kim, M. Hasegawa-Johnson, G. Potamianos, and V. Libal, “Joint estimation of DOA and speech based on EM beamforming,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 121–124, Dallas, TX, 2010.
2009
- P. Lucey, G. Potamianos, and S. Sridharan, “Visual speech recognition across multiple views,” Visual Speech Recognition: Lip Segmentation and Mapping, A. Wee-Chung Liew and S. Wang (Eds.), ch. X, pp. 294–325, Information Science Reference (IGI), 2009.
- G. Potamianos, L. Lamel, M. Wolfel, J. Huang, E. Marcheret, C. Barras, X. Zhu, J. McDonough, J. Hernando, D. Macho, and C. Nadeu, “Automatic speech recognition,” Computers in the Human Interaction Loop, A. Waibel and R. Stiefelhagen (Eds.), ch. 6, pp. 43–59, Springer, 2009.
- K. Bernardin, R. Stiefelhagen, A. Pnevmatikakis, O. Lanz, A. Brutti, J.R. Casas, and G. Potamianos, “Person tracking,” Computers in the Human Interaction Loop, A. Waibel and R. Stiefelhagen (Eds.), ch. 3, pp. 11–22, Springer, 2009.
- H. Meng, S. Oviatt, G. Potamianos, and G. Rigoll, “Introduction to the special issue on multimodal processing in speech-based interactions,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 17, no. 3, pp. 409–410, 2009.
- K. Kumar, J. Navratil, E. Marcheret, V. Libal, and G. Potamianos, “Robust audio-visual speech synchrony detection by generalized bimodal linear prediction,” Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 2251–2254, Brighton, United Kingdom, 2009.
- K. Kumar, J. Navratil, E. Marcheret, V. Libal, G. Ramaswamy, and G. Potamianos, “Audio-visual speech synchronization detection using a bimodal linear prediction model,” Proc. IEEE Comp. Soc. Works. Biometrics, Held in Conjunction with CVPR, Miami Beach, FL, 2009.
- V. Libal, B. Ramabhadran, N. Mana, F. Pianesi, P. Chippendale, O. Lanz, and G. Potamianos, “Multimodal classification of activities of daily living inside smart homes,” Int. Works. Ambient Assisted Living (IWAAL), LNCS vol. 5518, Part II, pp. 687–694, Salamanca, Spain, 2009.
- X. Zhuang, J. Huang, G. Potamianos, and M. Hasegawa-Johnson, “Acoustic fall detection using Gaussian mixture models and GMM supervectors,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 69–72, Taipei, Taiwan, 2009.
- J. Huang, X. Zhuang, V. Libal, and G. Potamianos, “Long-time span acoustic activity analysis from far-field sensors in smart homes,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 4173–4176, Taipei, Taiwan, 2009.
- S.M. Chu, V. Goel, E. Marcheret, and G. Potamianos, Method for Likelihood Computation in Multi-Stream HMM Based Speech Recognition, Patent No.: US007480617B2, Jan. 20, 2009.
2008
- G. Potamianos, J. Huang, E. Marcheret, V. Libal, R. Balchandran, M. Epstein, L. Seredi, M. Labsky, L. Ures, M. Black, and P. Lucey, “Far-field multimodal speech processing and conversational interaction in smart spaces,” Proc. Joint Work. Hands-Free Speech Communication and Microphone Arrays (HSCMA), Trento, Italy, 2008.
- A. Leone, G. Diraco, C. Distante, P. Siciliano, M. Grassi, A. Lombardi, G. Rescio, P. Malcovati, M. Malfatti, L. Gonzo, V. Libal, J. Huang, and G. Potamianos, “A multi-sensor approach for people fall detection in home environment,” Proc. 10th Europ. Conf. Computer Vision (ECCV), Marseille, France, 2008.
- P. Lucey, G. Potamianos, and S. Sridharan, “Patch-based analysis of visual speech from multiple views,” Proc. Int. Conf. Auditory-Visual Speech Process. (AVSP), pp. 69–73, Tangalooma, Australia, 2008.
- R. Balchandran, M. Epstein, G. Potamianos, and L. Seredi, “A multi-modal spoken dialog system for interactive TV,” Proc. Int. Conf. Multimodal Interfaces (ICMI) – Demo Papers, pp. 191–192, Chania, Greece, 2008.
- M. Grassi, A. Lombardi, G. Rescio, P. Malcovati, A. Leone, G. Diraco, C. Distante, P. Siciliano, M. Malfatti, L. Gonzo, V. Libal, J. Huang, and G. Potamianos, “A hardware-software framework for high-reliability people fall detection,” Proc. 7th IEEE Conf. on Sensors (SENSORS), pp. 1328–1331, Lecce, Italy, 2008.
- A. Tyagi, J.W. Davis, and G. Potamianos, “Steepest descent for efficient covariance tracking,” Proc. IEEE Work. Motion and Video Computing (WMVC), Copper Mountain, Colorado, 2008.
- J. Huang, E. Marcheret, K. Visweswariah, and G. Potamianos, “The IBM RT07 evaluation systems for speaker diarization on lecture meetings,” in Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR 2007 and RT 2007, Baltimore, Maryland, May 2007, LNCS vol. 4625, pp. 497–508, Springer, Berlin, 2008.
- J. Huang, E. Marcheret, K. Visweswariah, V. Libal, and G. Potamianos, “The IBM Rich Transcription Spring 2007 speech-to-text systems for lecture meetings,” in Multimodal Technologies for Perception of Humans: International Evaluation Workshops CLEAR 2007 and RT 2007, Baltimore, Maryland, May 2007, LNCS vol. 4625, pp. 429–441, Springer, Berlin, 2008.
- S. Deligne, C.V. Neti, and G. Potamianos, Audio-Visual Codebook Dependent Cepstral Normalization, Patent No.: US007319955B2, Jan. 15, 2008.
2007
- D. Mostefa, N. Moreau, K. Choukri, G. Potamianos, S.M. Chu, A. Tyagi, J.R. Casas, J. Turmo, L. Christoforetti, F. Tobia, A. Pnevmatikakis, V. Mylonakis, F. Talantzis, S. Burger, R. Stiefelhagen, K. Bernardin, and C. Rochet, “The CHIL audiovisual corpus for lecture and meeting analysis inside smart rooms,” Journal of Language Resources and Evaluation, vol. 41, pp. 389–407, 2007.
- Z. Zhang, G. Potamianos, A.W. Senior, and T.S. Huang, “Joint face and head tracking inside multi-camera smart rooms,” Signal, Image and Video Processing, vol. 1, pp. 163–178, 2007.
- V. Libal, J. Connell, G. Potamianos, and E. Marcheret, “An embedded system for in-vehicle visual speech activity detection,” Proc. Int. Work. Multimedia Signal Process. (MMSP), pp. 255–258, Chania, Greece, 2007.
- P. Lucey, G. Potamianos, and S. Sridharan, “A unified approach to multi-pose audio-visual ASR,” Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 650–653, Antwerp, Belgium, 2007.
- J. Huang, E. Marcheret, K. Visweswariah, V. Libal, and G. Potamianos, “Detection, diarization, and transcription of far-field lecture speech,” Proc. Conf. Int. Speech Comm. Assoc. (Interspeech), pp. 2161–2164, Antwerp, Belgium, 2007.
- P. Lucey, G. Potamianos, and S. Sridharan, “Pose-invariant audio-visual automatic speech recognition,” Proc. Work. Audio-Visual Speech Process. (AVSP), pp. 176–180, Hilvarenbeek, The Netherlands, 2007.
- A. Tyagi, M. Keck, J.W. Davis, and G. Potamianos, “Kernel-based 3D tracking,” Proc. IEEE Int. Work. Visual Surveillance (VS/CVPR), Minneapolis, Minnesota, 2007.
- A. Tyagi, G. Potamianos, J.W. Davis, and S.M. Chu, “Fusion of multiple camera views for kernel-based 3D tracking,” Proc. IEEE Work. Motion and Video Computing (WMVC), Austin, Texas, 2007.
- E. Marcheret, V. Libal, and G. Potamianos, “Dynamic stream weight modeling for audio-visual speech recognition,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 4, pp. 945–948, Honolulu, Hawaii, 2007.
- J.H. Connell, N. Haas, E. Marcheret, C.V. Neti, and G. Potamianos, Audio-Only Backoff in Audio-Visual Speech Recognition System, Patent No.: US007251603B2, July 31, 2007.
- U.V. Chaudhari, C. Neti, G. Potamianos, and G.N. Ramaswamy, Automated Decision Making Using Time-Varying Stream Reliability Prediction, Patent No.: US007228279B2, June 5, 2007.
2006
- G. Potamianos, “Audio-visual speech recognition,” Short Article, Encyclopedia of Language and Linguistics, Second Edition, (Speech Technology Section – Computer Understanding of Speech), K. Brown (Ed. In Chief), Elsevier, Oxford, United Kingdom, ISBN: 0-08-044299-4, vol. 11, pp. 800–805, 2006.
- G. Potamianos and P. Lucey, “Audio-visual ASR from multiple views inside smart rooms,” Proc. Int. Conf. Multisensor Fusion and Integration for Intelligent Systems (MFI), pp. 35–40, Heidelberg, Germany, 2006.
- Z. Zhang, G. Potamianos, S.M. Chu, J. Tu, and T.S. Huang, “Person tracking in smart rooms using dynamic programming and adaptive subspace learning,” Proc. Int. Conf. Multimedia Expo. (ICME), pp. 2061–2064, Toronto, Canada, 2006.
- P. Lucey and G. Potamianos, “Lipreading using profile versus frontal views,” Proc. IEEE Work. Multimedia Signal Process. (MMSP), pp. 24–28, Victoria, Canada, 2006.
- A.W. Senior, G. Potamianos, S. Chu, Z. Zhang, and A. Hampapur, “A comparison of multicamera person-tracking algorithms,” Proc. IEEE Int. Work. Visual Surveillance (VS/ECCV), Graz, Austria, 2006.
- G. Potamianos and Z. Zhang, “A joint system for single-person 2D-face and 3D-head tracking in CHIL seminars,” Multimodal Technologies for Perception of Humans: First Int. Eval. Work. on Classification of Events, Activities and Relationships, CLEAR 2006, R. Stiefelhagen and J. Garofolo (Eds.), LNCS vol. 4122, pp. 105–118, Southampton, United Kingdom, 2006.
- Z. Zhang, G. Potamianos, M. Liu, and T. Huang, “Robust multi-view multi-camera face detection inside smart rooms using spatio-temporal dynamic programming,” Proc. Int. Conf. Automatic Face and Gesture Recog. (FGR), Southampton, United Kingdom, 2006.
- E. Marcheret, G. Potamianos, K. Visweswariah, and J. Huang, “The IBM RT06s evaluation system for speech activity detection in CHIL seminars,” Proc. RT06s Evaluation Work. – held with Joint Work. on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), S. Renals, S. Bengio, and J.G. Fiscus (Eds.), LNCS vol. 4299, pp. 323–335, Washington DC, 2006.
- J. Huang, M. Westphal, S. Chen, O. Siohan, D. Povey, V. Libal, A. Soneiro, H. Schulz, T. Ross, and G. Potamianos, “The IBM Rich Transcription Spring 2006 speech-to-text system for lecture meetings,” Proc. RT06s Evaluation Work. – held with Joint Work. on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), S. Renals, S. Bengio, and J.G. Fiscus (Eds.), LNCS vol. 4299, pp. 432–443, Washington DC, 2006.
2005
- P.S. Aleksic, G. Potamianos, and A.K. Katsaggelos, “Exploiting visual information in automatic speech processing,” Handbook of Image and Video Processing, Second Edition, A. Bovik (Ed.), ch. 10.8, pp. 1263–1289, Elsevier Academic Press, Burlington, MA, ISBN: 0-12-119792-1, 2005.
- G. Potamianos and P. Scanlon, “Exploiting lower face symmetry in appearance-based automatic speechreading,” Proc. Work. Audio-Visual Speech Process. (AVSP), pp. 79–84, Vancouver Island, Canada, 2005.
- S.M. Chu, E. Marcheret, and G. Potamianos, “Automatic speech recognition and speech activity detection in the CHIL smart room,” Proc. Joint Work. on Multimodal Interaction and Related Machine Learning Algorithms (MLMI), LNCS vol. 3869, pp. 332–343, Edinburgh, United Kingdom, 2005.
- Z. Zhang, G. Potamianos, A. Senior, S. Chu, and T. Huang, “A joint system for person tracking and face detection,” Proc. Int. Work. Human-Computer Interaction (ICCV 2005 Work. on HCI), pp. 47–59, Beijing, China, 2005.
- E. Marcheret, K. Visweswariah, and G. Potamianos, “Speech activity detection fusing acoustic phonetic and energy features,” Proc. Europ. Conf. Speech Comm. Technol. (Interspeech), pp. 241–244, Lisbon, Portugal, 2005.
- J. Jiang, G. Potamianos, and G. Iyengar, “Improved face finding in visually challenging environments,” Proc. Int. Conf. Multimedia Expo. (ICME), Amsterdam, The Netherlands, 2005.
- D. Macho, J. Padrell, A. Abad, C. Nadeu, J. Hernando, J. McDonough, M. Wolfel, U. Klee, M. Omologo, A. Brutti, P. Svaizer, G. Potamianos, and S.M. Chu, “Automatic speech activity detection, source localization, and speech recognition on the CHIL seminar corpus,” Proc. Int. Conf. Multimedia Expo. (ICME), Amsterdam, The Netherlands, 2005.
2004
- J. Huang, G. Potamianos, J. Connell, and C. Neti, “Audio-visual speech recognition using an infrared headset,” Speech Communication, vol. 44, no. 4, pp. 83–96, 2004.
- P. Scanlon, G. Potamianos, V. Libal, and S.M. Chu, “Mutual information based visual feature selection for lipreading,” Proc. Int. Conf. Spoken Lang. Process. (ICSLP), Jeju Island, Korea, 2004.
- E. Marcheret, S.M. Chu, V. Goel, and G. Potamianos, “Efficient likelihood computation in multi-stream HMM based audio-visual speech recognition,” Proc. Int. Conf. Spoken Lang. Process. (ICSLP), Jeju Island, Korea, 2004.
- G. Potamianos, C. Neti, J. Huang, J.H. Connell, S. Chu, V. Libal, E. Marcheret, N. Haas, and J. Jiang, “Towards practical deployment of audio-visual speech recognition,” Invited, Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 3, pp. 777–780, Montreal, Canada, 2004.
- J. Jiang, G. Potamianos, H. Nock, G. Iyengar, and C. Neti, “Improved face and feature finding for audio-visual speech recognition in visually challenging environments,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 5, pp. 873–876, Montreal, Canada, 2004.
- S.M. Chu, V. Libal, E. Marcheret, C. Neti, and G. Potamianos, “Multistage information fusion for audio-visual speech recognition,” Proc. Int. Conf. Multimedia Expo. (ICME), Taipei, Taiwan, 2004.
- P. de Cuetos, G.R. Iyengar, C.V. Neti, and G. Potamianos, System and Method for Microphone Activation Using Visual Speech Cues, Patent No.: US006754373B1, June 22, 2004.
2003
- G. Potamianos, C. Neti, G. Gravier, A. Garg, and A.W. Senior, “Recent advances in the automatic recognition of audio-visual speech,” Invited, Proceedings of the IEEE, vol. 91, no. 9, pp. 1306–1326, 2003.
- G. Potamianos, C. Neti, and S. Deligne, “Joint audio-visual speech processing for recognition and enhancement,” Proc. Work. Audio-Visual Speech Process. (AVSP), pp. 95–104, St. Jorioz, France, 2003.
- J. Huang, G. Potamianos, and C. Neti, “Improving audio-visual speech recognition with an infrared headset,” Proc. Work. Audio-Visual Speech Process. (AVSP), pp. 175–178, St. Jorioz, France, 2003.
- G. Potamianos and C. Neti, “Audio-visual speech recognition in challenging environments,” Proc. Europ. Conf. Speech Comm. Technol. (Eurospeech), pp. 1293–1296, Geneva, Switzerland, 2003.
- J.H. Connell, N. Haas, E. Marcheret, C. Neti, G. Potamianos, and S. Velipasalar, “A real-time prototype for small-vocabulary audio-visual ASR,” Proc. Int. Conf. Multimedia Expo. (ICME), vol. II, pp. 469–472, Baltimore, MD, 2003.
- U.V. Chaudhari, G.N. Ramaswamy, G. Potamianos, and C. Neti, “Information fusion and decision cascading for audio-visual speaker recognition based on time varying stream reliability prediction,” Proc. Int. Conf. Multimedia Expo. (ICME), pp. 9–12, Baltimore, MD, 2003.
- A. Garg, G. Potamianos, C. Neti, and T.S. Huang, “Frame-dependent multi-stream reliability indicators for audio-visual speech recognition,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. I, pp. 24–27, Hong Kong, China, 2003.
- U.V. Chaudhari, G.N. Ramaswamy, G. Potamianos, and C. Neti, “Audio-visual speaker recognition using time-varying stream reliability prediction,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. V, pp. 712–715, Hong Kong, China, 2003.
- E. Cosatto, H.P. Graf, G. Potamianos, and J. Schroeter, Audio-Visual Selection Process for the Synthesis of Photo-Realistic Talking-Head Animations, Patent No.: US006654018B1, Nov. 25, 2003.
2002
- C. Neti, G. Potamianos, J. Luettin, and E. Vatikiotis-Bateson, “Editorial of the special issue on joint audio-visual speech processing,” EURASIP Journal on Applied Signal Processing, vol. 2002, no. 11, 2002.
- S. Deligne, G. Potamianos, and C. Neti, “Audio-visual speech enhancement with AVCDCN (audio-visual codebook dependent cepstral normalization),” Proc. Int. Conf. Spoken Lang. Process. (ICSLP), vol. 3, pp. 1449–1452, Denver, CO, 2002.
- R. Goecke, G. Potamianos, and C. Neti, “Noisy audio feature enhancement using audio-visual speech data,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 2025–2028, Orlando, FL, 2002.
- G. Gravier, S. Axelrod, G. Potamianos, and C. Neti, “Maximum entropy and MCE based HMM stream weight estimation for audio-visual ASR,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), pp. 853–856, Orlando, FL, 2002.
- G. Gravier, G. Potamianos, and C. Neti, “Asynchrony modeling for audio-visual speech recognition,” Proc. Human Lang. Techn. Conf. (HLT), pp. 1–6, San Diego, CA, 2002.
2001
- G. Potamianos, C. Neti, G. Iyengar, A.W. Senior, and A. Verma, “A cascade visual front end for speaker independent automatic speechreading,” International Journal of Speech Technology, Special Issue on Multimedia, vol. 4, pp. 193–208, 2001.
- G. Potamianos, C. Neti, G. Iyengar, and E. Helmuth, “Large-vocabulary audio-visual speech recognition by machines and humans,” Proc. Europ. Conf. Speech Comm. Technol. (Eurospeech), pp. 1027–1030, Aalborg, Denmark, 2001.
- G. Potamianos and C. Neti, “Automatic speechreading of impaired speech,” Proc. Work. Audio-Visual Speech Process. (AVSP), pp. 177–182, Aalborg, Denmark, 2001.
- G. Potamianos and C. Neti, “Improved ROI and within frame discriminant features for lipreading,” Proc. Int. Conf. Image Process. (ICIP), pp. 250–253, Thessaloniki, Greece, 2001.
- C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, and D. Vergyri, “Large-vocabulary audio-visual speech recognition: A summary of the Johns Hopkins Summer 2000 Workshop,” Proc. IEEE Work. Multimedia Signal Process. (MMSP), pp. 619–624, Cannes, France, 2001.
- G. Iyengar, G. Potamianos, C. Neti, T. Faruquie, and A. Verma, “Robust detection of visual ROI for automatic speechreading,” Proc. IEEE Work. Multimedia Signal Process. (MMSP), pp. 79–84, Cannes, France, 2001.
- I. Matthews, G. Potamianos, C. Neti, and J. Luettin, “A comparison of model and transform-based visual features for audio-visual LVCSR,” Proc. Int. Conf. Multimedia Expo. (ICME), Tokyo, Japan, 2001.
- G. Potamianos, J. Luettin, and C. Neti, “Hierarchical discriminant features for audio-visual LVCSR,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 1, pp. 165–168, Salt Lake City, UT, 2001.
- J. Luettin, G. Potamianos, and C. Neti, “Asynchronous stream modeling for large-vocabulary audio-visual speech recognition,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 1, pp. 169–172, Salt Lake City, UT, 2001.
- H. Glotin, D. Vergyri, C. Neti, G. Potamianos, and J. Luettin, “Weighting schemes for audio-visual fusion in speech recognition,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), vol. 1, pp. 173–176, Salt Lake City, UT, 2001.
2000
- G. Potamianos and C. Neti, “Stream confidence estimation for audio-visual speech recognition,” Proc. Int. Conf. Spoken Language Process. (ICSLP), vol. III, pp. 746–749, Beijing, China, 2000.
- C. Neti, G. Iyengar, G. Potamianos, A. Senior, and B. Maison, “Perceptual interfaces for information interaction: Joint processing of audio and visual information for human-computer interaction,” Proc. Int. Conf. Spoken Language Process. (ICSLP), vol. III, pp. 11–14, Beijing, China, 2000.
- G. Potamianos, A. Verma, C. Neti, G. Iyengar, and S. Basu, “A cascade image transform for speaker independent automatic speechreading,” Proc. IEEE Int. Conf. Multimedia Expo. (ICME), vol. II, pp. 1097–1100, New York, NY, 2000.
- E. Cosatto, G. Potamianos, and H.P. Graf, “Audio-visual unit selection for the synthesis of photo-realistic talking-heads,” Proc. IEEE Int. Conf. Multimedia Expo. (ICME), vol. II, pp. 619–622, New York, NY, 2000.
- E. Cosatto, H.P. Graf, and G. Potamianos, Robust Multi-Modal Method for Recognizing Objects, Patent No.: US006118887A, Sep. 12, 2000.
Past Millennium
- G. Potamianos and A. Potamianos, “Speaker adaptation for audio-visual automatic speech recognition,” Proc. Europ. Conf. Speech Comm. Technol. (Eurospeech), Budapest, Hungary, vol. 3, pp. 1291–1294, 1999.
- G. Potamianos and F. Jelinek, “A study of n-gram and decision tree letter language modeling methods,” Speech Communication, vol. 24, no. 3, pp. 171–192, 1998.
- G. Potamianos and H.P. Graf, “Linear discriminant analysis for speechreading,” Proc. IEEE Work. Multimedia Signal Process. (MMSP), Los Angeles, CA, pp. 221–226, 1998.
- G. Potamianos, H.P. Graf, and E. Cosatto, “An image transform approach for HMM based automatic lipreading,” Proc. Int. Conf. Image Process. (ICIP), Chicago, IL, pp. 173–177, 1998.
- G. Potamianos and H.P. Graf, “Discriminative training of HMM stream exponents for audio-visual speech recognition,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), Seattle, WA, vol. 6, pp. 3733–3736, 1998.
- H.P. Graf, E. Cosatto, and G. Potamianos, “Machine vision of faces and facial features,” Proc. R.I.E.C. Int. Symp. Design Archit. Inform. Process. Systems Based Brain Inform. Princ., Sendai, Japan, pp. 48–53, 1998.
- G. Potamianos and J. Goutsias, “Stochastic approximation algorithms for partition function estimation of Gibbs random fields,” IEEE Transactions on Information Theory, vol. 43, no. 6, pp. 1948–1965, 1997.
- G. Potamianos, E. Cosatto, H.P. Graf, and D.B. Roe, “Speaker independent audio-visual database for bimodal ASR,” Proc. Europ. Tutorial Research Work. Audio-Visual Speech Process. (AVSP), Rhodes, Greece, pp. 65–68, 1997.
- H.P. Graf, E. Cosatto, and G. Potamianos, “Robust recognition of faces and facial features with a multi-modal system,” Proc. Int. Conf. Systems Man Cybern. (ICSMC), Orlando, FL, pp. 2034–2039, 1997.
- G. Potamianos, “Efficient Monte Carlo estimation of partition function ratios of Markov random field images,” Proc. Conf. Inform. Sci. Systems (CISS), Princeton, NJ, vol. II, pp. 1212–1215, 1996.
- G. Potamianos and J. Goutsias, “A unified approach to Monte Carlo likelihood estimation of Gibbs random field images,” Proc. Conf. Inform. Sci. Systems (CISS), Princeton, NJ, vol. I, pp. 84–90, 1994.
- G.G. Potamianos and J. Goutsias, “Partition function estimation of Gibbs random field images using Monte Carlo simulations,” IEEE Transactions on Information Theory, vol. 39, no. 4, pp. 1322–1332, 1993.
- G. Potamianos and J. Goutsias, “An analysis of Monte Carlo methods for likelihood estimation of Gibbsian images,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), Minneapolis, MN, vol. V, pp. 519–522, 1993.
- G. Potamianos and J. Goutsias, “On computing the likelihood function of partially observed Markov random field images using Monte Carlo simulations,” Proc. Conf. Inform. Sci. Systems (CISS), Princeton, NJ, vol. I, pp. 357–362, 1992.
- G. Potamianos and J. Goutsias, “A novel method for computing the partition function of Markov random field images using Monte Carlo simulations,” Proc. Int. Conf. Acoust. Speech Signal Process. (ICASSP), Toronto, Canada, vol. 4, pp. 2325–2328, 1991.
- G. Potamianos and J. Diamessis, “Frequency sampling design of 2-D IIR filters using continued fractions,” Proc. Int. Symp. Circuits Systems (ISCAS), New Orleans, LA, pp. 2454–2457, 1990.
- J. Diamessis and G. Potamianos, “A novel method for designing IIR filters with nonuniform samples,” Proc. Conf. Inform. Sci. Systems (CISS), Princeton, NJ, vol. 1, pp. 192–195, 1990.
- J. Diamessis and G. Potamianos, “Modeling unequally spaced 2-D discrete signals by rational functions,” Proc. Int. Symp. Circuits Systems (ISCAS), Portland, OR, pp. 1508–1511, 1989.
[Last update: 04/04/2024]
Θερμός Σπυρίδων
Supervisor | Gerasimos Potamianos, Associate Professor |
---|---|
PhD Thesis Title | 2D/3D Object Feature Recognition with Novel Deep Learning and Interaction Technologies |
Year | 2020 |
Παπαδημητρίου Αικατερίνη
Supervisor | Gerasimos Potamianos, Associate Professor |
---|---|
PhD Thesis Title | Automatic Sign Language Recognition Algorithms and Their Integration into a Greek Sign Language Educational Platform |
Year | 2024 |
Κουμπαρούλης Αλέξανδρος
Supervisor | Gerasimos Potamianos, Associate Professor |
---|---|
Thesis Topic | Audio-Visual Speech Processing in Unconstrained Environments |
Email | alkoumpa@e-ce.uth.gr |
Τσικρικώνης Νικόλαος
Supervisor | Gerasimos Potamianos, Associate Professor |
---|---|
Thesis Topic | Study of a Beam-Steering Microstrip Antenna on a Randomly Rough Surface |
Email | ntsikrikonis@e-ce.uth.gr |