ECE415 High Performance Computing Systems

Scientific Responsible	Stamoulis Georgios, Professor E-mail: georges@uth.gr
Title	Hellenic Chips Competence Centre
Funding Agency	Chips Joint Undertaking
Budget	326.350,00
Duration	01/06/2025 – 31/05/2029

Scientific Responsible	Plessas Fotios, Professor E-mail: fplessas@uth.gr
Title	Analog Design, Testing and Verification
Funding Agency	NanoZeta Technologies ltd.
Budget	271.400,00
Duration	26/01/2021 – 25/01/2028

Scientific Responsible	Korakis Athanasios, Professor E-mail: korakis@uth.gr
Title	DIGITAfrica: Towards a comprehensive pan-African research infrastructure in Digital Sciences
Funding Agency	European Union
Budget	123.125,00
Duration	16/12/2024 – 31/12/2027

Department of Electrical and Computer Engineering
Sekeri – Cheiden Str Pedion Areos, ECE Building 383 34 Volos – Greece
Tel.	+30 24210 74967, +30 24210 74934
e-mail	gece ΑΤ uth.gr
PGS Tel.	+30 24210 74933
PGS e-mail	pgsec ΑΤ uth.gr
URL	https://www.e-ce.uth.gr/contact-info/?lang=en

Subject Area	Software and Information System Engineering
Semester	Semester 7 – Fall
Type	Elective
Teaching Hours	4
ECTS	6
Prerequisites	ECE318 Operating Systems
Course Site	https://eclass.uth.gr/courses/E-CE_U_180/
Course Director	Christos Antonopoulos, Professor E-mail: cda@uth.gr
Course Instructor	Christos Antonopoulos, Professor E-mail: cda@uth.gr

Description
Learning Outcomes

The course covers programming techniques for parallel systems, specifically multicore and manycore processors. It spans the programming of conventional and non-conventional, homogeneous and heterogeneous architectures. Students are introduced to performance measurement and estimation techniques, application profiling, experimental performance evaluation, experimental evaluation of software/hardware interaction, and optimization techniques.

The course is complemented by a series of homework assignments that allow students to apply in practice the methods and techniques discussed in class.

The main course topics are the following:

Introduction: technical and economic reasons that led to the de facto prevalence of multicore systems, grand-challenge applications.
Main metrics, Amdahl’s law, Karp-Flatt metric, Gustafson-Barsis law.
Elements of parallel computer architectures, parallel systems taxonomies, typical conventional and non-conventional architectures.
Methodologies for the experimental evaluation of the performance of parallel applications on multicore systems and of their interaction with hardware.
Patterns in parallel computing: parallelism extraction, algorithmic structure, data structures, implementation mechanisms.
Programming models for multi- and many-core systems (OpenMP, Intel Threading Building Blocks, Cilk, OpenCL).
GPU programming. The CUDA programming model.
Software interaction with the underlying memory architecture, effective use of caches, data prefetching, communication/computation overlap. The CUDA memory model.
Performance optimization on GPUs. Floating-point issues: accuracy and the accuracy/performance tradeoff.
CUDA applications case studies: MRI reconstruction, molecular visualization and analysis.
Synchronization implementation techniques (locks, barriers) and their interaction with hardware, alternative synchronization methods (fine-grained, speculative, lazy, non-blocking), transactional memory.
Performance optimization techniques on a single core (branches, efficient use of the cache hierarchy, loop manipulation, slow instructions and lookup tables).
Vectorization techniques, data alignment, automatic vectorization.
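
As a brief illustration of the main metrics listed above, the following Python sketch computes Amdahl's law, the Gustafson-Barsis law, and the Karp-Flatt metric. The function and parameter names are illustrative assumptions rather than course material: f is the serial fraction of the sequential execution, s is the serial fraction of the parallel execution, and p is the number of processors.

```python
def amdahl_speedup(f: float, p: int) -> float:
    """Amdahl's law: speedup bound for a fixed problem size.

    f is the fraction of the sequential run that cannot be parallelized.
    """
    return 1.0 / (f + (1.0 - f) / p)


def gustafson_barsis_speedup(s: float, p: int) -> float:
    """Gustafson-Barsis law: scaled speedup when the problem grows with p.

    s is the fraction of the parallel run spent in serial code.
    """
    return p + (1 - p) * s


def karp_flatt(speedup: float, p: int) -> float:
    """Karp-Flatt metric: experimentally determined serial fraction.

    Computed from a measured speedup on p > 1 processors.
    """
    return (1.0 / speedup - 1.0 / p) / (1.0 - 1.0 / p)


if __name__ == "__main__":
    # Hypothetical example values: 10% serial work, 8 processors.
    measured = amdahl_speedup(0.1, 8)
    print(f"Amdahl speedup:           {measured:.3f}")
    print(f"Gustafson-Barsis speedup: {gustafson_barsis_speedup(0.1, 8):.3f}")
    print(f"Karp-Flatt fraction:      {karp_flatt(measured, 8):.3f}")
```

Feeding the speedup predicted by Amdahl's law into the Karp-Flatt metric recovers the original serial fraction. With measured speedups, a Karp-Flatt value that grows with p usually points to parallel overhead rather than to inherently serial code.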

After successfully fulfilling the requirements of the course, students are capable of:

Knowing the main parallel computing architectures.
Knowing the basic steps required to develop parallel software and applying them on real codes.
Understanding the interaction of software with the underlying hardware and applying it to better map code on the underlying architecture.
Knowing the architecture of basic services used by parallel codes (for example, different synchronization methods) and choosing the best algorithm / implementation according to the characteristics of their code and of the underlying architecture.
Developing code for conventional and non-conventional multi- and many-core architectures, using the appropriate programming model for each case.
Analyzing code performance using the respective tools and exploiting the results of the analysis to optimize the code.
Quantifying code performance both at a macroscopic level (execution time) and at a lower level (interaction with hardware), using scientifically sound methodologies, and documenting their observations in a technical report.
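
One possible reading of quantifying execution time with a sound methodology is to repeat each measurement and report summary statistics instead of a single run. The Python sketch below illustrates that idea under stated assumptions: the measure helper, its parameters, and the chosen statistics are hypothetical and are not taken from the course material.

```python
import statistics
import time


def measure(fn, repeats: int = 5, warmup: int = 1) -> dict:
    """Time fn() several times and summarize the samples (in seconds)."""
    # Warm-up runs are discarded so that one-off costs do not skew the samples.
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    return {
        "samples": samples,
        "min": min(samples),
        "median": statistics.median(samples),
        # The sample standard deviation needs at least two measurements.
        "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical workload used only to exercise the helper.
    result = measure(lambda: sum(i * i for i in range(100_000)))
    print(f"median {result['median']:.6f} s, stdev {result['stdev']:.6f} s")
```

Reporting the median together with the spread makes it easier to tell a real performance change from run-to-run noise.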
