Special Event: Non-functional properties – Design exploration: energy, power, reliability

within the National Research Centre in High Performance Computing, Big Data and Quantum Computing, Spoke 1

Project Abstract

Modern HPC systems must be designed with several parameters in mind, including cost, performance, and throughput, as well as non-functional properties such as power/energy consumption and reliability. This project focuses on advanced design and monitoring techniques for devising energy- and power-efficient, reliable parallel architectures based on open standards (e.g., RISC-V) and design space exploration techniques and tools.
Reliability evaluation and enhancement, power and energy monitoring and management, thermal/power modeling, and control techniques are being explored. The work also covers programming models and tools for energy efficiency, portability and performance portability, runtime resource management, autotuning mechanisms, and energy-aware hardware-dependent software.

Talks

Talk 1: “Overall architecture and organization of the project”
Presenter: Matteo Sonza Reorda, Politecnico di Torino
Talk 2: “Hardening HPC code against single event upset”
Presenter: William Fornaciari, Politecnico di Milano
Talk 3: “SlackCheck: Verification of temporal constraints in Linux at run-time”
Presenter: Enrico Bini, Università di Torino
Talk 4: “Microprocessors’ Reliability Challenges due to Faults Affecting their Caches”
Presenter: Cecilia Metra, Università di Bologna

Poster Session

Following the talks, there will be a poster session featuring nine posters, one from each partner institution. The posters will showcase current research activities and results related to non-functional properties in HPC systems.

  • Politecnico di Torino: “A Framework for Fine-grain Error Modeling in GPU applications”
  • Politecnico di Milano: “Automatic SIHFT injection”
  • Università di Torino: “SlackCheck: Verification of temporal constraints in Linux at run-time”
  • Università di Pisa: “Efficient Stream Processing on Resource-Constrained Devices”
  • Università di Bologna: “Reliability of memories and power management strategies”
  • Università di Padova: “Optimization of HPC resources for Monte Carlo simulations tuning with machine learning”
  • Università di Napoli: “A recipe book for energy-aware application”
  • CINECA: “Examon”
  • ENEA: “Direct Two-Phase Cooling for exascale HPC systems: experimental testbeds”

Partners

  • Politecnico di Torino
  • Politecnico di Milano
  • Università di Torino
  • Università di Pisa
  • Università di Bologna
  • Università di Padova
  • Università di Napoli
  • CINECA
  • ENEA