List of Speakers

Four international and two local speakers will present their latest research.

 

Cristina Silvano

Politecnico di Milano

Title
Design Space and Application Autotuning for Runtime Adaptivity in Multicore Architectures

Abstract
Given the increasing complexity of manycore architectures, a wide range of architecture parameters must be tuned at design time to find the best tradeoffs among multiple metrics such as energy and delay. Because the design space of manycore architectures is huge, automatic design space exploration is necessary to systematically support, at design time, the exploration and comparison of design alternatives in terms of multiple competing objectives. At runtime, manycore architectures offer a set of resources that can be assigned and managed dynamically to deliver a specified Quality of Service. Applications can expose to the runtime a set of software knobs (including application parameters, code transformations, and code variants) to trade off Quality of Results against Throughput. Resource management and application autotuning are key to enabling computing systems to operate close to optimal efficiency by adjusting their behavior in the face of changing conditions, operating environments, usage contexts and resource availability while meeting requirements on energy efficiency and Quality of Service.

This talk will present multi-objective design space exploration (DSE) techniques for manycore architectures. The key techniques include sampling and optimization techniques for finding Pareto points and Design of Experiments (DoE) techniques for identifying the experimentation plan. Machine learning techniques can be used to predict the system behavior based on the training data generated by DoE. The talk also presents an application autotuning framework to tune the software knobs in an adaptive multi-application scenario. To support this scenario, where different applications run concurrently on the same platform, the system resources should be assigned efficiently to the active applications and managed at runtime. The approach exploits the concept of orthogonality between application autotuning and runtime management of system resources to support multiple adaptive applications. Overall, the main challenge is to combine design-time and runtime concepts into an effective form of “self-aware” computing.

Bio
Cristina Silvano is an Associate Professor (with tenure) of Computer Engineering at the Politecnico di Milano. She received her MS degree (Laurea) in Electrical Engineering from Politecnico di Milano in 1987. From 1987 to 1996, she was Senior Design Engineer at the R&D Labs of Group Bull in Pregnana Milanese (Italy) and Visiting Engineer at Bull R&D Labs in Billerica (US) (1988-89) and at IBM Somerset Design Center, Austin (US) (1993-1994). She received her Ph.D. in Computer Engineering from the University of Brescia in 1999.

She was Assistant Professor of Computer Science at the University of Milano (2000-2002) and then Associate Professor at the Politecnico di Milano (2002-present). Her primary research interests focus on computer architectures and electronic design automation, with particular emphasis on power-aware design for embedded systems, design space exploration and runtime resource management for manycore architectures. Her research has been funded by several national and international projects. In particular, she was Principal Investigator of several industry-funded research projects in collaboration with STMicroelectronics.

She is currently Project Coordinator for the H2020-FET-HPC ANTAREX European project on autotuning and adaptivity for energy-efficient Exascale High Performance Computing systems. She has published more than 140 papers in premier international journals and conferences. She was co-editor of two scientific books published by Springer in 2010 and 2011. In 2017, she was named an IEEE Fellow for her contributions to "energy-efficient computer architectures".


Alexandra Jimborean

Uppsala University

Title
Decoupled access-execute: Pioneering compilation for energy-efficiency

Abstract
Energy efficiency is essential for performance, to prevent thermal hazards, and to enable more simultaneously active cores. This talk presents software decoupled access-execute (DAE), a compilation technique we pioneered for improving energy efficiency. The compiler decouples the code into coarse-grain memory-bound and compute-bound phases to re-enable hardware capabilities for energy management, to control data communication vs. data processing, and to enhance memory- and instruction-level parallelism. The end result is 25% energy savings, on average, for memory-bound applications, with negligible impact on performance.

Bio 
Alexandra Jimborean is Assistant Professor at Uppsala University. Her main research interests are compile-time and run-time code analysis and optimization, and software-hardware co-designs for performance and energy-efficiency. She holds a Ph.D. from University of Strasbourg (France) for her work on automatic speculative parallelization.
Alexandra has received over 25 distinctions, awards, and grants, most notably the Google Anita Borg Memorial Award for excellence in academia.


Filippo Mantovani

Barcelona Supercomputing Center (BSC)

Title
Power monitoring on ARM-based HPC clusters: experiences from young and old

Abstract
Since 2011 the EU Mont-Blanc project has been pushing the development of ARM-based compute platforms, following the vision of leveraging the fast-growing market of mobile technology for scientific computation. The process started almost six years ago with the development of prototypes based on Android dev-kits and is now evolving beyond the research project, towards commercial computational platforms based not only on mobile SoCs, but also on server and HPC technology.
In this talk I will introduce the experience gained developing prototypes based on ARM technology within the Mont-Blanc project. Special attention will be given to the power monitoring infrastructures deployed on ARMv7 and ARMv8 clusters, looking at power consumption data as the first step towards energy efficiency on existing systems.
The goal of the talk is to give a panoramic view of ARM-based scientific computing, supported by experience, lessons learned and test results.

Bio
Filippo Mantovani is a postdoctoral research associate of the Mobile and embedded-based HPC group at the Barcelona Supercomputing Center (BSC). He graduated in mathematics and holds a PhD in Computer Science from the University of Ferrara, Italy. He has been a scientific associate at the DESY laboratory in Zeuthen, Germany, and at the University of Regensburg, Germany. He has spent most of his scientific career in computational physics, computer architecture and high-performance computing, contributing to the Janus, QPACE and QPACE2 projects. He joined BSC’s Mont-Blanc project in 2013 and became principal investigator of the project in 2014.


Ana Lucia Varbanescu

University of Amsterdam

Title
Heterogeneous computing with accelerators

Abstract
Accelerators (like GPUs or FPGAs) have been at the center of attention for their promise of performance and energy efficiency. Yet heterogeneous computing, that is, using *both* the host and the accelerator(s) to improve performance and energy efficiency of a given application, is studied much less. In this talk, we present the advantages and challenges of heterogeneous computing, and sketch the landscape of the tools and applications that can benefit from it. Finally, we provide a few compelling case studies combining applications, methods, and tools that can be used to gain performance and/or energy efficiency using heterogeneous computing.

Bio
Ana Lucia Varbanescu graduated from Politehnica University Bucharest, Romania. In 2010 she obtained her PhD from TU Delft, the Netherlands. She is currently a MacGillavry Fellow and Assistant Professor at the University of Amsterdam. She has been a visiting researcher at IBM TJ Watson, NVIDIA, Barcelona Supercomputing Center, and Imperial College London. Her research interests are in modern HPC applications and platforms, with a special focus on heterogeneous computing and irregular applications.


Frank Alexander Kraemer

Norwegian University of Science and Technology, NTNU

Title
Autonomous Adaptive Sensing for Energy-Efficient IoT Applications

Abstract
Wireless sensor devices for IoT applications are often configured in a very simple way, resorting, for instance, to uniform sampling frequencies and fixed duty cycles. This, however, can be problematic since the environments of sensor nodes are usually heterogeneous and non-stationary. Manually optimizing sensor devices is not possible due to the scale of the systems, which leads many developers to over-dimension the system or accept non-optimal behavior. In the ART project, we explore instead how sensor devices can utilize various machine learning techniques to optimize their behavior over time. This includes predicting their environment and how it affects, for instance, energy harvesting opportunities. It also includes deciding when acquired data is actually of value for the application, i.e., whether the sensed data provides new information. In this talk, I am going to present the problem domain of autonomous adaptive sensing and show some preliminary results from our sensor test bed.

Bio
Frank Alexander Kraemer has a Dipl.-Ing. degree in Electrical Engineering and an M.Sc. in Information Technology from the University of Stuttgart, Germany. He received his Ph.D. on model-driven development of systems from the Department of Telematics at NTNU, Norwegian University of Science and Technology, in 2008. He is now Associate Professor at the Department of Information Security and Communication Technology at NTNU. His current research interests include architecture and development of Internet of Things applications, as well as adaptive and autonomic sensor systems and their application in various domains. He is currently the coordinator of the NTNU Internet of Things lab.


Yaman Umuroglu

Norwegian University of Science and Technology, NTNU

Title
FINN: A Framework for Fast, Scalable Binarized Neural Network Inference

Abstract
Research has shown that convolutional neural networks contain significant redundancy, and high classification accuracy can be obtained even when weights and activations are reduced from floating point to binary values. These Binarized Neural Networks (BNNs) are particularly well suited to reconfigurable logic devices, which contain an abundance of fine-grained compute resources and can result in smaller, lower-power implementations, or conversely in higher classification rates. This talk will present FINN, a framework for building fast and flexible FPGA accelerators for BNNs using a heterogeneous streaming architecture. FINN-generated accelerators can perform accurate image classification at unprecedented speeds. Key results include classification of handwritten digits with 95.8% accuracy at 12.3 million frames per second and classification of CIFAR-10 images with 80.1% accuracy at 22 thousand frames per second, both on an embedded FPGA platform drawing less than 25 W total system power. Preliminary results on deep neural networks using two- and three-bit quantization on the ImageNet dataset, and on how to deploy these networks on commodity processors, will also be presented.

Bio
Yaman Umuroglu is a fourth-year Ph.D. student supervised by Assoc. Prof. Magnus Jahre at the Norwegian University of Science and Technology, and a visiting researcher at Xilinx Research Labs in Dublin, Ireland. He holds a joint European MSc degree in embedded systems from the EMECS programme, and a BSc in Computer Engineering from Middle East Technical University. His current research focuses on using reconfigurable logic for accelerating quantized deep learning and for efficient execution of large, sparse problems.