Background and activities
I am involved in teaching the following courses:
- TDT4255 Computer Design
- TDT01 Architecture of Computing Systems
- DT8123 Advanced Computing
I also supervise project and master's thesis topics within computer architecture and design. Current project and master's thesis topics are available on IDI's web pages. I often co-supervise projects and master's theses with local and national industry partners such as ARM, Nordic Semiconductor and Silicon Labs (formerly Energy Micro).
The topics and reports of my supervised master's theses can be found on NTNU Open.
The main goal of my research is to contribute to designing faster and more energy-efficient computers. More specifically, I investigate how computer hardware can specialize to the current application – to improve efficiency – while retaining sufficient generality to be efficient across diverse applications – to enable reuse. I am affiliated with NTNU's Computer Architecture Lab (CAL) and currently serve as the deputy head of the Computing research group.
I am currently involved in the following research projects:
- Project Manager and PI of the Balancing Compute and Memory Performance in Reconfigurable Accelerators with Analytical Modeling (BAMPAM) project. BAMPAM is a young research talents project funded by the Norwegian Research Council program IKTPLUSS.
- Scientific advisor in the Boosting Widening Digital Innovation Hubs (BOWI) Horizon 2020 project.
We recently completed the TULIPP Horizon 2020 project; see our recently published book for more details. A key contribution is the STHEM utilities, which make it possible to non-intrusively establish the energy consumption of source code constructs such as procedures and loops [video]. To make full use of STHEM, you need the Lynsyn power measurement unit, which can be bought from Sundance.
I currently supervise/mentor the following PhD students and post docs:
- Fatemeh Ghasemi (main supervisor)
- Björn Gottschall (main supervisor)
- Lukas Liedtke (main supervisor)
- Joseph Rogers (main supervisor)
- Truls Asheim (co-supervisor)
- Anders Gaustad (co-supervisor)
- Amund Bergsland Kvalsvik (co-supervisor)
- David Metz (co-supervisor)
I have supervised/mentored the following PhD students and post docs:
- Main supervisor for Dr. Nico Reissmann (2012-2019), next employer NTNU IT
- Mentor for Dr. Asbjørn Djupdal (2013-2019), next employer NTNU
- Main supervisor for Dr. Yaman Umuroglu (2012-2018), next employer Xilinx
- Co-supervisor for Dr. Yahya Yassin (2012-2018), next employer Mode Sensors
- Mentor for Dr. Ananya Muddukrishna (2016-2018), next employer ÅF
- Mentor for Dr. Mohammed Sourori (2015-2017), next employer Accenture
- Co-supervisor for Dr. Odd Rune Strømmen Lykkebø (2012-2017), next employer Nnaisense
- Mentor for Post doc. Dr. Juan Manuel Cebrian (2012-2014), next employer UPC/BSC
- Mentor for Post doc. Dr. Nikita Nikitin (2013-2014), next employer Mentor Graphics
- Informal co-supervisor for Dr. Alexandru Ciprian Iordan (2008-2017), next employer ARM
Scientific, academic and artistic work
Displaying a selection of activities. See all publications in the database
- (2022) Delegated Replies: Alleviating Network Clogging in Heterogeneous Architectures. IEEE Symposium on High-Performance Computer Architecture (HPCA).
- (2021) Modeling Periodic Energy-Harvesting Computing Systems. IEEE Computer Architecture Letters. vol. 20 (2).
- (2021) TIP: Time-Proportional Instruction Profiling. MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture.
- (2021) Towards Ubiquitous Low-power Image Processing Platforms. Springer Nature. 2021. ISBN 9783030535315.
- (2020) MDM: The GPU Memory Divergence Model. MICRO'53: Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture.
- (2020) HSM: A Hybrid Slowdown Model for Multitasking GPUs. ASPLOS'20: Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems.
- (2020) Selective Replication in Memory-Side GPU Caches. MICRO'53: Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture.
- (2019) DCMI: A Scalable Strategy for Accelerating Iterative Stencil Loops on FPGAs. ACM Transactions on Architecture and Code Optimization (TACO). vol. 16 (4).
- (2019) Modeling Emerging Memory-Divergent GPU Applications. IEEE Computer Architecture Letters. vol. 18 (2).
- (2018) GDP: Using Dataflow Properties to Accurately Estimate Interference-Free Performance at Runtime. IEEE Symposium on High-Performance Computer Architecture (HPCA). vol. 2018-February.
- (2018) Get Out of the Valley: Power-Efficient Address Mapping for GPUs. International Symposium on Computer Architecture.
- (2017) FINN: A Framework for Fast, Scalable Binarized Neural Network Inference. Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays.
- (2016) Efficient control flow restructuring for GPUs. International Conference on High Performance Computing & Simulation (HPCS).
- (2016) Random access schemes for efficient FPGA SpMV acceleration. Microprocessors and Microsystems. vol. 47B.
- (2015) Perfect Reconstructability of Control Flow from Demand Dependence Graphs. ACM Transactions on Architecture and Code Optimization (TACO). vol. 11 (4).
- (2015) ParVec: vectorizing the PARSEC benchmark suite. Computing. vol. 97 (11).
- (2015) Hybrid Breadth-First Search on a Single-Chip FPGA-CPU Heterogeneous Platform. 25th International Conference on Field Programmable Logic and Applications, FPL 2015, London, United Kingdom, September 2-4, 2015.
- (2014) Optimized Hardware for Suboptimal Software: The Case for SIMD-aware Benchmarks. IEEE International Symposium on Performance Analysis of Systems and Software.
- (2014) Graph-based Performance Accounting for Chip Multiprocessor Memory Systems. Proceedings of the 23rd International Conference on Parallel Architectures and Compilation Techniques (PACT).
- (2014) An Energy Efficient Column-Major Backend for FPGA SpMV Accelerators. 2014 32nd IEEE International Conference on Computer Design (ICCD).
- (2010) Multi-level Hardware Prefetching Using Low Complexity Delta Correlating Prediction Tables with Partial Matching. Lecture Notes in Computer Science (LNCS). vol. 5 (1).
- (2010) DIEF: An Accurate Interference Feedback Mechanism for Chip Multiprocessor Memory Systems. Lecture Notes in Computer Science (LNCS).
- (2009) A Quantitative Study of Memory System Interference in Chip Multiprocessor Architectures. 11th IEEE International Conference on High Performance Computing and Communications (HPCC 2009).
- (2009) A Light-Weight Fairness Mechanism for Chip Multiprocessor Memory Systems. Proceedings of the 6th ACM Conference on Computing Frontiers.