ParVec: Vectorized PARSEC Benchmarks
In computer science research, benchmarking is the standard method for conducting experiments. Researchers select a set of benchmarks that represent applications of interest and study them in detail to reach general conclusions that can then be applied to real computer systems. The selected workloads must be general enough to cover a wide range of software applications; otherwise, the results obtained will be of very limited validity.
As technology moves forward, benchmarks should be extended to cover new hardware features; otherwise, researchers may over- or underestimate the impact of their contributions. Vectorization capabilities are available in processors designed for different market segments, from embedded devices (NEON) and desktops and servers (SSE and AVX) to GPUs and accelerators. However, many of the benchmark suites frequently used by the research community (SPEC, SPLASH-2, PARSEC, Rodinia, Parboil, etc.) have no or very limited SIMD support. ParVec aims to close this gap by extending most of the PARSEC benchmarks with SIMD capabilities.
The above figure shows the procedure we used to vectorize the benchmarks. First, we profile each benchmark to detect regions of the code that are suitable for vectorization. Second, we vectorize these regions using our wrapper and math libraries, extending them if necessary. In this step, the benchmarks are rewritten using intrinsic-like macros that are translated to real intrinsics if the architecture supports them, or otherwise to inline functions that implement the same functionality. Finally, we verify the correctness of the application.
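The wrapper idea in the second step can be illustrated with a minimal sketch in C. It is not taken from the ParVec sources: the names `vec4f`, `VEC_LOAD`, `VEC_ADD`, `VEC_STORE`, the `*_fallback` helpers and `add_arrays` are all hypothetical, and the sketch assumes a 4-lane single-precision vector with SSE as the only "real" back end. The macros expand to SSE intrinsics when the compiler defines `__SSE__`, and to plain inline functions otherwise, so the kernel itself is written only once.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE__)
/* Back end 1: map the hypothetical wrapper macros onto real SSE intrinsics. */
#include <xmmintrin.h>
typedef __m128 vec4f;
#define VEC_LOAD(p)     _mm_loadu_ps(p)
#define VEC_ADD(a, b)   _mm_add_ps((a), (b))
#define VEC_STORE(p, v) _mm_storeu_ps((p), (v))
#else
/* Back end 2: no SSE available, so emulate each operation
 * with an inline function working on four scalar lanes. */
typedef struct { float lane[4]; } vec4f;

static inline vec4f vec_load_fallback(const float *p) {
    vec4f v;
    for (int i = 0; i < 4; i++) v.lane[i] = p[i];
    return v;
}

static inline vec4f vec_add_fallback(vec4f a, vec4f b) {
    vec4f r;
    for (int i = 0; i < 4; i++) r.lane[i] = a.lane[i] + b.lane[i];
    return r;
}

static inline void vec_store_fallback(float *p, vec4f v) {
    for (int i = 0; i < 4; i++) p[i] = v.lane[i];
}

#define VEC_LOAD(p)     vec_load_fallback(p)
#define VEC_ADD(a, b)   vec_add_fallback((a), (b))
#define VEC_STORE(p, v) vec_store_fallback((p), (v))
#endif

/* Kernel written once against the wrapper macros (hypothetical example). */
static void add_arrays(const float *a, const float *b, float *out, size_t n) {
    assert(a && b && out);
    size_t i = 0;
    /* Vectorized body: four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        vec4f va = VEC_LOAD(a + i);
        vec4f vb = VEC_LOAD(b + i);
        VEC_STORE(out + i, VEC_ADD(va, vb));
    }
    /* Scalar tail for the remaining n % 4 elements. */
    for (; i < n; i++) out[i] = a[i] + b[i];
}
```

Because both back ends produce the same results for this kernel, the final verification step can compare the output of the vectorized code against a plain scalar loop on the same input.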
If you use ParVec for your research, please cite our ISPASS 2014 paper.
- "Optimized Hardware for Suboptimal Software: The Case for SIMD-aware Benchmarks"; Cebrian, Jahre and Natvig; ISPASS; 2014
- ParVec Quick Start Guide (pdf)
- Profiling Information for Intel Processors (pdf)