Indico is back online after maintenance on Tuesday, April 30, 2024.
Please visit Jefferson Lab Event Policies and Guidance before planning your next event: https://www.jlab.org/conference_planning.

May 8 – 12, 2023
Norfolk Waterside Marriott
US/Eastern timezone

Vectorization in CMSSW applications

Not scheduled
1h
Hampton Roads Ballroom and Foyer Area (Norfolk Waterside Marriott)

Hampton Roads Ballroom and Foyer Area

Norfolk Waterside Marriott

235 East Main Street Norfolk, VA 23510
Poster Poster Poster Session

Speaker

Gartung, Patrick (Fermilab)

Description

The CMS experiment has been utilizing vectorization, or SIMD, in parts of its data processing applications for over a decade. On x86 platforms the vectorization level is still SSE3. In the past attempts to use wider vector instruction sets such as AVX or AVX-512 have, in practice, not resulted in improvements in the overall event processing throughput, because the CPUs scale down their frequency when processing AVX instructions. In addition, a notable part of the global pool of CMS’ resources has been old systems either not supporting AVX, or where the CPU frequency downscaling impacts all cores of the CPU. CMS has nevertheless continued to vectorize more of its application code, and in this work we review profiling methods we have found effective to find out pieces of code that would benefit from vectorization, and techniques to transform those codes such that the GCC compiler is able to auto-vectorize those codes. The build system used for CMSSW, Scram, has also been enhanced to be able to build code for multiple CPU microarchitectures such that the shared libraries of desired microarchitecture level can be loaded based on the CPU of the system. This multi-microarchitecture setup is invisible to the workflow management system, which makes its deployment straightforward. We describe in detail how this multi-microarchitecture build is set up, and measure the impact of using wider vector units than SSE3 on the event processing throughput of CMS applications such as simulation and reconstruction on recent x86 CPUs.

Consider for long presentation Yes

Primary author

Presentation materials

Peer reviewing

Paper