Analysis performance has a significant impact on the productivity of physicists. The vast majority of analyses use ROOT (https://root.cern). For a few years now, ROOT has offered an analysis interface called RDataFrame, which helps analyses reach the best possible performance, ideally making them I/O limited, i.e. with their performance bounded by the throughput of reading the input data.
The CERN IT department has recently noted (https://doi.org/10.5281/zenodo.6337728) that the activities they heuristically identified as analysis showed no apparent CPU or I/O performance bottleneck as seen from their point of view. We will report on our investigation, carried out in collaboration with USCMS and the CERN IT department, to better understand where the inefficiencies behind this situation come from, and on the improvements made in ROOT to significantly reduce them. We will also describe additional logging and tagging facilities introduced to help distinguish the type of workload and to correlate the information gathered on the server side with the activities carried out by the users’ analyses.