In today's Nuclear Physics (NP), the exploration of the origin, evolution, and structure of the universe's matter is pursued through a broad research program at various collaborative scales, ranging from small groups to large experiments comparable in size to those in high-energy physics (HEP). Consequently, software and computing efforts vary from DIY approaches among a few researchers to...
Next generation High-Energy Physics (HEP) experiments are presented with significant computational challenges, both in terms of data volume and processing power. Using compute accelerators, such as GPUs, is one of the promising ways to provide the necessary computational power to meet the challenge. The current programming models for compute accelerators often involve using...
The dCache project provides open-source software deployed internationally to satisfy
ever more demanding storage requirements. Its multifaceted approach provides an integrated
way of supporting different use-cases with the same storage, from high-throughput data
ingest, data sharing over wide area networks and efficient access from HPC clusters, to
long-term data persistence on a tertiary...
Machine learning has become one of the important tools for High Energy Physics analysis. As the size of the dataset increases at the Large Hadron Collider (LHC), and the search spaces grow ever larger in order to exploit the full physics potential, more and more computing resources are required for processing these machine learning tasks. In addition, complex advanced...
Since 1984 the Italian groups of the Istituto Nazionale di Fisica Nucleare (INFN) and Italian Universities, collaborating with the
DOE laboratory of Fermilab (US), have been running a two-month summer training program for Italian university students. While
in the first year the program involved only four physics students of the University of Pisa, in the following years it was extended
to...
The Julia programming language was created 10 years ago and is now a mature and stable language with a large ecosystem including more than 8,000 third-party packages. It was designed for scientific programming to be a high-level and dynamic language, as Python is, while achieving runtime performance comparable to C/C++ or even faster. With this, we ask ourselves if the Julia language and its...
The reconstruction of particle trajectories is a key challenge of particle physics experiments, as it directly impacts particle identification and physics performance while also representing one of the main CPU consumers of many high energy physics experiments. As the luminosity of particle colliders increases, this reconstruction will become more challenging and resource intensive. New...
Hadronization is an important step in Monte Carlo event generators, where quarks and gluons are bound into physically observable hadrons. Today’s generators rely on finely-tuned empirical models, such as the Lund string model; while these models have been quite successful overall, there remain phenomenological areas where they do not match data well. In this talk, we present MLHad, a...
Managing a secure software environment is essential to a trustworthy cyberinfrastructure. Software supply chain attacks may be a top concern for IT departments, but they are also an aspect of scientific computing. The threat to scientific reputation caused by problematic software can be just as dangerous as an environment contaminated with malware. The issue of managing environments affects...
We present a new implementation of simulation-based inference using data collected by the ATLAS experiment at the LHC. The method relies on large ensembles of deep neural networks to approximate the exact likelihood. Additional neural networks are introduced to model systematic uncertainties in the measurement. Training of the large number of deep neural networks is automated using a...
Providing computing training to the next generation of physicists is the
principal driver for a biannual multi-day workshop hosted by the DUNE
Computing Consortium. Materials are cast in the Software Carpentries
templates, and to date topics have included storage space, data
management, LArSoft, grid job submission and monitoring. Moreover,
experts provide extended breakout sessions to...
XRootD implemented a client-side erasure coding (EC) algorithm utilizing the Intel Intelligent Storage Acceleration Library. At SLAC, a prototype of XRootD EC storage was set up for evaluation. The architecture and configuration of the prototype are almost identical to those of a traditional non-EC XRootD storage behind a firewall: a backend XRootD storage cluster in its simplest form, and an...
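For intuition, the sketch below shows client-side striping with a toy (k+1, k) XOR parity code that can survive the loss of any single stripe; the prototype described above relies on the Intel ISA-L library (Reed-Solomon based) rather than simple parity, so this is an illustration of the idea only.

```python
# Toy client-side erasure coding: split a block into k data stripes plus one
# XOR parity stripe, and rebuild a single lost stripe from the survivors.
from functools import reduce

def encode(block: bytes, k: int = 4):
    """Split a block into k equal data stripes plus one XOR parity stripe."""
    stripe_len = -(-len(block) // k)              # ceiling division
    padded = block.ljust(stripe_len * k, b"\0")   # pad so all stripes are equal
    stripes = [padded[i * stripe_len:(i + 1) * stripe_len] for i in range(k)]
    parity = reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), stripes)
    return stripes, parity

def recover(stripes, parity, lost: int):
    """Rebuild one lost data stripe by XOR-ing the surviving stripes and parity."""
    survivors = [s for i, s in enumerate(stripes) if i != lost] + [parity]
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), survivors)

data = b"event record payload"
stripes, parity = encode(data)
assert recover(stripes, parity, lost=2) == stripes[2]
```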
INFN has been running a distributed infrastructure (the Tier-1 at Bologna-CNAF and 9 Tier-2 centers) for more than 20 years, which currently offers about 140,000 CPU cores, 120 PB of enterprise-level disk space and 100 PB of tape storage, serving more than 40 international scientific collaborations.
This Grid-based infrastructure was augmented in 2019 with the INFN Cloud: a production quality...
With the growing datasets of current and next-generation High-Energy and Nuclear Physics (HEP/NP) experiments, statistical analysis has become more computationally demanding. These increasing demands elicit improvements and modernizations in existing statistical analysis software. One way to address these issues is to improve parameter estimation performance and numeric stability using...
New particle/nuclear physics experiments require a massive amount of computing power that is only achieved by using high performance clusters directly connected to the data acquisition systems and integrated into the online systems of the experiments. However, integrating an HPC cluster into the online system of an experiment means managing and synchronizing thousands of processes that handle...
The evaluation of new computing languages for a large community, like HEP, involves comparison of many aspects of the languages' behaviour, ecosystem and interactions with other languages. In this paper we compare a number of languages using a common, yet non-trivial, HEP algorithm: the tiled $N^2$ clustering algorithm used for jet finding. We compare specifically the algorithm implemented in...
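For reference, a plain (untiled) Python sketch of the sequential-recombination clustering that the benchmark is built around is shown below; the tiled variant adds a spatial binning optimisation on top of the same pairwise-distance logic, and the recombination scheme here is deliberately simplified (pt-weighted averages, ignoring the phi wrap-around).

```python
# Naive O(N^2) anti-kt-style clustering on pseudojets given as (pt, y, phi).
import math

def antikt_cluster(pseudojets, R=0.4):
    jets, active = [], [list(pj) for pj in pseudojets]          # [pt, y, phi]
    while active:
        # Find the smallest of all beam distances d_iB and pair distances d_ij.
        best = ("beam", 0, None, active[0][0] ** -2)
        for i, (pti, yi, phii) in enumerate(active):
            diB = pti ** -2
            if diB < best[3]:
                best = ("beam", i, None, diB)
            for j in range(i + 1, len(active)):
                ptj, yj, phij = active[j]
                dphi = abs(phii - phij)
                dphi = min(dphi, 2 * math.pi - dphi)
                dij = min(pti ** -2, ptj ** -2) * ((yi - yj) ** 2 + dphi ** 2) / R ** 2
                if dij < best[3]:
                    best = ("pair", i, j, dij)
        kind, i, j, _ = best
        if kind == "beam":
            jets.append(active.pop(i))            # promote pseudojet to a final jet
        else:                                     # recombine i and j (simplified scheme)
            pti, yi, phii = active[i]
            ptj, yj, phij = active[j]
            pt = pti + ptj
            merged = [pt, (pti * yi + ptj * yj) / pt, (pti * phii + ptj * phij) / pt]
            for k in sorted((i, j), reverse=True):
                active.pop(k)
            active.append(merged)
    return jets
```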
The calculation of particle interaction squared amplitudes is a key step in the calculation of cross sections in high-energy physics. These lengthy calculations are currently done using domain-specific symbolic algebra tools, where the time required for the calculations grows rapidly with the number of final state particles involved. While machine learning has proven to be highly successful in...
The common form of inter-institute particle physics experiment collaborations generates unique needs for member management, including paper authorship, shift assignments, subscription to mailing lists and access to third-party applications such as GitHub and Slack. For smaller collaborations, typically no facility for centralized member management is available and these needs are usually manually...
PUNCH4NFDI, funded by the German Research Foundation initially for five years, is a diverse consortium of particle, astro-, astroparticle, hadron and nuclear physics embedded in the National Research Data Infrastructure initiative.
In order to provide seamless and federated access to the huge variety of compute and storage systems provided by the participating communities covering their...
Predicting the performance of various infrastructure design options in complex federated infrastructures with computing sites distributed over a wide area that support a plethora of users and workflows, such as the Worldwide LHC Computing Grid (WLCG), is not trivial. Due to the complexity and size of these infrastructures, it is not feasible to deploy experimental test-beds at large scales...
The recent advances in Machine Learning and high-dimensional gradient-based optimization have led to increased interest in the question of whether we can use such methods to optimize the design of future detectors for high-level physics objectives. However, this program faces a fundamental obstacle: the quality of a detector design must be judged on the physics inference it enables, but both...
RED-SEA (https://redsea-project.eu/) is a European project funded in the framework of the H2020-JTI-EuroHPC-2019-1 call that started in April 2021. The goal of the project is to evaluate the architectural design of the main elements of the interconnection networks for the next generation of HPC systems supporting hundreds of thousands of computing nodes enabling the Exa-scale for HPC, HPDA and...
INFN-CNAF is one of the Worldwide LHC Computing Grid (WLCG) Tier-1 data centers, providing support in terms of computing, networking, storage resources and services also to a wide variety of scientific collaborations, ranging from physics to bioinformatics and industrial engineering.
Recently, several collaborations working with our data center have developed computing and data management...
With an increased dataset obtained during the Run-3 of the LHC at CERN and the even larger expected increase of the dataset by more than one order of magnitude for the HL-LHC, the ATLAS experiment is reaching the limits of the current data processing model in terms of traditional CPU resources based on x86_64 architectures and an extensive program for software upgrades towards the HL-LHC has...
Despite recent advances in optimising the track reconstruction problem for high particle multiplicities in high energy physics experiments, it remains one of the most demanding reconstruction steps with regard to complexity and computing resources. Several attempts have been made in the past to deploy suitable algorithms for track reconstruction on hardware accelerators, often by tailoring the...
The Storage Group in the CERN IT Department operates several Ceph storage clusters with an overall capacity exceeding 100 PB. Ceph is a crucial component of the infrastructure delivering IT services to all the users of the Organization as it provides: i) Block storage for the OpenStack infrastructure, ii) CephFS used as persistent storage by containers (OpenShift and Kubernetes) and as shared...
The high luminosity expected from the LHC during Run 3 and, especially, during HL-LHC data taking introduces significant challenges in the CMS event reconstruction chain. The additional computational resources needed to treat this increased quantity of data surpass the expected increase in processing power for the coming years. In order to fit the projected resource envelope, CMS is...
ICSC is one of the five Italian National Centres created in the framework of the Next Generation EU funding by the European Commission. The aim of ICSC, designed and approved through 2022 and eventually started in September 2022, is to create the national digital infrastructure for research and innovation, leveraging existing HPC, HTC and Big Data infrastructures evolving towards a cloud...
We present a Multi-Module framework based on Conditional Variational Autoencoder (CVAE) to detect anomalies in the High Voltage Converter Modulators (HVCMs), which have historically been a cause of major downtime for the Spallation Neutron Source (SNS) facility. Previous studies used machine learning techniques to predict faults ahead of time in the SNS accelerator using a Single...
High Energy Physics software has been a victim of the necessity to choose one implementation language as no really usable multi-language environment existed. Even a co-existence of two languages in the same framework (typically C++ and Python) imposes a heavy burden on the system. The role of different languages was generally limited to well encapsulated domains (like Web applications,...
Nowadays Machine Learning (ML) techniques are successfully used in many areas of High-Energy Physics (HEP) and will play a significant role also in the upcoming High-Luminosity LHC upgrade foreseen at CERN, when a huge amount of data will be produced by LHC and collected by the experiments, facing challenges at the exascale. To favor the usage of ML in HEP analyses, it would be useful to have...
We will discuss the training and on-boarding initiatives currently adopted by a range of High Energy Physics (HEP) experiments. On-boarding refers to the process by which new members of a collaboration gain the knowledge and skills needed to become effective members. Fast and efficient on-boarding is increasingly important for HEP experiments as physics analyses and, as a consequence, the...
Collider physics analyses have historically favored Frequentist statistical methodologies, with some exceptions of Bayesian inference in LHC analyses through use of the Bayesian Analysis Toolkit (BAT). We demonstrate work towards an approach for performing Bayesian inference for LHC physics analyses that builds upon the existing APIs and model building technology of the pyhf and PyMC Python...
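As a minimal, hedged illustration of the Bayesian workflow (not the actual pyhf/PyMC bridge described above), a single-bin counting experiment can be written directly in PyMC; all yields and priors below are made up.

```python
# Toy Bayesian inference for a single-bin counting experiment with PyMC.
import pymc as pm

n_obs, s, b, sigma_b = 53, 10.0, 45.0, 5.0            # hypothetical yields

with pm.Model() as model:
    mu = pm.Uniform("mu", lower=0.0, upper=5.0)        # prior on the signal strength
    bkg = pm.TruncatedNormal("bkg", mu=b, sigma=sigma_b, lower=0.0)  # constrained background
    pm.Poisson("n", mu=mu * s + bkg, observed=n_obs)   # single-bin Poisson likelihood
    trace = pm.sample(2000, tune=1000, chains=2)

print(float(trace.posterior["mu"].mean()))             # posterior mean of mu
```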
Building successful multi-national collaborations is challenging. The scientific communities in a range of physical sciences have been learning how to build collaborations that build upon regional capabilities and interests over decades, iteratively with each new generation of large scientific facilities required to advance their scientific knowledge. Much of this effort has naturally focused...
The OSG-operated Open Science Pool is an HTCondor-based virtual cluster that aggregates resources from compute clusters provided by several organizations. A user can submit batch jobs to the OSG-maintained scheduler, and they will eventually run on a combination of supported compute clusters without any further user action. Most of the resources are not owned by, or even dedicated to OSG, so...
Data access at the UK Tier-1 facility at RAL is provided through its ECHO storage, serving the requirements of the WLCG and an increasing number of other HEP- and astronomy-related communities.
ECHO is a Ceph-backed erasure-coded object store, currently providing in excess of 40PB of usable space, with frontend access to data provided via XRootD or gridFTP, using the libradosstriper library of...
The IceCube Neutrino Observatory is a cubic kilometer neutrino telescope located at the geographic South Pole. To accurately and promptly reconstruct the arrival direction of candidate neutrino events for Multi-Messenger Astrophysics use cases, IceCube employs Skymap Scanner workflows managed by the SkyDriver service. The Skymap Scanner performs maximum-likelihood tests on individual pixels...
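As a rough illustration of the per-pixel scan described above, the sketch below evaluates a hypothetical log-likelihood on a HEALPix grid with healpy; the real Skymap Scanner uses the full IceCube event likelihood and distributes these pixel fits across many workers via SkyDriver.

```python
# Toy per-pixel likelihood scan over a HEALPix sky grid.
import numpy as np
import healpy as hp

def scan_sky(log_likelihood, nside=64):
    """log_likelihood(ra, dec) is a stand-in for the per-pixel reconstruction fit."""
    npix = hp.nside2npix(nside)
    llh_map = np.empty(npix)
    for ipix in range(npix):
        theta, phi = hp.pix2ang(nside, ipix)   # colatitude, longitude in radians
        ra, dec = phi, np.pi / 2 - theta
        llh_map[ipix] = log_likelihood(ra, dec)
    best = int(np.argmax(llh_map))
    return llh_map, hp.pix2ang(nside, best)    # map and best-fit pixel direction
```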
The upcoming exascale computers in the United States and elsewhere will have diverse node architectures, with or without compute accelerators, making it a challenge to maintain a code base that is performance portable across different systems. As part of the US Exascale Computing Project (ECP), the USQCD collaboration has embarked on a collaborative effort to prepare the lattice QCD software...
Significant advances in utilizing deep learning for anomaly detection have been made in recent years. However, these methods largely assume the existence of a normal training set (i.e., uncontaminated by anomalies), or even a completely labeled training set. In many complex engineering systems, such as particle accelerators, labels are sparse and expensive; in order to perform anomaly...
Software and computing are an integral part of our research. According to the survey for the “Future Trends in Nuclear Physics Computing” workshop in September 2020, students and postdocs spent 80% of their time on the software and computing aspects of their research. For the Electron-Ion Collider, we are looking for ways to make software (and computing) "easier" to use. All scientists of all...
Many current analyses in nuclear and particle physics search for signals that are encompassed by irreducible background events. These background events, entirely surrounding a signal of interest, would lead to inaccurate results when extracting physical observables from the data, due to the inability to reduce the signal-to-background ratio using any type of selection criteria. By...
EOS has been the main storage system at CERN for more than a decade, continuously improving in order to meet the ever evolving requirements of the LHC experiments and the whole physics user community. In order to satisfy the demands of LHC Run-3, in terms of storage performance and tradeoff between cost and capacity, EOS was enhanced with a set of new functionalities and features that we will...
The production of simulated datasets for use by physics analyses consumes a large fraction of ATLAS computing resources, a problem that will only get worse as increases in the instantaneous luminosity provided by the LHC lead to more collisions per bunch crossing (pile-up). One of the more resource-intensive steps in the Monte Carlo production is reconstructing the tracks in the ATLAS Inner...
The fast algorithms for data reconstruction and analysis of the FLES (First Level Event Selection) package of the CBM (FAIR/GSI) experiment were successfully adapted to work on the High Level Trigger (HLT) of the STAR (BNL) experiment online. For this purpose, a so-called express data stream was created on the HLT, which enabled full processing and analysis of the experimental data in real...
The MoEDAL experiment at CERN (https://home.cern/science/experiments/moedal-mapp) carries out searches for highly ionising exotic particles such as magnetic monopoles. One of the technologies deployed in this task is the Nuclear Track Detector (NTD). In the form of plastic films, these are passive detectors that are low cost and easy to handle. After exposure to the LHC collision environment...
Opticks is an open source project that accelerates optical photon simulation by
integrating NVIDIA GPU ray tracing, accessed via the NVIDIA OptiX 7 API, with
Geant4 toolkit based simulations. A single NVIDIA Turing architecture GPU has
been measured to provide optical photon simulation speedup factors exceeding
1500 times single threaded Geant4 with a full JUNO analytic GPU...
The Italian WLCG Tier-1 located in Bologna and managed by INFN-CNAF has a long tradition in supporting several research communities in the fields of High-Energy Physics, Astroparticle Physics, Gravitational Waves, Nuclear Physics and others, to which it provides computing resources in the form of batch computing (HPC, HTC and Cloud) and storage. Although the LHC experiments at CERN represent...
The HSF/IRIS-HEP Software Training group provides training in software skills to new researchers in High Energy Physics (HEP) and related communities. These skills are essential to produce the high-quality and sustainable software needed to do the research. Given the thousands of users in the community, sustainability, though challenging, is the centerpiece of its approach. The training modules are...
The Large Hadron Collider (LHC) experiments distribute data by leveraging a diverse array of National Research and Education Networks (NRENs), where experiment data management systems treat networks as a “blackbox” resource. After the High Luminosity upgrade, the Compact Muon Solenoid (CMS) experiment alone will produce roughly 0.5 exabytes of data per year. NREN Networks are a critical part...
CaTS is a Geant4 advanced example that has been part of Geant4 [1] since version 11.0. It demonstrates the use of Opticks [2] to offload the simulation of optical photons to GPUs. Opticks interfaces with the Geant4 toolkit to collect all the necessary information to generate and trace optical photons, re-implements the optical physics processes to be run on the GPU, and automatically translates the...
The CMS collaboration has chosen a novel high granularity calorimeter (HGCAL) for the endcap regions as part of its planned upgrade for the high luminosity LHC. The calorimeter will have fine segmentation in both the transverse and longitudinal directions and will be the first such calorimeter specifically optimised for particle flow reconstruction to operate at a colliding-beam experiment....
We present results on Deep Learning applied to Amplitude and Partial Wave Analysis (PWA) for spectroscopic analyses. Experiments in spectroscopy often aim to observe strongly-interacting, short-lived particles that decay to multi-particle final states. These particle decays have angular distributions that our deep learning model has been trained to identify. Working with TensorFlow...
IDEA (Innovative Detector for an Electron-positron Accelerator) is an innovative general-purpose detector concept, designed to study electron-positron collisions at future e$^+$e$^-$ circular colliders (FCC-ee and CEPC).
The detector will be equipped with a dual read-out calorimeter able to measure separately the hadronic component and the electromagnetic component of the showers initiated...
The Vera C. Rubin observatory is preparing for the execution of the most ambitious astronomical survey ever attempted, the Legacy Survey of Space and Time (LSST). Currently, in its final phase of construction in the Andes mountains in Chile and due to start operations in late 2024 for 10 years, its 8.4-meter telescope will nightly scan the southern sky and collect images of the entire visible...
The Jiangmen Underground Neutrino Observatory (JUNO) is a multipurpose neutrino experiment and the determination of the neutrino mass hierarchy is its primary physics goal. JUNO is going to take data in 2024, with 2 PB of raw data each year, and use distributed computing infrastructure for simulation, reconstruction and analysis tasks. The JUNO distributed computing system has been built up based on...
Machine learning (ML) has become an integral component of high energy physics data analyses and is likely to continue to grow in prevalence. Physicists are incorporating ML into many aspects of analysis, from using boosted decision trees to classify particle jets to using unsupervised learning to search for physics beyond the Standard Model. Since ML methods have become so widespread in...
We present an NDN-based Open Storage System (OSS) plugin for XRootD instrumented with an accelerated packet forwarder, built for data access in the CMS and other experiments at the LHC, together with its current status, performance as compared to other tools and applications, and plans for ongoing developments.
Named Data Networking (NDN) is a leading Future Internet Architecture where data...
Track reconstruction, also known as tracking, is a vital part of the HEP event reconstruction process, and one of the largest consumers of computing resources. The upcoming HL-LHC upgrade will exacerbate the need for efficient software able to make good use of the underlying heterogeneous hardware. However, this evolution should not imply the production of code unintelligible to most of its...
Fast, efficient and accurate triggers are a critical requirement for modern high energy physics experiments given the increasingly large quantities of data that they produce. The CEBAF Large Acceptance Spectrometer (CLAS12) employs a highly efficient Level 3 electron trigger to filter the amount of data recorded by requiring at least one electron in each event, at the cost of a low purity in...
One common issue in vastly different fields of research and industry is the ever-increasing need for more data storage. With experiments taking more complex data at higher rates, the data recorded is quickly outgrowing the storage capabilities. This issue is very prominent in LHC experiments such as ATLAS where in five years the resources needed are expected to be many times larger than the...
Madgraph5_aMC@NLO is one of the workhorses for Monte Carlo event generation in the LHC experiments and an important consumer of compute resources. The software has been reengineered to maintain the overall look-and-feel of the user interface while achieving very large overall speedups. The computationally intensive part (the calculation of "matrix elements") is offloaded to new implementations...
The ML_INFN initiative (“Machine Learning at INFN”) is an effort to foster Machine Learning activities at the Italian National Institute for Nuclear Physics (INFN).
In recent years, AI inspired activities have flourished bottom-up in many efforts in Physics, both at the experimental and theoretical level.
Many researchers have procured desktop-level devices, with consumer oriented GPUs,...
PARSIFAL (PARametrized SImulation) is a software tool that can reproduce the complete response of both triple-GEM and micro-RWELL based trackers. It takes into account the physical processes involved through simple parametrizations, and is therefore very fast. Existing software such as GARFIELD++ is robust and reliable, but very CPU-time consuming. The implementation of PARSIFAL was driven by the...
ROOT's TTree data structure has been highly successful and useful for HEP; nevertheless, alternative file formats now exist which may offer broader software tool support and more-stable in-memory interfacing. We present a data serialization library that produces a similar data structure within the HDF5 data format; supporting C++ standard collections, user-defined data types, and schema...
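A hedged sketch of the general idea, written with h5py and a NumPy structured dtype rather than the library's C++ serialization of standard collections and user-defined types described above; the dataset layout and attribute are placeholders.

```python
# Write a small table of event records into HDF5 with an extendable dataset.
import h5py
import numpy as np

event_dtype = np.dtype([("run", "u4"), ("event", "u8"),
                        ("n_tracks", "u2"), ("met", "f4")])
events = np.array([(1, 1001, 12, 34.5), (1, 1002, 7, 18.2)], dtype=event_dtype)

with h5py.File("events.h5", "w") as f:
    dset = f.create_dataset("events", data=events, maxshape=(None,),
                            chunks=True, compression="gzip")
    dset.attrs["schema_version"] = 1   # crude stand-in for schema evolution metadata
```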
The increasingly larger data volumes that the LHC experiments will accumulate in the coming years, especially in the High-Luminosity LHC era, call for a paradigm shift in the way experimental datasets are accessed and analyzed. The current model, based on data reduction on the Grid infrastructure, followed by interactive data analysis of manageable size samples on the physicists’ individual...
The LHCb software stack is developed in C++ and uses the Gaudi framework for event processing and DD4hep for the detector description. Numerical computations are done either directly in the C++ code or by an evaluator used to process the expressions embedded in the XML describing the detector geometry.
The current system relies on conventions for the physical units used (identical to what...
An important area of HEP studies at the LHC currently concerns the need for more extensive and precise comparison data. Important tools in this realm are event reweighting and the evaluation of more precise next-to-leading order (NLO) physics processes via Monte Carlo (MC) event generators, especially in the context of the upcoming High Luminosity LHC phase. Current event generators need to...
There is increasing demand for the efficiency and flexibility of data transport systems supporting data-intensive sciences. With growing data volume, it is essential that the transport system of a data-intensive science project fully utilize all available transport resources (e.g., network bandwidth); to achieve statistical multiplexing gain, there is an increasing trend that multiple projects...
Long-lived particles (LLPs) are very challenging to search for with current detectors and computing requirements, due to their very displaced vertices. This study evaluates the ability of the trigger algorithms used in the Large Hadron Collider beauty (LHCb) experiment to detect long-lived particles and attempts to adapt them to enhance the sensitivity of this experiment to undiscovered...
The Super Tau Charm Facility (STCF) proposed in China is a new-generation electron–positron collider with center-of-mass energies covering 2-7 GeV. In STCF, the discrimination of high momentum hadrons is a challenging and critical task for various physics studies. In recent years, machine learning methods have gradually become one of the mainstream methods in the PID field of high energy...
Over the last few years, Cloud Sync&Share platforms have become go-to services for collaboration in scientific, academic and research environments, providing users with coherent and simple ways to access their data assets. Collaboration within those platforms, between local users on local applications, has been demonstrated in various settings, with visible improvements in the research...
AtlFast3 is the new, high-precision fast simulation in ATLAS that was deployed by the collaboration to replace AtlFastII, the fast simulation tool that was successfully used for most of Run 2. AtlFast3 combines a parametrization-based Fast Calorimeter Simulation and a new machine-learning-based Fast Calorimeter Simulation based on Generative Adversarial Networks (GANs). The new fast simulation...
Prior to the start of the LHC Run 3, the US ATLAS Software and Computing operations program established three shared Tier 3 Analysis Facilities (AFs). The newest AF was established at the University of Chicago in the past year, joining the existing AFs at Brookhaven National Lab and SLAC National Accelerator Lab. In this paper, we will describe both the common and unique aspects of these three...
The LIGO, VIRGO and KAGRA Gravitational-wave (GW) observatories are getting ready for their fourth observational period, O4, scheduled to begin in March 2023, with improved sensitivities and thus higher event rates.
GW-related computing has both large commonalities with HEP computing, particularly in the domain of offline data processing and analysis, and important differences, for example in...
Zenodo has over the past 10 years grown from a proof of concept to being the world's largest general-purpose research repository, cementing CERN’s image as a pioneer and leader in Open Science. We will review key challenges faced over the past 10 years and how we overcame them, from getting off the ground, over building trust to securing funding.
Growing Zenodo was an enriching and...
In 2029 the LHC will start the High-Luminosity LHC (HL-LHC) program, with a boost in the integrated luminosity resulting in an unprecedented amount of experimental and simulated data samples to be transferred, processed and stored in disk and tape systems across the Worldwide LHC Computing Grid (WLCG). Content delivery network (CDN) solutions are being explored with the purposes of improving...
After using ROOT TTree for over two decades and storing more than an exabyte of compressed data, advances in technology have motivated a complete redesign, RNTuple, that breaks backward-compatibility to take better advantage of these storage options. The RNTuple I/O subsystem has been designed to address performance bottlenecks and shortcomings of ROOT's current state of the art TTree I/O...
The IceCube Neutrino Observatory is a cubic kilometer neutrino telescope
located at the Geographic South Pole. For every observed neutrino event,
there are over 10^6 background events caused by cosmic-ray air shower
muons. In order to properly separate signal from background, it is
necessary to produce Monte Carlo simulations of these air showers.
Although to date, IceCube has...
The main focus of the ALICE experiment, quark-gluon plasma measurements, requires
accurate particle identification (PID). The ALICE detectors allow identifying particles over a broad momentum interval ranging from about 100 MeV/c up to 20 GeV/c.
However, hand-crafted selections and the Bayesian method do not perform well in the
regions where the particle signals overlap. Moreover, an ML...
Applying graph-based techniques, and graph neural networks (GNNs) in particular, has been shown to be a promising solution to the high-occupancy track reconstruction problems posed by the upcoming HL-LHC era. Simulations of this environment present noisy, heterogeneous and ambiguous data, which previous GNN-based algorithms for ATLAS ITk track reconstruction could not handle natively. We...
The Large Field Low-energy X-ray Polarization Detector (LPD) is a gas photoelectric effect polarization detector designed for the detailed study of X-ray transient sources in high-energy astrophysics. Previous studies have shown that the polarization degree of gamma-ray bursts (GRBs) is generally low or unpolarized. Considering the spatial background and other interference, we need high...
The HL-LHC run is anticipated to start at the end of this decade and will pose a significant challenge for the scale of the HEP software and computing infrastructure. The mission of the U.S. CMS Software & Computing Operations Program is to develop and operate the software and computing resources necessary to process CMS data expeditiously and to enable U.S. physicists to fully participate in...
Effective analysis computing requires rapid turnaround times in order to enable frequent iteration, adjustment, and exploration, leading to discovery. An informal target of reducing 10 TB of experimental data in about ten minutes using campus-scale computing infrastructure is achievable, just considering raw hardware capability. However, compared to production computing, which seeks to...
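For a rough sanity check of that raw-hardware claim, assuming a hypothetical 200-core campus slice (the core count is illustrative, not from the abstract):

```python
# Back-of-the-envelope check of the "10 TB in ~10 minutes" target.
data_tb, minutes, cores = 10, 10, 200
aggregate_gb_per_s = data_tb * 1e12 / (minutes * 60) / 1e9   # ~16.7 GB/s in total
per_core_mb_per_s = aggregate_gb_per_s * 1e3 / cores         # ~83 MB/s per core
print(f"{aggregate_gb_per_s:.1f} GB/s aggregate, {per_core_mb_per_s:.0f} MB/s per core")
```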
Celeritas is a new Monte Carlo detector simulation code designed for computationally intensive applications (specifically, HL-LHC simulation) on high-performance heterogeneous architectures. In the past two years Celeritas has advanced from prototyping a simple, GPU-based, single-physics-model infinite medium to implementing a full set of electromagnetic physics processes in complex...
The INFN Cloud project was launched at the beginning of 2020, aiming to build a distributed Cloud infrastructure and provide advanced services for the INFN scientific communities. A Platform as a Service (PaaS) was created inside INFN Cloud that allows the experiments to develop and access resources as a Software as a Service (SaaS), and CYGNO is the beta-tester of this system. The aim of the...
Analyses in HEP experiments often rely on large MC simulated datasets. These datasets are usually produced with full-simulation approaches based on Geant4 or exploiting parametric “fast” simulations introducing approximations and reducing the computational cost. With our work we created a prototype version of a new “fast” simulation that we named “flashsim” targeting analysis level data tiers...
The High-Energy Physics (HEP) and Worldwide LHC Computing Grid (WLCG) communities have faced significant challenges in understanding their global network flows across the world’s research and education (R&E) networks. When critical links, such as transatlantic or transpacific connections, experience high traffic or saturation, it is very challenging to clearly identify which collaborations...
The "A Large Ion Collider Experiment" (ALICE), one of the four large experiments at the European Organization for Nuclear Research (CERN), is responsible for studying the physics of strongly interacting matter and the quark-gluon plasma.
In order to ensure the full success of ALICE operation and data taking during the Large Hadron Collider Runs 3 and 4, a list of tasks identified as Service...
Detailed detector simulation is the major consumer of CPU resources at LHCb, having used more than 80% of the total computing budget during Run 2 of the Large Hadron Collider at CERN. As data is collected by the upgraded LHCb detector during Run 3 of the LHC, larger requests for simulated data samples are necessary, and will far exceed the pledged resources of the experiment, even with...
RDataFrame is ROOT's high-level interface for Python and C++ data analysis. Since it first became available, RDataFrame adoption has grown steadily and it is now poised to be a major component of analysis software pipelines for LHC Run 3 and beyond. Thanks to its design inspired by declarative programming principles, RDataFrame enables the development of high-performance, highly parallel...
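A minimal example of the declarative style RDataFrame enables; the file, tree and column names below are hypothetical, and the event loop only runs when a result is requested.

```python
# Declarative dimuon analysis sketch with ROOT's RDataFrame (Python interface).
import ROOT

df = ROOT.RDataFrame("Events", "data.root")              # lazily attach to a TTree
hist = (df.Filter("nMuon == 2", "exactly two muons")
          .Define("m_mumu", "InvariantMass(Muon_pt, Muon_eta, Muon_phi, Muon_mass)")
          .Histo1D(("m_mumu", ";m_{#mu#mu} [GeV];Events", 100, 70, 110), "m_mumu"))
hist.Draw()                                               # triggers the parallel event loop
```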
Development of the EIC project detector "ePIC" is now well underway and this includes the "single software stack" used for simulation and reconstruction. The stack combines several non-experiment-specific packages including ACTS, DD4hep, JANA2, and PODIO. The software stack aims to be forward looking in the era of AI/ML and heterogeneous hardware. A formal decision making process was...
AtlFast3 is the new ATLAS fast simulation that exploits a wide range of ML techniques to achieve high-precision fast simulation. The latest version of AtlFast3, used in Run 3, deploys FastCaloGANV2, which consists of 500 Generative Adversarial Networks used to simulate the showers of all particles in the ATLAS calorimeter system. The Muon Punch Through tool has also been completely rewritten...
In this talk, we discuss the evolution of the computing model of the ATLAS experiment at the LHC. After LHC Run 1, it became obvious that the available computing resources at the WLCG were fully used. The processing queue could reach millions of jobs during peak loads, for example before major scientific conferences and during large scale data processing. The unprecedented performance of the...
The capture and curation of all primary instrument data is a potentially valuable source of added insight into experiments or diagnostics in laboratory experiments. The data can, when properly curated, enable analysis beyond the current practice that uses just a subset of the as-measured data. Complete curated data can also be input for machine learning and other data exploration tools....
The current and future programs for accelerator-based neutrino imaging detectors feature the use of Liquid Argon Time Projection Chambers (LArTPC) as the fundamental detection technology. These detectors combine high-resolution imaging and precision calorimetry to enable the study of neutrino interactions with unparalleled capabilities. However, the volume of data from LArTPCs will exceed 25...
In November 2022, the HEP Software Foundation (HSF) and the Institute for Research and Innovation for Software in High-Energy Physics (IRIS-HEP) organized a workshop on the topic of “Software Citation and Recognition in HEP”. The goal of the workshop was to bring together different types of stakeholders whose roles relate to software citation and the associated credit it provides, in order to...
Instead of focusing on the concrete challenges of incremental changes to HEP driven by AI/ML, it is perhaps a useful exercise to think through more radical, speculative changes. What might be enabled if we embraced a dramatically different approach? What would we lose? How would those changes impact the computational, organizational, and epistemological nature of the field?
Simulation is a critical component of high energy physics research, with a corresponding computing footprint. Generative AI has emerged as a promising complement to intensive full simulation with relatively high accuracy compared to existing classical fast simulation alternatives. Such algorithms are naturally suited to acceleration on coprocessors, potentially running fast enough to match the...
The EU-funded ESCAPE project has brought together the ESFRI and other world class Research Infrastructures in High Energy and Nuclear Physics, Astro-Particle Physics, and Astronomy. In the 3 years of the project many synergistic and collaborative aspects have been highlighted and explored, from pure technical collaboration on common solutions for data management, AAI, and workflows, through...
The large data volumes expected from the High Luminosity LHC (HL-LHC) present challenges to existing paradigms and facilities for end-user data analysis. Modern cyberinfrastructure tools provide a diverse set of services that can be composed into a system that provides physicists with powerful tools that give them straightforward access to large computing resources, with low barriers to entry....
Future e+e- colliders are crucial to extend the search for new phenomena possibly related to the open questions that the Standard Model presently does not explain. Among the major physics programs, the flavor physics program requires particle identification (PID) performance well beyond that of most detectors designed for the current generation. Cluster counting, which measures the number...
The EPIC collaboration at the Electron-Ion Collider recently laid the groundwork for its software infrastructure. Large parts of the software ecosystem for EPIC mirror the setup from the Key4hep project, for example DD4hep for geometry description, and EDM4hep/PODIO for the data model. However, other parts of the EPIC software ecosystem diverge from Key4hep, for example for the event...
The usage of Deep Neural Networks (DNNs) as multi-classifiers is widespread in modern HEP analyses. In standard categorisation methods, the high-dimensional output of the DNN is often reduced to a one-dimensional distribution by exclusively passing the information about the highest class score to the statistical inference method. Correlations to other classes are hereby omitted.
Moreover, in...
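A toy numerical sketch of the contrast drawn above, assuming softmax outputs from some classifier; the numbers are made up.

```python
# Standard 1D reduction (winning class and its score) versus keeping the
# full softmax vector per event for the downstream statistical inference.
import numpy as np

scores = np.array([[0.70, 0.20, 0.10],      # per-event softmax outputs
                   [0.40, 0.35, 0.25]])

category = scores.argmax(axis=1)            # category assignment: [0, 0]
max_score = scores.max(axis=1)              # 1D input to the fit; correlations lost

full_features = scores                      # alternative: keep the whole vector,
                                            # shape (n_events, n_classes)
```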
Computational science, data management and analysis have been key factors in the success of Brookhaven National Laboratory's scientific programs at the Relativistic Heavy Ion Collider (RHIC), the National Synchrotron Light Source (NSLS-II), the Center for Functional Nanomaterials (CFN), and in biological, atmospheric, and energy systems science, Lattice Quantum Chromodynamics (LQCD) and...
CERN hosts more than 1200 websites essential for the mission of the Organization, internal and external collaboration and communication, as well as public outreach. The complexity and scale of CERN’s online presence is very diverse, with some websites, like https://home.cern/
, accommodating more than one million unique visitors in a day.
However, regardless of their diversity, all...
High energy physics is facing serious challenges in the coming decades due to the projected shortfall of CPU and storage resources compared to our anticipated budgets. In the past, HEP has not made extensive use of HPCs, however the U.S. has had a long term investment in HPCs and it is the platform of choice for many simulation workloads, and more recently, data processing for projects such as...
Synergies between MAchine learning, Real-Time analysis and Hybrid architectures for efficient Event Processing and decision making (SMARTHEP) is a European Training Network with the aim of training a new generation of Early Stage Researchers to advance real-time decision-making, effectively leading to data-collection and analysis becoming synonymous.
SMARTHEP...
We present a collection of tools and processes that facilitate onboarding a new science collaboration onto the OSG Fabric of Services. Such collaborations typically rely on computational workflows for simulations and analysis that are ideal for executing on OSG's distributed High Throughput Computing environment (dHTC). The produced output can be accumulated and aggregated at available...
The Xrootd S3 Gateway is a universal high performance proxy service that can be used to access S3 portals using existing HEP credentials (e.g. JSON Web Tokens and x509). This eliminates one of the biggest roadblocks to using public cloud storage resources. This paper describes how the S3 Gateway leverages existing HEP software (e.g. Davix and XRootD) to provide a familiar scalable service that...
The search for the dimuon decay of the Standard Model (SM) Higgs boson looks for a tiny peak on top of a smoothly falling SM background in the dimuon invariant mass spectrum $m(\mu\mu)$. Due to the very small signal-to-background ratio, which is at the level of 0.2% in the region $m(\mu\mu)$ = 120–130 GeV for an inclusive selection, an accurate determination of the background is of paramount importance....
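As a toy illustration of the sideband idea behind such background determinations, one can fit a smoothly falling shape outside the signal window and extrapolate it underneath; the functional form, binning and yields below are illustrative only, not the analysis model.

```python
# Fit a falling exponential to the m(mumu) sidebands and extrapolate into 120-130 GeV.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(1)
m = rng.exponential(scale=25.0, size=200_000) + 110.0        # toy background spectrum
counts, edges = np.histogram(m, bins=80, range=(110, 150))
centres = 0.5 * (edges[:-1] + edges[1:])

def falling(x, n, k):
    return n * np.exp(-k * (x - 110.0))

sideband = (centres < 120) | (centres > 130)                 # blind the signal window
popt, _ = curve_fit(falling, centres[sideband], counts[sideband], p0=(counts[0], 0.04))
expected_in_window = falling(centres[~sideband], *popt).sum()
```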
The INFN-CNAF Tier-1 located in Bologna (Italy) is a center of the WLCG e-Infrastructure, providing computing power to the four major LHC collaborations; it also supports the computing needs of about fifty more groups, also from non-HEP research domains. The CNAF Tier-1 center has historically been very active in the integration of computing resources, proposing and prototyping...
There is no lack of approaches for managing the deployment of distributed services – in the last 15 years of running distributed infrastructure, the OSG Consortium has seen many of them. One persistent problem has been each physical site has its style of configuration management and service operations, leading to a partitioning of the staff knowledge and inflexibility in migrating services...
Moving towards Net-Zero requires robust information to enable good decision making at all levels: covering hardware procurement, workload management and operations, as well as higher level aspects encompassing grant funding processes and policy framework development.
The IRISCAST project is a proof-of-concept study funded as part of the UKRI Net-Zero Scoping Project. We have performed an...
The Jiangmen Underground Neutrino Observatory (JUNO), under construction in South China, primarily aims to determine the neutrino mass hierarchy and to precisely measure oscillation parameters. Data taking is expected to start in 2024 and to continue for more than 20 years. The development of JUNO offline software (JUNOSW) started in 2012, and it is quite challenging to maintain the JUNOSW...
The reconstruction of charged particles’ trajectories is one of the most complex and CPU-consuming event processing chains in high energy physics (HEP) experiments. Meanwhile, the precision of track reconstruction has a direct and significant impact on vertex reconstruction, physics flavour tagging and particle identification, and eventually on physics precision, in particular for HEP experiments...
There has been a significant increase in data volume from various large scientific projects, including the Large Hadron Collider (LHC) experiment. The High Energy Physics (HEP) community requires increased data volume on the network, as the community expects to produce almost thirty times the annual data volume between 2018 and 2028 [1]. To mitigate the repetitive data access issue and network...
Recent inroads in Computer Vision (CV), enabled by Machine Learning (ML), have motivated a new approach to the analysis of particle imaging detector data. Unlike previous efforts which tackled isolated CV tasks, this paper introduces an end-to-end, ML-based data reconstruction chain for Liquid Argon Time Projection Chambers (LArTPCs), the state-of-the-art in precision imaging at the intensity...
In this contribution we describe the 2022 reboot of the ScienceBox project, the containerised SWAN/CERNBox/EOS demonstrator package for CERN storage and analysis services. We evolved the original implementation to make use of Helm charts across the entire dependency stack. Charts have become the de-facto standard for application distribution and deployment in managed clusters (e.g.,...
In the frame of the German NFDI (National Research Data Infrastructure), by now 27 consortia across all domains of science have been setup in order to enhance the FAIR usage and re-usage of scientific data. The consortium PUNCH4NFDI, composed of the German particle, astroparticle, hadron&nuclear, and astrophysics communities, has been approved for initially 5 years of significant...
The CMS Online Monitoring System (OMS) aggregates and integrates different sources of information into a central place and allows users to view, compare and correlate information. It displays real-time and historical information.
The tool is heavily used by run coordinators, trigger experts and shift crews, to achieve optimal trigger and efficient data taking. It provides aggregated...
I will introduce a new neural algorithm -- HyperTrack, designed for exponentially demanding combinatorial inverse problems of high energy physics final state reconstruction and high-level analysis at the LHC and beyond. Many of these problems can be formulated as clustering on a graph resulting in a hypergraph. The algorithm is based on a machine learned geometric-dynamical input graph...
The ATLAS Continuous Integration (CI) System is the major component of the ATLAS software development infrastructure, synchronizing efforts of several hundred software developers working around the world and around the clock. Powered by 700 fast processors, it is based on the ATLAS GitLab code management service and Jenkins CI server and performs daily up to 100 ATLAS software builds probing...
Current and future distributed HENP data analysis infrastructures rely increasingly on object stores in addition to regular remote file systems. Such file-less storage systems are popular as a means to escape the inherent scalability limits of the POSIX file system API. Cloud storage is already dominated by S3-like object stores, and HPC sites are starting to take advantage of object stores...
LUX-ZEPLIN (LZ) is a direct detection dark matter experiment currently operating at the Sanford Underground Research Facility (SURF) in Lead, South Dakota. The core component is a liquid xenon time projection chamber with an active mass of 7 tonnes.
To meet the performance, availability, and security requirements for the LZ DAQ, Online, Slow Control and data transfer systems located at SURF,...
We present New Physics Learning Machine (NPLM), a machine learning-based strategy to detect data departures from a Reference model, with no prior bias on the source of discrepancy. The main idea behind the method is to approximate the optimal log-likelihood-ratio hypothesis test parametrising the data distribution with a universal approximating function, and solving its maximum-likelihood fit...
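A compact sketch of the NPLM-style fit, assuming a small PyTorch network as the universal approximator; the published method adds terms for systematic uncertainties and careful training and regularisation choices that are omitted here, and the toy reference and data samples below are invented.

```python
# Minimise L[f] = sum_ref w * (exp(f) - 1) - sum_data f ; t = -2 * min L.
import torch

def nplm_loss(f, x_ref, w_ref, x_data):
    return (w_ref * (torch.exp(f(x_ref)).squeeze(-1) - 1.0)).sum() - f(x_data).sum()

f = torch.nn.Sequential(torch.nn.Linear(1, 16), torch.nn.Sigmoid(),
                        torch.nn.Linear(16, 1))
x_ref = torch.randn(10000, 1)                      # toy Reference sample
w_ref = torch.full((10000,), 0.05)                 # reference weights
x_data = torch.randn(500, 1) + 0.3                 # toy "data" with a small shift

opt = torch.optim.Adam(f.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = nplm_loss(f, x_ref, w_ref, x_data)
    loss.backward()
    opt.step()

t_obs = -2.0 * nplm_loss(f, x_ref, w_ref, x_data).item()   # observed test statistic
```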
The computing and storage requirements of the energy and intensity frontiers will grow significantly during Runs 4 & 5 and the HL-LHC era. Similarly, in the intensity frontier, with larger trigger readouts during supernova explosions, the Deep Underground Neutrino Experiment (DUNE) will have unique computing challenges that could be addressed by the use of parallel and accelerated...
The increasing computational demand in High Energy Physics (HEP) as well as increasing concerns about energy efficiency in high performance/throughput computing are driving forces in the search for more efficient ways to utilize available resources. Since avoiding idle resources is key in achieving high efficiency, an appropriate measure is sharing of idle resources of under-utilized sites...
Data-driven methods are widely used to overcome shortcomings of Monte Carlo (MC) simulations (lack of statistics, mismodeling of processes, etc.) in experimental High Energy Physics. A precise description of background processes is crucial to reach the optimal sensitivity for a measurement. However, the selection of the control region used to describe the background process in a region of...
Planned EOSC-CZ projects will significantly improve data management in many scientific fields in the Czech Republic. Several calls for projects are under preparation according to the implementation architecture document created in 2021. Emerging National data infrastructure will build basic infrastructure with significant storage capacity for long term archive of scientific data and their...
Recent years have seen an increasing interest in the environmental impact, especially the carbon footprint, generated by the often large scale computing facilities used by the communities represented at CHEP. As this is a fairly new requirement, this information is not always readily available, especially at universities and similar institutions which do not necessarily see large scale...
Random number generation is key to many applications in a wide variety of disciplines. Depending on the application, the quality of the random numbers from a particular generator can directly impact both computational performance and critically the outcome of the calculation.
High-energy physics applications use Monte Carlo simulations and machine learning widely, which both require...
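As a small illustration of the kind of generator properties at stake, the sketch below spawns reproducible, statistically independent streams with NumPy's SeedSequence machinery and a counter-based Philox generator; it is a usage example, not the comparison study itself.

```python
# Independent, reproducible random streams for parallel workers.
import numpy as np

root = np.random.SeedSequence(20230508)
children = root.spawn(4)                               # one child seed per worker
rngs = [np.random.Generator(np.random.Philox(s)) for s in children]
samples = [rng.standard_normal(5) for rng in rngs]     # statistically independent draws
```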
Modern neutrino experiments employ hundreds to tens of thousands of photon detectors to detect scintillation photons produced from the energy deposition of charged particles. A traditional approach of modeling individual photon propagation as a look-up table requires high computational resources, and therefore it is not scalable for future experiments with multi-kiloton target volume.
We...
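For context, a minimal sketch of the traditional look-up-table approach that the abstract contrasts with its scalable alternative: tabulate photon visibility on a coarse spatial grid and interpolate at query points. The grid, units and visibility function below are placeholders.

```python
# Toy photon-visibility look-up table with trilinear interpolation.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

axis = np.linspace(-5.0, 5.0, 21)                      # detector coordinates (m)
X, Y, Z = np.meshgrid(axis, axis, axis, indexing="ij")
visibility = np.exp(-np.sqrt(X**2 + Y**2 + Z**2))      # placeholder visibility model

lookup = RegularGridInterpolator((axis, axis, axis), visibility)
print(lookup([[0.5, -1.2, 3.0]]))                      # interpolated visibility at a point
```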
The ALICE experiment at CERN uses a cluster consisting of virtual and bare-metal machines to build and test proposed changes to the ALICE Online-Offline (O2) software in addition to building and publishing regular software releases.
Nomad is a free and open-source job scheduler for containerised and non-containerised applications developed by Hashicorp. It is integrated into an...
For Run 3, ATLAS redesigned its offline software, Athena, so that the
main workflows run completely multithreaded. The resulting substantial
reduction in the overall memory requirements allows for better use
of machines with many cores. This talk will discuss the performance
achieved by the multithreaded reconstruction as well as the process
of migrating the large ATLAS code base and...
At Brookhaven National Lab, the dCache storage management system is used as a disk cache for large high-energy physics (HEP) datasets primarily from the ATLAS experiment[1]. Storage space on dCache is considerably smaller than the full ATLAS data collection. Therefore, a policy is needed to determine what data files to keep in the cache and what files to evict. A good policy is to keep...
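As an illustration of what such a policy can look like, here is a minimal least-recently-used (LRU) eviction sketch; it is a generic textbook example, not the policy evaluated for the BNL dCache instance.

```python
# Byte-aware LRU cache: evict the least recently accessed files first.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity_bytes):
        self.capacity, self.used, self.files = capacity_bytes, 0, OrderedDict()

    def access(self, name, size):
        if name in self.files:                           # cache hit: refresh recency
            self.files.move_to_end(name)
            return True
        while self.used + size > self.capacity and self.files:
            _, evicted_size = self.files.popitem(last=False)   # evict oldest entry
            self.used -= evicted_size
        self.files[name] = size                          # cache miss: admit the file
        self.used += size
        return False
```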
In this talk, we present a novel data format design that obviates the need for data tiers by storing individual event data products in column objects. The objects are stored and retrieved through Ceph S3 technology, and a companion metadata system handles tracking of the object lifecycle. Performance benchmarks of data storage and retrieval will be presented, along with scaling tests of the...
Many theories of Beyond Standard Model (BSM) physics feature multiple BSM particles. Generally, these theories live in higher dimensional phase spaces that are spanned by multiple independent BSM parameters such as BSM particle masses, widths, and coupling constants. Fully probing these phase spaces to extract comprehensive exclusion regions in the high dimensional space is challenging....
High Energy Physics experiments at the Large Hadron Collider generate petabytes of data that go through multiple transformations before final analysis and paper publication. Recording the provenance of these data is therefore crucial to maintain the quality of the final results. While the tools are in place within LHCb to keep this information for the common experiment-wide transforms, analysts...
The JIRIAF project aims to combine geographically diverse computing facilities into an integrated science infrastructure. This project starts by dynamically evaluating temporarily unallocated or idled compute resources from multiple providers. These resources are integrated to handle additional workloads without affecting local running jobs. This paper describes our approach to launch...
During the long shutdown between LHC Run 2 and 3, a reprocessing of 2017 and 2018 CMS data with higher granularity data quality monitoring (DQM) harvesting was done. The time granularity of DQM histograms in this dataset is increased by 3 orders of magnitude. In anticipation of deploying this higher granularity DQM harvesting in the ongoing Run 3 data taking, this dataset is used to study the...
The INFN Tier1 data center is currently located in the premises of the Physics Department of the University of Bologna, where CNAF is also located. During 2023 it will be moved to the “Tecnopolo”, the new facility for research, innovation, and technological development in the same city area; the same location is also hosting Leonardo, the pre-exascale supercomputing machine managed by CINECA,...
The Deep Underground Neutrino Experiment (DUNE) will operate four large-scale Liquid-Argon Time-Projection Chambers (LArTPCs) at the far site in South Dakota, producing high-resolution images of neutrino interactions.
LArTPCs represent a step-change in neutrino interaction imaging and the resultant images can be highly detailed and complex. Extracting the maximum value from LArTPC hardware...
GitLab has been running at CERN since 2012. It is a self-service code hosting application based on Git that provides collaboration and code review features, becoming one of the key infrastructures at CERN. It is being widely used at CERN, with more than 17 000 active users, hosting more than 120 000 projects and triggering more than 5 000 jobs per hour.
On its initial stage, a custom-made...
Large-scale high-energy physics experiments generate petabytes or even exabytes of scientific data, and high-performance data IO is required during their processing. However, computing and storage devices are often separated in large computing centers, and large-scale data transmission has become a bottleneck for some data-intensive computing tasks, such as data encoding and decoding,...
Queen Mary University of London (QMUL), as part of the refurbishment of one of its data centres, has installed water-to-water heat pumps to use the heat produced by the computing servers to provide heat for the university via a district heating system. This will enable us to reduce the use of high-carbon-intensity natural gas heating boilers, replacing them with electricity which has a lower...
The matrix element method (MEM) is a powerful technique that can be used for the analysis of particle collider data utilizing an ab initio calculation of the approximate probability density function for a collision event to be due to a physics process of interest. The most serious difficulty with the ME method, which has limited its applicability to searches for beyond-the-SM physics and...
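For orientation, the quantity at the heart of the MEM is the per-event probability density under a process hypothesis. In a generic form (our notation; the precise treatment of parton densities and transfer functions varies between implementations):

```latex
P(x \mid \alpha) \;=\; \frac{1}{\sigma_{\alpha}}
\int \mathrm{d}\Phi(y)\, f_{1}(y)\, f_{2}(y)\,
\bigl|\mathcal{M}_{\alpha}(y)\bigr|^{2}\, W(x \mid y)
```

Here $\mathcal{M}_{\alpha}$ is the matrix element of hypothesis $\alpha$, $f_{1,2}$ are parton distribution functions, $W(x \mid y)$ is the transfer function relating the parton-level configuration $y$ to the reconstructed observables $x$, and $\sigma_{\alpha}$ normalises the density; the multi-dimensional phase-space integral per event is what makes the method computationally expensive.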
The IceCube experiment has substantial simulation needs and is in continuous search for the most cost-effective ways to satisfy them. The most CPU-intensive part relies on CORSIKA, a cosmic ray air shower simulation. Historically, IceCube relied exclusively on x86-based CPUs, like Intel Xeon and AMD EPYC, but recently server-class ARM-based CPUs are also becoming available, both on-prem and in...
Rucio is a software framework that provides scientific collaborations with the ability to organise, manage and access large volumes of data using customisable policies. The data can be spread across globally distributed locations and across heterogeneous data centres, uniting different storage and network technologies as a single federated entity. Rucio offers advanced features such as...
The Exa.TrkX team has developed a Graph Neural Network (GNN) for reconstruction of liquid argon time projection chamber (LArTPC) data. We discuss the network architecture, a multi-head attention message passing network that classifies detector hits according to the particle type that produced them. By utilizing a heterogeneous graph structure with independent subgraphs for each 2D plane’s hits...
The increasingly pervasive and dominant role of machine learning (ML) and deep learning (DL) techniques in High Energy Physics is posing challenging requirements to effective computing infrastructures on which AI workflows are executed, as well as demanding requests in terms of training and upskilling new users and/or future developers of such technologies.
In particular, a growth in the...
The Worldwide LHC Computing Grid (WLCG) is a large-scale collaboration which gathers the computing resources of around 170 computing centres from more than 40 countries. The grid paradigm, unique to the realm of high energy physics, has successfully supported a broad variety of scientific achievements. To fulfil the requirements of new applications and to improve the long-term sustainability...
The goal of the “HTTP REST API for Tape” project is to provide a simple, minimalistic and uniform interface to manage data transfers between Storage Endpoints (SEs) where the source file is on tape. The project is a collaboration between the developers of WLCG storage systems (EOS+CTA, dCache, StoRM) and data transfer clients (gfal2, FTS). For some years, HTTP has been growing in popularity as...
Significant progress has been made in applying graph neural networks (GNNs) and other geometric ML ideas to the track reconstruction problem. State-of-the-art results are obtained using approaches such as the Exa.TrkX pipeline, which currently applies separate edge construction, classification and segmentation stages. One can also treat the problem as an object condensation task, and cluster...
HEPscore is a CPU benchmark, based on HEP applications, that the HEPiX Working Group is proposing as a replacement of the HEPSpec06 benchmark (HS06), which is currently used by the WLCG for procurement, computing resource requests and pledges, accounting and performance studies. At the CHEP 2019 conference, we presented the reasons for building a benchmark for the HEP community that is based...
The ATLAS experiment involves almost 6000 members from approximately 300 institutes spread all over the globe and more than 100 papers published every year. This dynamic environment brings some challenges such as how to ensure publication deadlines, communication between the groups involved, and the continuity of workflows. The solution found for those challenges was automation, which was...
The Large Hadron Collider (LHC) will be upgraded to the High-Luminosity LHC, increasing the number of simultaneous proton-proton collisions (pile-up, PU) several-fold. The harsher PU conditions lead to exponentially increasing combinatorics in charged-particle tracking, placing a large demand on computing resources. The projection of required computing resources exceeds the computing...
Among liquid argon time projection chamber (LArTPC) experiments MicroBooNE is the one that continually took physics data for the longest time (2015-2021), and represents the state of the art for reconstruction and analysis with this detector. Recently published analyses include oscillation physics results, searches for anomalies and other BSM signatures, and cross section measurements. LArTPC...
The CMS experiment at CERN accelerates several stages of its online reconstruction by making use of GPU resources at its High Level Trigger (HLT) farm for LHC Run 3. Additionally, during the past years, computing resources available to the experiment for performing offline reconstruction, such as Tier-1 and Tier-2 sites, have also started to integrate accelerators into their systems. In order...
Data taking at the Large Hadron Collider (LHC) at CERN restarted in 2022. The CMS experiment relies on a distributed computing infrastructure based on WLCG (Worldwide LHC Computing Grid) to support the LHC Run 3 physics program. The CMS computing infrastructure is highly heterogeneous and relies on a set of centrally provided services, such as distributed workload management and data...
Recent work has demonstrated that graph neural networks (GNNs) trained for charged particle tracking can match the performance of traditional algorithms while improving scalability. Most approaches are based on the edge classification paradigm, wherein tracker hits are connected by edges, and a GNN is trained to prune edges, resulting in a collection of connected components representing...
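As an illustration of the final step of the edge-classification paradigm (toy data and helper names of our own, not any experiment's pipeline), track candidates can be formed as connected components of the hit graph once a GNN has scored every edge:

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def build_track_candidates(edge_index, edge_scores, n_hits, threshold=0.5):
    """Form track candidates as connected components of the hit graph,
    keeping only edges whose GNN score exceeds `threshold`.

    edge_index : (2, n_edges) integer array of hit indices (sender, receiver)
    edge_scores: (n_edges,) array of GNN edge scores in [0, 1]
    """
    keep = edge_scores > threshold
    senders, receivers = edge_index[0, keep], edge_index[1, keep]
    adjacency = coo_matrix(
        (np.ones(keep.sum()), (senders, receivers)), shape=(n_hits, n_hits)
    )
    # Undirected connected components: each label is one track candidate.
    n_tracks, labels = connected_components(adjacency, directed=False)
    return n_tracks, labels

# Toy example: 5 hits; edges 0-1 and 1-2 pass the score cut, 3-4 does not.
edges = np.array([[0, 1, 3], [1, 2, 4]])
scores = np.array([0.9, 0.8, 0.2])
print(build_track_candidates(edges, scores, n_hits=5))
```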
The ATLAS Open Data project aims to deliver open-access resources for education and outreach in High Energy Physics using real data recorded by the ATLAS detector. The project has so far released a substantial amount of data from 8 TeV and 13 TeV collisions in an easily-accessible format, supported by dedicated software and documentation to allow its fruitful use...
Monitoring services play a crucial role in the day-to-day operation of distributed computing systems. The ATLAS experiment at LHC uses the production and distributed analysis workload management system (PanDA WMS), which allows a million computational jobs to run daily at over 170 computing centers of the WLCG and other opportunistic resources, utilizing 600k cores simultaneously on average....
CDS (Custodial Disk Storage), a disk-based custodial storage powered by CERN EOS storage system, has been operating for the ALICE experiment at the KISTI Tier-1 Centre since November 2021. The CDS replaced existing tape storage operated for almost a decade, after its stable demonstration in the WLCG Tape Challenges in October 2021. We tried to challenge the economy of tape storage in the...
The LHCb experiment is currently taking data with a completely renewed DAQ system, capable for the first time of performing a full real-time reconstruction of all collision events occurring at LHC point 8.
The Collaboration is now pursuing a further upgrade (LHCb "Upgrade-II"), to enable the experiment to retain the same capability at luminosities an order of magnitude larger than the maximum...
As the largest particle physics laboratory in the world, CERN has more than 17000 collaborators spread around the globe. ATLAS, one of CERN’s experiments, has around 6000 active members and 300 associate institutes, all of which must go through the standard registration and updating procedures within CERN’s HR (Foundation) database. Simultaneously, the ATLAS Glance project, among other...
Jiangmen Underground Neutrino Observatory (JUNO), under construction in southern China, is a multi-purpose neutrino experiment designed to determine the neutrino mass hierarchy and precisely measure oscillation parameters. Equipped with a 20-kton liquid scintillator central detector viewed by 17,612 20-inch and 25,600 3-inch photomultiplier tubes, JUNO...
The LHCb experiment uses a triggerless readout system where its first stage (HLT1) is implemented on GPU cards. The full LHC event rate of 30 MHz is reduced to 1 MHz using efficient parallelisation techniques in order to meet throughput requirements. The GPU cards are hosted in the same servers as the FPGA cards receiving the detector data, which reduces the network to a minimum. In this talk,...
Among the biggest computational challenges for High Energy Physics (HEP) experiments are the increasingly large datasets being collected, which often require correspondingly complex data analyses. In particular, the PDFs used for modeling the experimental data can have hundreds of free parameters. The optimization of such models involves a significant computational effort and a...
The BaBar experiment collected electron-positron collisions at the SLAC National Accelerator Laboratory from 1999 to 2008. Although data taking stopped 15 years ago, the collaboration is still actively performing data analyses, publishing results, and giving presentations at international conferences. Special considerations were needed to carry out analyses using a computing environment that was...
Track reconstruction is one of the most important and challenging tasks in the offline data processing of collider experiments. For the BESIII detector working in the tau-charm energy region, considerable effort was previously devoted to improving the tracking performance with traditional methods such as template matching and the Hough transform. However, for difficult tracking tasks, such as the...
To accurately describe data, tuning the parameters of MC event Generators is essential. At first, experts performed tunings manually based on their sense of physics and goodness of fit. The software, Professor, made tuning more objective by employing polynomial surrogate functions to model the relationship between generator parameters and experimental observables (inner-loop optimization),...
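A minimal sketch of this surrogate-based inner/outer loop with one toy parameter and three toy observable bins (the real tool fits higher-order polynomials per bin to many full generator runs):

```python
import numpy as np
from scipy.optimize import minimize

# Toy setup: one generator parameter p, three observable bins.
# "MC runs" at sampled parameter points (in practice, full generator runs).
p_samples = np.linspace(0.0, 2.0, 11)

def toy_generator(p):
    return np.array([1.0 + 0.5 * p, 2.0 - 0.3 * p ** 2, 0.5 + p])

mc_values = np.array([toy_generator(p) for p in p_samples])

# Inner loop: fit a quadratic surrogate per observable bin.
surrogates = [np.polynomial.Polynomial.fit(p_samples, mc_values[:, b], deg=2)
              for b in range(mc_values.shape[1])]

# Outer loop: minimise a chi-square between surrogate predictions and "data".
data = np.array([1.6, 1.9, 1.7])
errors = np.array([0.1, 0.1, 0.1])

def chi2(p):
    pred = np.array([s(p[0]) for s in surrogates])
    return np.sum(((pred - data) / errors) ** 2)

result = minimize(chi2, x0=[1.0], bounds=[(0.0, 2.0)])
print("tuned parameter:", result.x)
```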
The LHCb experiment is one of the 4 LHC experiments at CERN. With more than 1500 members and tens of thousands of assets, the Collaboration requires systems that allow the extraction of data from many databases according to some very specific criteria. In LHCb there are 4 production web applications responsible for managing members and institutes, tracking assets and their current status,...
The former CMS Run 2 High Level Trigger (HLT) farm is one of the largest contributors to CMS compute resources, providing about 30k job slots for offline computing. The role of this farm has been evolving, from an opportunistic resource exploited during inter-fill periods in the LHC Run 2, to a nearly transparent extension of the CMS capacity at CERN during LS2 and into the LHC Run 3 started...
The software-based High Level Trigger (HLT) of CMS reduces the data readout rate from 100 kHz (obtained from the Level-1 trigger) to around 2 kHz. It makes use of all detector subsystems and runs a streamlined version of the CMS reconstruction. Run 1 and Run 2 of the LHC saw the reconstruction algorithm run on a CPU farm (~30000 CPUs in 2018). But the need for increased computational power as we...
The High-Luminosity LHC (HL-LHC) will provide an order of magnitude increase in integrated luminosity and enhance the discovery reach for new phenomena. The increased pile-up foreseen during the HL-LHC necessitates major upgrades to the ATLAS detector and trigger. The Phase-II trigger will consist of two levels, a hardware-based Level-0 trigger and an Event Filter (EF) with tracking...
The High Luminosity upgrade to the LHC (HL-LHC) is expected to deliver scientific data at the multi-exabyte scale. In order to address this unprecedented data storage challenge, the ATLAS experiment launched the Data Carousel project in 2018. Data Carousel is a tape-driven workflow whereby bulk production campaigns with input data resident on tape are executed by staging and promptly...
The University of Victoria (UVic) operates an Infrastructure-as-a-Service science cloud for Canadian researchers, and a WLCG T2 grid site for the ATLAS experiment at CERN. At first, these were two distinctly separate systems, but over time we have taken steps to migrate the T2 grid services to the cloud. This process has been significantly facilitated by basing our approach on Kubernetes, a...
In this paper we discuss the CMS open data publishing workflows, summarising experience with eight releases of CMS open data on the CERN Open Data portal since its initial launch in 2014. We present the recent enhancements of data curation procedures, including (i) mining information about collision and simulated datasets with accompanying generation parameters and processing configuration...
HammerCloud (HC) is a testing service and framework for continuous functional tests, on-demand large-scale stress tests, and performance benchmarks. It checks the computing resources and various components of distributed systems with realistic full-chain experiment workflows.
The HammerCloud software was initially developed in Python 2. After support for Python 2 was discontinued in 2020,...
The CERN IT Department is responsible for ensuring the integrity and security of data stored in the IT Storage Services. General storage backends such as EOSHOME/PROJECT/MEDIA and CEPHFS are used to store data for a wide range of use cases for all stakeholders at CERN, including experiment project spaces and user home directories.
In recent years a backup system, CBACK, was developed based...
To better understand experimental conditions and performances of the Large Hadron Collider (LHC), CERN experiments execute tens of thousands of loosely-coupled Monte Carlo simulation workflows per hour on hundreds of thousands - small to mid-size - distributed computing resources federated by the Worldwide LHC Computing Grid (WLCG). While this approach has been reliable during the first LHC...
The Super Tau Charm Facility (STCF) proposed in China is a new-generation electron–positron collider with center-of-mass energies covering 2-7 GeV and a peak luminosity of 5*10^34 cm^-2s^-1. The offline software of STCF (OSCAR) is developed to support the offline data processing, including detector simulation, reconstruction, calibration as well as physics analysis. To meet STCF’s specific...
The Glance project is responsible for over 20 systems across three CERN experiments: ALICE, ATLAS and LHCb. Students, engineers, physicists and technicians have been using systems designed and managed by Glance on a daily basis for over 20 years. In order to produce quality products continuously, considering internal stakeholders' ever-evolving requests, there is a need for standardization....
Particle track reconstruction is the most computationally intensive process in nuclear physics experiments. Traditional algorithms use a combinatorial approach that exhaustively tests track measurements (hits) to identify those that form an actual particle trajectory. In this article we describe the development of machine learning models that assist the tracking algorithm by identifying...
Performing a physics analysis of data from simulations of a high energy experiment requires the application of several common procedures, from obtaining and reading the data to producing detailed plots for interpretation. Implementing common procedures in a general analysis framework allows the analyzer to focus on the unique parts of their analysis. Over the past few years, EIC simulations...
The ATLAS experiment at CERN is one of the largest scientific machines built to date and will have ever growing computing needs as the Large Hadron Collider collects an increasingly larger volume of data over the next 20 years. ATLAS is conducting R&D projects on Amazon and Google clouds as complementary resources for distributed computing, focusing on some of the key features of commercial...
Making the large datasets collected at the LHC accessible to the public is a considerable challenge given the complexity and volume of data. Yet to harness the full scientific potential of the facility, it is essential to enable meaningful access to the data by the broadest physics community possible. Here we present an application, the LHCb Ntuple Wizard, which leverages the existing...
Data from the LHC detectors are not easily represented using regular data structures. These detectors are composed of several species of subdetectors and therefore produce heterogeneous data. LHC detectors are granular by design so that nearby particles may be distinguished. As a consequence, LHC data are sparse, in that many detector channels are not active during a given collision event....
Operational analytics is the direction of research related to the analysis of the current state of computing processes and the prediction of the future in order to anticipate imbalances and take timely measures to stabilize a complex system. There are two relevant areas in ATLAS Distributed Computing that are currently in the focus of studies: end-user physics analysis including the forecast...
FastCaloSim is a parameterized simulation of the particle energy response and of the energy distribution in the ATLAS calorimeter. It is a relatively small and self-contained package with massive inherent parallelism and captures the essence of GPU offloading via important operations like data transfer, memory initialization, floating point operations, and reduction. Thus, it was identified as...
The LHCb experiment has recently started a new period of data taking after a major upgrade in both software and hardware. One of the biggest challenges has been the migration of the first part of the trigger system (HLT1) into a parallel GPU architecture framework called Allen, which performs a partial reconstruction of most of the LHCb sub-detectors. In Allen, the reconstruction of the...
The recent major upgrade of the ALICE Experiment at CERN’s Large Hadron Collider has been coupled with the development of a new Online-Offline computing system capable of interacting with a sustained input throughput of 3.5TB/s. To facilitate the control of the experiment, new web applications have been developed and deployed to be used 24 hours a day, 365 days a year in the control room and...
The CERN Tape Archive (CTA) was conceived as the successor to CASTOR and as the tape back-end to EOS, designed for the archival storage of data from LHC Run-3 and other experimental programmes at CERN. In the wider WLCG, the tape software landscape is quite heterogeneous, but we are now entering a period of consolidation. This has led to a number of sites in WLCG (and beyond) reevaluating their...
We have been studying the use of deep neural networks (DNNs) to identify and locate primary vertices (PVs) in proton-proton collisions at the LHC. Earlier work focused on finding primary vertices in simulated LHCb data using a hybrid approach that started with kernel density estimators (KDEs) derived from the ensemble of charged track parameters heuristically and predicted “target histogram”...
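As a rough illustration of the KDE step (toy numbers and SciPy tools of our own choosing, not the actual implementation), one can build a kernel density along the beam axis from track z positions and take its peaks as primary-vertex candidates:

```python
import numpy as np
from scipy.stats import gaussian_kde
from scipy.signal import find_peaks

rng = np.random.default_rng(42)

# Toy track z-positions at closest approach to the beamline:
# two primary vertices at z = -30 mm and z = +15 mm plus diffuse background.
track_z = np.concatenate([
    rng.normal(-30.0, 0.5, 40),
    rng.normal(15.0, 0.5, 25),
    rng.uniform(-100.0, 100.0, 10),
])

# Kernel density estimate along the beam axis.
kde = gaussian_kde(track_z, bw_method=0.05)
z_grid = np.linspace(-100.0, 100.0, 2001)
density = kde(z_grid)

# Primary-vertex candidates appear as peaks in the density.
peaks, _ = find_peaks(density, height=0.2 * density.max())
print("PV candidates near z =", z_grid[peaks])
```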
For LHC Run3 the ALICE experiment software stack has been completely refactored, incorporating support for multicore job execution. The new multicore jobs spawn multiple processes and threads within the payload. Given that some of the deployed processes may be short-lived, accounting for their resource consumption presents a challenge. This article presents the newly developed methodology for...
CERN, as many large organizations, relies on multiple communication means for different use-cases and teams.
Email and mailing lists are the most popular ones, but more modern communications systems gain traction such as Mattermost and Push notifications.
On one end of the spectrum we have communication teams writing individual emails to users on a daily basis, which may be small targets, or...
The Deep Underground Neutrino Experiment (DUNE) is a long-baseline experiment which aims to study neutrino oscillation and astroparticle physics. It will produce vast amounts of metadata, which describe the data coming from the read-out of the primary DUNE detectors. Various databases will make up the overall DB architecture for this metadata. ProtoDUNE at CERN is the largest existing...
Research in high energy physics (HEP) heavily relies on domain-specific digital contents. We reflect on the interpretation of principles of Findability, Accessibility, Interoperability, and Reusability (FAIR) in preservation and distribution of such digital objects. As a case study, we demonstrate the implementation of an end-to-end support infrastructure for preserving and accessing Universal...
An all-inclusive analysis of costs for on-premises and public cloud-based solutions to handle the bulk of HEP computing requirements shows that dedicated on-premises deployments are still the most cost-effective. Since the advent of public cloud services, the HEP community has engaged in multiple proofs of concept to study the technical viability of using cloud resources; however, the...
Apache Spark is a distributed computing framework which can process very large datasets using large clusters of servers. Laurelin is a Java-based implementation of ROOT I/O which allows Spark to read and write ROOT files from common HEP storage systems without a dependency on the C++ implementation of ROOT. We discuss improvements due to the migration to an Arrow-based in-memory representation...
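A minimal PySpark sketch of reading a ROOT TTree through Laurelin, assuming the Laurelin package is made available to Spark and registers the "root" data-source format; the Maven coordinates, version, file path and tree name below are placeholders:

```python
from pyspark.sql import SparkSession

# Assumption: Laurelin jars resolvable via spark.jars.packages (version is a placeholder).
spark = (
    SparkSession.builder
    .appName("laurelin-example")
    .config("spark.jars.packages", "edu.vanderbilt.accre:laurelin:1.6.0")
    .getOrCreate()
)

# Read a TTree into a Spark DataFrame; path and tree name are placeholders.
events = (
    spark.read.format("root")
    .option("tree", "Events")
    .load("hdfs:///store/sample.root")
)
events.printSchema()
print(events.count())
```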
The CMS experiment started to utilize Graphics Processing Units (GPU) to accelerate the online reconstruction and event selection running on its High Level Trigger (HLT) farm in the 2022 data taking period. The projections of the HLT farm to the High-Luminosity LHC foresee a significant use of compute accelerators in the LHC Run 4 and onwards in order to keep the cost, size, and power budget...
The sensitivity of modern HEP experiments to New Physics (NP) is limited by the hardware-level triggers used to select data online, resulting in a bias in the data collected. The deployment of efficient data acquisition systems integrated with online processing pipelines is instrumental to increase the experiments' sensitivity to the discovery of any anomaly or possible signal of NP. In...
Quantum Computing (QC) is a promising early-stage technology that offers novel approaches to simulation and analysis in nuclear and high energy physics (NHEP). By basing computations directly on quantum mechanical phenomena, speedups and other advantages for many computationally hard tasks are potentially achievable, albeit both the theoretical underpinning and the practical realization are...
No single organisation has the resources to defend its services alone against most modern malicious actors and so we must protect ourselves as a community. In the face of determined and well-resourced attackers, we must actively collaborate in this effort across HEP and more broadly across Research and Education (R&E).
Parallel efforts are necessary to appropriately respond to this...
EvtGen is a simulation generator specialized for decays of heavy hadrons. Since its early development in the 90’s, the generator has been extensively used and has become today an essential tool for heavy-flavour physics analyses. Throughout this time, its source code has remained mostly unchanged, except for additions of new decay models. In view of the upcoming boom of multi-threaded...
The recent release of AwkwardArray 2.0 significantly changes the way that lazy evaluation and task-graph building are handled in columnar analysis. The Dask parallel processing library is now used for these pieces of functionality with AwkwardArray, and this change affords new ways of optimizing columnar analysis and distributing it on clusters. In particular this allows optimization of a task...
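A minimal sketch of what this Dask-backed columnar workflow can look like, assuming a recent uproot with dask-awkward support is installed; the file, tree and branch names below are placeholders:

```python
import uproot
import awkward as ak

# Build a lazy, Dask-backed view of a TTree (task graph only, no I/O yet).
events = uproot.dask("data.root:Events")          # placeholder file and tree

# Operations on the lazy collection only extend the task graph.
dimuon = events[events.nMuon >= 2]                # placeholder branch names
leading_pt = ak.max(dimuon.Muon_pt, axis=1)

# Task-graph optimization (e.g. reading only the needed branches and chunks)
# and execution happen at compute time, possibly on a distributed cluster.
print(leading_pt.compute())
```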
We will present the rapid progress, vision and outlook across multiple state of the art development lines within the Global Network Advancement Group and its Data Intensive Sciences and SENSE/AutoGOLE working groups, which are designed to meet the present and future needs and address the challenges of the Large Hadron Collider and other science programs with global reach. Since it was founded...
Large research infrastructures, such as DESY and CERN, in the field of the exploration of the universe and matter (ErUM) are significantly driving the digital transformation of the future. The German action plan "ErUM-Data" promotes this transformation through the interdisciplinary networking and financial support of 20,000 scientists.
The ErUM-Data-Hub (https://erumdatahub.de) serves as a...
The development of an LHC physics analysis involves numerous investigations that require the repeated processing of terabytes of measured and simulated data. Thus, a rapid processing turnaround is beneficial to the scientific process. We identified two bottlenecks in analysis-independent algorithms and developed the following solutions.
First, inputs are now cached on individual SSD caches of...
Machine learning (ML) and deep learning (DL) are powerful tools for modeling complex systems. However, most of the standard models in ML/DL do not provide a measure of confidence or uncertainties associated with their predictions. Further, these models can only be trained on available data. During operation, models may encounter data samples poorly reflected in training data. These data...
The ATLAS experiment Data Acquisition (DAQ) system will be extensively upgraded to fully exploit the High-Luminosity LHC (HL-LHC) upgrade, allowing it to record data at unprecedented rates. The detector will be read out at 1 MHz generating over 5 TB/s of data. This design poses significant challenges for the Ethernet-based network as it will be required to transport 20 times more data than...
Rucio is a Data Management software that has become a de-facto standard in the HEP community and beyond. It allows the management of large volumes of data over their full lifecycle. The Belle II experiment located at KEK (Japan) recently moved to Rucio to manage its data over the coming decade (O(10) PB/year). In addition to its Data Management functionalities, Rucio also provides support for...
We explore interpretability of deep neural network (DNN) models designed for identifying jets coming from top quark decay in the high energy proton-proton collisions at the Large Hadron Collider (LHC). Using state-of-the-art methods of explainable AI (XAI), we identify which features play the most important roles in identifying the top jets, how and why feature importance varies across...
The upgrade of the Large Hadron Collider (LHC) is progressing well; during the next decade we will face a ten-fold increase in experimental data. The application of state-of-the-art detectors and data acquisition systems requires high-performance simulation support, which is even more demanding in the case of heavy-ion collisions. Our basic aim was to develop a Monte-Carlo simulation code for heavy ion...
In particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo production. However, physicists performing data analyses are usually required to steer their individual, complex workflows manually, frequently involving job submission in several stages and interaction with distributed storage systems by hand. This process is not...
The NOTED (Network Optimised Transfer of Experimental Data) project has successfully demonstrated the ability to dynamically reconfigure network links to increase the effective bandwidth available for FTS-driven transfers between endpoints, such as WLCG sites by inspecting on-going data transfers and so identifying those that are bandwidth-limited for a long period of time. Recently, the...
Over the last 20 years, thanks to the development of quantum technologies, it has been possible to deploy quantum algorithms and applications, which before were only accessible through simulation, on real quantum hardware. The currently available devices are often referred to as noisy intermediate-scale quantum (NISQ) computers and they require calibration routines in order to obtain...
XAS (synchrotron X-ray absorption spectroscopy) scans the X-ray photon energy to measure the structure of the X-ray absorption coefficient as it changes with energy. In spectral experiments, determining the composition and structure of unknown samples requires data collection first, followed by data processing and analysis. This takes a lot of time, and there can be no errors in the...
The organization of seminars and conferences was strongly influenced by the covid-19 pandemic.
In the early period of the pandemic, many events were canceled or held completely online, using video conferencing tools such as ZOOM or MS Teams. Later, thanks to large-scale vaccination and immunization, it became possible to organize large in-person events again. Nevertheless, given some local...
Data analysis in particle physics is socially distributed: unlike centrally developed and executed reconstruction pipelines, the analysis work performed after Analysis Object Descriptions (AODs) are made and before the final paper review—which includes particle and event selection, systematic error handling, decay chain reconstruction, histogram aggregation, fitting, statistical models, and...
The ATLAS experiment is preparing a major change in the conditions data infrastructure in view of Run 4. In this presentation we will expose the main motivations for the new design (called CREST, for Conditions-REST), the ongoing changes in the DB architecture, and present the developments for the deployment of the new system. The main goal is to set up a parallel infrastructure for full-scale...
Data caches of various forms have been widely deployed in the context of commercial and research and education networks, but their common positioning at the Edge limits their utility from a network operator perspective. When deployed outside the network core, providers lack visibility to make decisions or apply traffic engineering based on data access patterns and caching node location.
As...
The task of identifying B meson flavor at the primary interaction point in the LHCb detector is crucial for measurements of mixing and time-dependent CP violation.
Flavor tagging is usually done with a small number of expert systems that find important tracks to infer the B flavor from.
Recent advances show that replacing all of those expert systems with one ML algorithm that considers...
Realistic environments for prototyping, studying and improving analysis workflows are a crucial element on the way towards user-friendly physics analysis at HL-LHC scale. The IRIS-HEP Analysis Grand Challenge (AGC) provides such an environment. It defines a scalable and modular analysis task that captures relevant workflow aspects, ranging from large-scale data processing and handling of...
The Quantum Angle Generator (QAG) constitutes a new quantum machine learning model designed to generate accurate images on current Noisy Intermediate-Scale Quantum (NISQ) devices. Variational quantum circuits constitute the core of the QAG model, and various circuit architectures are evaluated. In combination with the so-called MERA-upsampling architecture, the QAG model achieves excellent...
Recently, a workshop on Artificial Intelligence for the Electron Ion Collider (AI4EIC) has been held at the College of William & Mary. The workshop covered all active and potential areas of applications of AI/ML for the EIC; it also had a strong outreach and educational component, with different tutorials given by experts in AI and machine learning from national labs, universities, and industry...
In the near future, the LHC detector will deliver much more data to be processed. Therefore, new techniques are required to deal with such a large amount of data. Recent studies showed that one of the quantum computing techniques, quantum annealing (QA), can be used to perform the particle tracking with efficiency higher than 90% even in the dense environment. The algorithm starts from...
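Such annealing approaches typically phrase the selection of candidate track segments (or triplets) as a quadratic unconstrained binary optimisation (QUBO); schematically (generic form, not the exact objective of the cited studies):

```latex
O(\mathbf{x}) \;=\; \sum_i a_i\, x_i \;+\; \sum_{i<j} b_{ij}\, x_i x_j ,
\qquad x_i \in \{0, 1\}
```

Here each binary variable $x_i$ flags whether a candidate segment is kept, the linear coefficients $a_i$ encode segment quality, and the couplings $b_{ij}$ reward geometrically compatible pairs and penalise conflicting ones; the annealer then searches for the minimum-energy configuration.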
The ALICE experiment at CERN has undergone a substantial detector, readout and software upgrade for the LHC Run3. A signature part of the upgrade is the triggerless detector readout, which necessitates a real time lossy data compression from 1.1TB/s to 100GB/s performed on a GPU/CPU cluster of 250 nodes. To perform this compression, a significant part of the software, which traditionally is...
The growing amount of data generated by the LHC requires a shift in how HEP analysis tasks are approached. Efforts to address this computational challenge have led to the rise of a middle-man software layer, a mixture of simple, effective APIs and fast execution engines underneath. Having common, open and reproducible analysis benchmarks proves beneficial in the development of these modern...
In the LHCb experiment, a wide variety of Monte Carlo simulated samples need to be produced for the experiment’s physics programme. LHCb has a centralised production system for simulating, reconstructing and processing collision data, which runs on the DIRAC backend on the WLCG.
To cope with a large set of different types of sample, requests for simulation production are based on a concept of...
An increasingly frequent challenge faced in HEP data analysis is to characterize the agreement between a prediction that depends on a dozen or more model parameters–such as predictions coming from an effective field theory (EFT) framework–and the observed data. Traditionally, such characterizations take the form of a negative log likelihood (NLL) distribution, which can only be evaluated...
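For concreteness, one common binned form of such an NLL (our notation, not necessarily the parameterization used in this work) is

```latex
-2\ln L(\vec{c}) \;=\; 2\sum_{i \in \mathrm{bins}}
\left[ \nu_i(\vec{c}) - n_i + n_i \ln\frac{n_i}{\nu_i(\vec{c})} \right],
\qquad
\nu_i(\vec{c}) \;=\; b_i + s_i^{\mathrm{SM}}
 + \sum_j a_{ij}\, c_j + \sum_{j \le k} B_{ijk}\, c_j c_k
```

where $n_i$ are observed counts, $\nu_i$ is the prediction as a function of the Wilson coefficients $\vec{c}$, and the quadratic dependence on the coefficients reflects the squared-amplitude structure of EFT predictions. Evaluating this over a dozen or more coefficients is what makes a naive point-by-point scan impractical.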
The CMS Submission Infrastructure (SI) is the main computing resource provisioning system for CMS workloads. A number of HTCondor pools are employed to manage this infrastructure, which aggregates geographically distributed resources from the WLCG and other providers. Historically, the model of authentication among the diverse components of this infrastructure has relied on the Grid Security...
With the construction and operation of fourth-generation light sources like European Synchrotron Radiation Facility Extremely Brilliant Source (ESRF-EBS), Advanced Photon Source Upgrade (APS-U), Advanced Light Source Upgrade (ALS-U), High Energy Photon Source (HEPS), etc., several advanced biological macromolecule crystallography (MX) beamlines are or will be built and thereby the huge amount...
To meet the computing challenges of upcoming experiments, software training efforts play an essential role in imparting best practices and popularizing new technologies. Because many of the taught skills are experiment-independent, the HSF/IRIS-HEP training group coordinates between different training initiatives while building a training center that provides students with various training...
The Superconducting Quantum Materials and Systems Center (SQMS) and the Computational Science and AI Directorate (CSAID) at Fermi National Accelerator Laboratory and Rigetti Computing have teamed up to define and deliver a standard pathway for quantum computing at Rigetti from HEPCloud. HEPCloud now provides common infrastructure and interfacing for managing connectivity and providing access...
Searches for new physics set exclusion limits in parameter spaces of typically up to 2 dimensions. However, the relevant theory parameter space is usually of a higher dimension but only a subspace is covered due to the computing time requirements of signal process simulations. An Active Learning approach is presented to address this limitation. Compared to the usual grid sampling, it reduces...
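A minimal sketch of the active-learning idea, with a toy two-dimensional parameter space and a stand-in function for the expensive simulation-plus-limit-setting step (the classifier and sampling strategy used in the actual work may differ):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessClassifier

rng = np.random.default_rng(1)

def run_expensive_simulation(points):
    """Stand-in for signal simulation + limit setting: returns 1 if excluded.
    Here the true (unknown) boundary is a circle of radius 1 in theory space."""
    return (points[:, 0] ** 2 + points[:, 1] ** 2 < 1.0).astype(int)

# Initial coarse sample, seeded with one point of each class for the toy example.
X = np.vstack([np.array([[0.0, 0.0], [1.8, 1.8]]),
               rng.uniform(-2, 2, size=(20, 2))])
y = run_expensive_simulation(X)

for iteration in range(5):
    clf = GaussianProcessClassifier().fit(X, y)
    # Score a large pool of candidates; pick those closest to p = 0.5,
    # i.e. where the estimated exclusion boundary is least certain.
    pool = rng.uniform(-2, 2, size=(2000, 2))
    uncertainty = np.abs(clf.predict_proba(pool)[:, 1] - 0.5)
    new_points = pool[np.argsort(uncertainty)[:10]]
    X = np.vstack([X, new_points])
    y = np.concatenate([y, run_expensive_simulation(new_points)])

print("total simulated points:", len(X))
```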
The Deep Underground Neutrino Experiment (DUNE) is a next generation long-baseline neutrino experiment based in the USA which is expected to start taking data in 2029. DUNE aims to precisely measure neutrino oscillation parameters by detecting neutrinos from the LBNF beamline (Fermilab) at the Far Detector, 1300 kilometres away, in South Dakota. The Far Detector will consist of four cryogenic...
A comprehensive analysis of the HEP (High Energy Physics) experiment traffic across LHCONE (Large Hadron Collider Open Network Environment) and other networks, is essential for immediate network optimisation (for example by the NOTED project) and highly desirable for long-term network planning. Such an analysis requires two steps: tagging of network packets to indicate the type and owner of...
PyPWA is a toolkit designed to fit (regression) parametric models to data and to generate distributions (simulation) according to a given model (function). PyPWA software has been written under the python ecosystem with the goal of performing Amplitude or Partial Wave Analysis (PWA) in nuclear and particle physics experiments. The aim of spectroscopy experiments is often the...
The HSF Conditions Databases activity is a forum for cross-experiment discussions hoping for as broad a participation as possible. It grew out of the HSF Community White Paper work to study conditions data access, where experts from ATLAS, Belle II, and CMS converged on a common language and proposed a schema that represents best practice. The focus of the HSF work is the most difficult use...
The continuous growth in model complexity in high-energy physics (HEP) collider experiments demands increasingly time-consuming model fits. We show first results on the application of conditional invertible networks (cINNs) to this challenge. Specifically, we construct and train a cINN to learn the mapping from signal strength modifiers to observables and its inverse. The resulting network...
The study of the decays of $B$ mesons is a key component of modern experiments which probe heavy quark mixing and $CP$ violation, and may lead to concrete deviations from the predictions of the Standard Model [1]. Flavour tagging, the process of determining the quark flavour composition of $B$ mesons created in entangled pairs at particle accelerators, is an essential component of this...
Monte Carlo simulations are a key tool for the physics programmes of High Energy Physics experiments. Their accuracy and reliability are of the utmost importance. A full suite of verifications is in place for the LHCb Simulation software to ensure the quality of the simulated samples produced.
In this contribution we will give a short overview of the procedure and the tests in place, that exploits the...
Geant4, the leading detector simulation toolkit used in High Energy Physics, employs a set of physics models to simulate interactions of particles with matter across a wide range of interaction energies. These models, especially the hadronic ones, rely largely on directly measured cross-sections and inclusive characteristics, and use physically motivated parameters. However, they generally aim...
Large-scale research facilities are becoming prevalent in the modern scientific landscape. One of these facilities' primary responsibilities is to make sure that users can process and analyse measurement data for publication. To allow for barrier-less access to those highly complex experiments, almost all beamlines require fast feedback capable of manipulating and visualizing data online to...
The ATLAS EventIndex is a global catalogue of the events collected, processed or generated by the ATLAS experiment. The system was upgraded in advance of LHC Run 3, with a migration of the Run 1 and Run 2 data from HDFS MapFiles to HBase tables with a Phoenix interface. The frameworks for testing functionality and performance of the new system have been developed. There are two types of tests...
Deep underground, the removal of rock to fashion three soccer-field-sized caverns is underway, as is detector prototyping. In 2024, the first DUNE far detector will be constructed as a large cryostat, instrumented as a traditional tracking calorimeter but in a cold bath of xenon-doped liquid argon. An Epic Games Unreal Engine rendered 3D simulation of the underground laboratory has...
ALICE has upgraded many of its detectors for LHC Run 3 to operate in continuous readout mode recording Pb-Pb collisions at 50 kHz interaction rate without trigger.
This results in the need to process data in real time at rates 50 times higher than during Run 2. In order to tackle such a challenge we introduced O2, a new computing system and the associated infrastructure. Designed and...
Streaming Readout Data Acquisition systems coupled with distributed resources spread over vast geographic distances present new challenges to the next generation of experiments. High bandwidth modern network connectivity opens the possibility to utilize large, general-use, HTC systems that are not necessarily located close to the experiment. Near real-time response rates and workflow...
As nuclear physics collaborations and experiments increase in size, the data management and software practices in this community have changed as well. Large nuclear physics experiments at Brookhaven National Lab (STAR, PHENIX, sPHENIX), at Jefferson Lab (GlueX, CLAS12, MOLLER), and at the Electron-Ion Collider (ePIC) are taking different approaches to data management, building on existing...
This presentation will cover the content of the report delivered by the Snowmass Computational Frontier late in 2022. A description of the frontier organization and various preparatory events, including the Seattle Community Summer Study (CSS), will be followed by a discussion on the evolution of computing hardware and the impact of newly established and emerging technologies, including...
The U.S. Nuclear Physics community has been conducting long-range planning (LRP) for nuclear science since the late 1970s. The process is known as the Nuclear Science Advisory Committee (NSAC) LRP, with NSAC being an advisory body jointly appointed by the U.S. Department of Energy and the U.S. National Science Foundation. The last NSAC LRP was completed in 2015 and the current NSAC LRP is ongoing...
Today's students are tomorrow's leaders in science, technology, engineering and math. To ensure the best minds reach their potential tomorrow, it's vital to ensure that students not only experience meaningful STEM learning today, but also have the opportunities and support to pursue careers in a STEM environment that is more welcoming, inclusive and just. This panel will feature expertise from...
The analysis category was introduced in Geant4 almost ten years ago (in 2014) with the aim of providing users with a lightweight analysis tool, available as part of the Geant4 installation without the need to link to an external analysis package. It helps capture statistical data in the form of histograms and n-tuples and store these in files in four different formats. It was already presented at...
High energy physics experiments are pushing forward precision measurements and searching for new physics beyond the Standard Model. Simulating and generating the large volumes of data needed to meet the physics requirements is urgent, and making good use of the existing power of supercomputers is one of the most active areas of high energy physics computing. Taking the BESIII experiment as an illustration, we...
To increase the science rate for high data rates/volumes, Thomas Jefferson National Accelerator Facility (JLab) has partnered with Energy Sciences Network (ESnet) to define an edge to data center traffic shaping/steering transport capability featuring data event-aware network shaping and forwarding.
The keystone of this ESnet JLab FPGA Accelerated Transport (EJFAT) is the joint development...
We present tools for high-performance analysis written in pure Julia, a just-in-time (JIT) compiled dynamic programming language with a high-level syntax and performance. The packages we present center around UnROOT.jl, a pure Julia ROOT file I/O package that is optimized for speed, lazy reading, flexibility, and thread safety.
We discuss what affects performance in Julia, the challenges,...
EPOS 4 is the latest version of the high-energy collision event generator EPOS, released publicly in 2022. It was delivered with improvements in several areas: the theoretical bases on which it relies, how they are handled technically, and the user interface and data compatibility.
This last point is especially important, as part of a commitment to provide the widest...
Machine learning (ML) has become ubiquitous in high energy physics (HEP) for many tasks, including classification, regression, reconstruction, and simulations. To facilitate development in this area, and to make such research more accessible and reproducible, we require standard, easy-to-access datasets and metrics. To this end, we develop the open source Python JetNet library with easily...
The Data Lake concept has promised increased value to science and more efficient operations for storage compared to the traditional isolated storage deployments. Building on the established distributed dCache serving as the Nordic Tier-1 storage for LHC data, we have also integrated tier-2 pledged storage in Slovenia, Sweden, and Switzerland, resulting in a coherent storage space well above...
Communicating the science and achievements of the ATLAS Experiment is a core objective of the ATLAS Collaboration. This talk will explore the range of communication strategies adopted in ATLAS communications, with particular focus on how these have been impacted by the COVID-19 pandemic. In particular, an overview of ATLAS’ digital communication platforms will be given – with focus on social...
NA62 is a K meson physics experiment based on a decay-in-flight technique, whose Trigger and Data Acquisition (TDAQ) system is multi-level and network-based. A reorganization of both the beam line and the detector is foreseen in the next years to complete and extend the physics reach of NA62. One of the challenging aspects of this upgrade is a significant increase (x4) in the event rate...
The newly formed EPIC Collaboration has recently laid the foundations of its software infrastructure. Noticeably, several forward-looking aspects of the software are favorable for Artificial Intelligence (AI) and Machine Learning (ML) applications and utilization of heterogeneous resources. EPIC has a unique opportunity to integrate AI/ML from the beginning: the number of AI/ML activities is...
We will describe how ServiceX, an IRIS-HEP project, generates C++ or python code from user queries and orchestrates thousands of experiment-provided docker containers to filter and select event data. The source datafiles are identified using Rucio. We will show how the service encapsulates best practice for using Rucio and helps inexperienced analysers get up to speed quickly. The data is...
For the new Geant4 series 11.X electromagnetic (EM) physics sub-libraries were revised and reorganized in view of requirements for simulation of Phase-2 LHC experiments. EM physics simulation software takes a significant part of CPU during massive production of Monte Carlo events for LHC experiments. We present recent evolution of Geant4 EM sub-libraries for simulation of gamma, electron, and...
A mechanism to store in databases all the parameters needed to simulate the detector response to physics interactions is presented. This includes geometry, materials, magnetic field, and electronics.
GEMC includes a python API to populate the databases, and the software to run the Monte-Carlo simulation. The engine is written in C++ and uses Geant4 for the passage of particles through...
Rucio, the data management software initially developed for ATLAS, has been in use at Belle II since January 2021. After the transition to Rucio, new features and functionality were implemented in Belle II grid tools based on Rucio, to improve the experience of grid users. The container structure in the Rucio File Catalog enabled us to define collections of arbitrary datasets, allowing the...
The CMS experiment is working to integrate an increasing number of High Performance Computing (HPC) resources into its distributed computing infrastructure. The case of the Barcelona Supercomputing Center (BSC) is particularly challenging as severe network restrictions prevent the use of CMS standard computing solutions. The CIEMAT CMS group has performed significant work in order to overcome...
Field Programmable Gate Arrays (FPGAs) are playing an increasingly important role in the sampling and data processing industry due to their intrinsically highly parallel architecture, low power consumption, and flexibility to execute custom algorithms. In particular, the use of FPGAs to perform machine learning inference is increasingly growing thanks to the development of high-level synthesis...
China’s High Energy Photon Source (HEPS), the first national high-energy synchrotron radiation light source and soon one of the world’s brightest fourth-generation synchrotron radiation facilities, is under intense construction in Beijing’s Huairou District and will be completed in 2025.
To make sure that the huge amount of data collected at HEPS is accurate, available and...
The International Particle Physics Outreach Group (IPPOG) is a network of scientists, science educators and communication specialists working across the globe in informal science education and public engagement for particle physics. The primary methodology adopted by IPPOG includes the direct participation of scientists active in current research with education and communication specialists,...
Awkward Arrays is a library for performing NumPy-like computations on nested, variable-sized data, enabling array-oriented programming on arbitrary data structures in Python. However, imperative (procedural) solutions can sometimes be easier to write or faster to run. Performant imperative programming requires compilation; JIT-compilation makes it convenient to compile in an interactive Python...
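A small sketch of such JIT-compiled imperative code, relying on Awkward Array's Numba integration (toy data of our own; illustrative only):

```python
import awkward as ak
import numba as nb

@nb.njit
def sum_pt_over_threshold(events_pt, threshold):
    """Imperative loop over a jagged array of per-event particle pT values."""
    out = 0.0
    for event in events_pt:      # iterate over events
        for pt in event:         # iterate over particles within an event
            if pt > threshold:
                out += pt
    return out

# Toy jagged data: three events with 3, 0 and 2 particles.
events_pt = ak.Array([[10.0, 25.0, 3.0], [], [40.0, 7.5]])
print(sum_pt_over_threshold(events_pt, 20.0))   # expected: 25.0 + 40.0 = 65.0
```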
A critical challenge of performing data transfers or remote reads is to be as fast and efficient as possible while, at the same time, keeping the usage of system resources as low as possible. Ideally, the software that manages these data transfers should be able to organize them so that they can run up to the hardware limits. Significant portions of LHC analysis use the same datasets,...
The recent developments in ROOT/TMVA focus on fast machine learning inference, which enables analysts to deploy their machine learning models rapidly on large scale datasets. A new tool has been recently developed, SOFIE, allowing for generating C++ code for evaluation of deep learning models, which are trained from external tools such as Tensorflow or PyTorch.
While Python-based deep...
The Vera C. Rubin Observatory is preparing for execution of the most ambitious astronomical survey ever attempted, the Legacy Survey of Space and Time (LSST). Currently in its final phase of construction in the Andes mountains in Chile and due to start operations in late 2024 for 10 years, its 8.4-meter telescope will nightly scan the southern sky and collect images of the entire visible sky...
The High-Luminosity LHC will open an unprecedented window on the weak-scale nature of the universe, providing high-precision measurements of the standard model as well as searches for new physics beyond the standard model. Such precision measurements and searches require information-rich datasets with a statistical power that matches the high-luminosity provided by the Phase-2 upgrade of the...
Computing demands for large scientific experiments, such as the CMS experiment at CERN, will increase dramatically in the next decades. To complement the future performance increases of software running on CPUs, explorations of coprocessor usage in data processing hold great potential and interest. We explore the novel approach of Services for Optimized Network Inference on Coprocessors...
The NSF-funded IRIS-HEP "Training, Education & Outreach" program and QuarkNet are partnering to enable and expand software training for the high school teachers with a goal to tap, grow and diversify the talent pipeline from K-12 students for future cyberinfrastructure. IRIS-HEP (https://iris-hep.org/) is a software institute that aims to...
The LHCb software has undergone a major upgrade in view of data taking with higher luminosity in Run3 of the LHC at CERN.
The LHCb simulation framework, Gauss, had to be adapted to follow the changes in modern technologies of the underlying experiment core software and to introduce new simulation techniques to cope with the increase of the required amount of simulated data. Additional...
Neural Networks (NN) are often trained offline on large datasets and deployed on specialized hardware for inference, with a strict separation between training and inference. However, in many realistic applications the training environment differs from the real world or data arrive in a streaming fashion and are continuously changing. In these scenarios, the ability to continuously train and...
ALICE is one of the four large experiments at the CERN LHC designed to study the structure and origins of matter in collisions of heavy ions (and protons) at ultra-relativistic energies. The experiment measures the particles produced as a result of collisions in its center so that it can reconstruct and study the evolution of the system produced during these collisions. To perform these...
In the past years the landscape of tools for expressing parallel algorithms in a portable way across various compute accelerators has continued to evolve significantly. There are many technologies on the market that provide portability between CPU, GPUs from several vendors, and in some cases even FPGAs. These technologies include C++ libraries such as Alpaka and Kokkos, compiler directives...
Gaussino is a new simulation experiment-independent framework based on the Gaudi data processing framework. It provides generic core components and interfaces to build a complete simulation application: generation, detector simulation, geometry, monitoring, and saving of the simulated data. Thanks to its highly configurable and extendable components Gaussino can be used both as a toolkit and a...
The NSF-funded Scalable CyberInfrastructure for Artificial Intelligence and Likelihood Free Inference (SCAILFIN) project has developed and deployed artificial intelligence (AI) and likelihood-free inference (LFI) techniques and software using scalable cyberinfrastructure (CI) built on top of existing CI elements. Specifically, the project has extended the CERN-based REANA framework, a...
MoEDAL (the Monopole and Exotics Detector at the LHC) searches directly for magnetic monopoles at Interaction Point 8 of the Large Hadron Collider (LHC). As an upgrade of the experiment, an additional detector, MAPP (the MoEDAL Apparatus for Penetrating Particles), extends the physics reach by providing sensitivity to milli-charged and long-lived exotic particles. The MAPP detectors are scintillator...
In particle physics, data analysis frequently needs variable-length, nested data structures such as arbitrary numbers of particles per event and combinatorial operations to search for particle decay. Arrays of these data types are provided by the Awkward Array library.
The previous version of this library was implemented in C++, but this impeded its ability to grow. Thus, driven by this...
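As a small, self-contained illustration of the combinatorial operations mentioned above (toy data, not taken from the paper), Awkward Array can build all within-event particle pairs directly on the jagged structure:

```python
import awkward as ak

# Toy jagged structure: per-event lists of particle charges.
charges = ak.Array([[+1, -1, +1], [], [-1, -1]])

# All unique within-event pairs, as needed for two-body decay searches.
pairs = ak.combinations(charges, 2, fields=["first", "second"])

# Keep only opposite-charge pairs.
opposite = pairs[pairs.first * pairs.second < 0]
print(ak.to_list(opposite))
```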
UKRI/STFC’s Scientific Computing Department (SCD) runs a vibrant range of computing-related public engagement activities. We benefit from the work done by the National Labs' public engagement team to develop a well-articulated PE strategy and an accompanying evaluation framework, including the idea of defining formal generic learning outcomes (GLOs).
This paper presents how this combination...
In recent years, advanced and complex analysis workflows have gained increasing importance in the ATLAS experiment at CERN, one of the large scientific experiments at the Large Hadron Collider (LHC). Support for such workflows has allowed users to exploit remote computing resources and service providers distributed worldwide, overcoming limitations on local resources and services. The spectrum...
The Virtual Visit service run by the ATLAS Collaboration has been active since 2010. The ATLAS Collaboration has used this popular and effective method to bring the excitement of scientific exploration and discovery into classrooms and other public places around the world. The programme, which uses a combination of video conferencing, webcasts, and video recording to communicate with remote...
Recent developments of HEP software allow novel approaches to physics analysis workflows. The novel data delivery system, ServiceX, can be very effective when accessing a fraction of large datasets at remote grid sites. ServiceX can deliver user-selected columns with filtering and run at scale. We will introduce the ServiceX data management package, ServiceX DataBinder, for easy manipulations...
The findable, accessible, interoperable, and reusable (FAIR) data principles have provided a framework for examining, evaluating, and improving how we share data with the aim of facilitating scientific discovery. Efforts have been made to generalize these principles to research software and other digital products. Artificial intelligence (AI) models---algorithms that have been trained on data...
We report the implementation details, commissioning results, and physics performances of a two-dimensional cluster finder for reconstructing hit positions in the new vertex pixel detector (VELO) that is part of the LHCb Upgrade. The associated custom VHDL firmware has been deployed to the existing FPGA cards that perform the readout of the VELO and fully commissioned during the start of LHCb...
The File Transfer System (FTS) is a software system responsible for queuing, scheduling, dispatching and retrying file transfer requests. It is used by three of the LHC experiments, namely ATLAS, CMS and LHCb, as well as non-LHC experiments including AMS, DUNE and NA62. FTS is critical to the success of many experiments and the service must remain available and performant during the entire...
LCIO is a persistency framework and event data model originally developed to foster closer collaboration among the international groups conducting simulation studies for future linear colliders. In the twenty years since its introduction at CHEP 2003 it has formed the backbone for ILC and CLIC physics and detector studies. It has also been successfully employed to study and develop other...
LHCb (Large Hadron Collider beauty) is one of the four large particle physics experiments aimed at studying differences between particles and anti-particles and very rare decays in the charm and beauty sector of the standard model at the LHC. The Experiment Control System (ECS) is in charge of the configuration, control, and monitoring of the various subdetectors as well as all areas of the...
FullSimLight is a lightweight, Geant4-based command line simulation utility intended for studies of simulation performance. It is part of the GeoModel toolkit (geomodel.web.cern.ch) which has been stable for more than one year. The FullSimLight component has recently undergone renewed development aimed at extending its functionality. It has been endowed with a GUI for fast,...
The computing resources supporting the LHC experiments research programmes are still dominated by x86 processors deployed at WLCG sites. This will however evolve in the coming years, as a growing number of HPC and Cloud facilities will be employed by the collaborations in order to process the vast amounts of data to be collected in the LHC Run 3 and into the HL-LHC phase. Compute power in...
The CMS data acquisition (DAQ) is implemented as a service-oriented architecture where DAQ applications, as well as general applications such as monitoring and error reporting, are run as self-contained services. The task of deployment and operation of services is achieved by using several heterogeneous facilities, custom configuration data and scripts in several languages. Deployment of all...
High Energy Physics (HEP) Trigger and Data Acquisition systems (TDAQs) need ever-increasing throughput and real-time data analytics capabilities, either to improve particle identification accuracy and further suppress background events in trigger systems or to perform an efficient online data reduction for trigger-less ones.
As for the requirements imposed by HEP TDAQ applications in the...
Modern HEP workflows must manage increasingly large and complex data collections. HPC facilities may be employed to help meet these workflows' growing data processing needs. However, a better understanding of the I/O patterns and underlying bottlenecks of these workflows is necessary to meet the performance expectations of HPC systems.
Darshan is a lightweight I/O characterization tool that...
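For context, Darshan logs can also be inspected programmatically; the minimal sketch below assumes the pydarshan Python bindings and a hypothetical log file name, and the exact attribute names may differ between pydarshan versions:

```python
import darshan  # pydarshan bindings (assumed available)

# Open a (hypothetical) Darshan log produced by an instrumented HEP workflow.
report = darshan.DarshanReport("workflow.darshan", read_all=True)

# Which I/O interfaces (POSIX, MPI-IO, STDIO, ...) were captured?
print(list(report.modules.keys()))

# Per-module records hold the counters used to spot I/O bottlenecks
# (attribute layout is version dependent; shown here for illustration only).
posix = report.records["POSIX"].to_df()
print(posix["counters"].head())
```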
HPC systems are increasingly used to address various challenges in high-energy physics, but the data infrastructures used in that field are often not well integrated with infrastructures that include HPC resources. Here we will focus on a specific infrastructure, namely Fenix, which is based on a consortium of 6 leading European supercomputing centres. The Fenix sites are...
Cloudscheduler is a system that manages the resources of local and remote compute clouds and makes those resources available to HTCondor pools. It examines the resource needs of idle jobs, then starts virtual machines (VMs) sized to suit those resource needs on allowed clouds with available resources. Using yaml files, cloudscheduler then provisions the VMs during the boot process with all necessary...
The ATLAS Trigger and Data Acquisition (TDAQ) High Level Trigger (HLT) computing farm contains 120,000 cores. These resources are critical for online selection and collection of collision data in the ATLAS experiment during LHC operation. Since 2013, during longer periods of LHC inactivity these resources have been used for offline event simulation via the "Simulation at Point One" project...
The high-energy physics community is investigating the feasibility of deploying more machine-learning-based solutions on FPGAs to meet modern physics experiments' sensitivity and latency demands. In this contribution, we introduce a novel end-to-end procedure that utilises a forgotten method in machine learning, i.e. symbolic regression (SR). It searches equation space to discover algebraic...
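The abstract does not name a specific SR implementation; as a purely illustrative sketch, an open-source package such as PySR can search for a compact closed-form expression on (here synthetic) training data, which could subsequently be translated to FPGA-friendly arithmetic:

```python
import numpy as np
from pysr import PySRRegressor  # open-source symbolic regression package

# Hypothetical training set: a few detector-level features and a target.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))
y = 2.5 * np.cos(X[:, 0]) + X[:, 1] * X[:, 2]

model = PySRRegressor(
    niterations=40,                    # search budget
    binary_operators=["+", "-", "*"],  # allowed algebraic building blocks
    unary_operators=["cos", "exp"],
)
model.fit(X, y)

# The best candidate is a closed-form expression, which can then be
# mapped to fixed-point arithmetic / HLS for an FPGA implementation.
print(model.sympy())
```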
The Liquid Argon Calorimeters are employed by ATLAS for all electromagnetic calorimetry in the pseudo-rapidity region |η| < 3.2, and for hadronic and forward calorimetry in the region from |η| = 1.5 to |η| = 4.9. They also provide inputs to the first level of the ATLAS trigger. After a successful period of data taking during LHC Run 2 between 2015 and 2018, the ATLAS detector entered into the...
During Run 2 the ATLAS experiment employed a large number of different user frameworks to perform the final corrections of its event data. For Run 3 a common framework was developed that incorporates the lessons learned from existing frameworks. Besides providing analysis standardization it also incorporates optimizations that lead to a substantial reduction in computing needs during analysis.
The rapid growth of scientific data and the computational needs of BNL-supported science programs will bring the Scientific Data and Computing Center (SDCC) to the Exabyte scale in the next few years. The SDCC Storage team is responsible for the symbiotic development and operations of storage services for all BNL experiment data, in particular for the data generated by the ATLAS experiment...
For HEP event processing, data is typically stored in column-wise synchronized containers, most prominently ROOT’s TTree, which has been used for several decades and by now stores over 1 exabyte. These containers can combine the row-wise association capabilities needed by most HEP event processing frameworks (e.g. Athena for ATLAS) with column-wise storage, which typically results in better...
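As a reminder of how such columnar containers are consumed in practice, the sketch below (hypothetical file, tree and branch names) reads only the requested columns of a TTree with the uproot package:

```python
import uproot  # reads ROOT files without requiring a full ROOT installation

# Hypothetical file/tree/branch names; only the requested columns are read.
with uproot.open("events.root") as f:
    arrays = f["Events"].arrays(["Muon_pt", "Muon_eta"], library="ak")

# Column-wise access: a cut on one branch never touches the other branches on disk.
pt = arrays["Muon_pt"]
eta_of_hard_muons = arrays["Muon_eta"][pt > 20.0]
print(len(eta_of_hard_muons))
```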
The first stage of the LHCb High Level Trigger is implemented as a GPU application. In 2023 it will run on 400 NVIDIA GPUs and its goal is to reduce the rate of incoming data from 5 TB/s to approximately 100 GB/s. A broad range of reconstruction algorithms is implemented as approximately 70 kernels. Machine Learning algorithms are attractive to further extend the physics reach of the...
The CMSWEB cluster is pivotal to the activities of the Compact Muon Solenoid (CMS) experiment, as it hosts critical services required for the operational needs of the CMS experiment. The security of these services and the corresponding data is crucial to CMS. Any malicious attack can compromise the availability of our services. Therefore, it is important to construct a robust security...
The Vera C. Rubin Observatory will produce an unprecedented astronomical data set for studies of the deep and dynamic universe. Its Legacy Survey of Space and Time (LSST) will image the entire southern sky every three days and produce tens of petabytes of raw image data and associated calibration data. More than 20 terabytes of data must be processed and stored every night for ten...
The Large Hadron Collider (LHC) at CERN is the largest and most powerful particle collider today. The Phase-II Upgrade of the LHC will increase the instantaneous luminosity by a factor of 7 leading to the High Luminosity LHC (HL-LHC). At the HL-LHC, the number of proton-proton collisions in one bunch crossing (called pileup) increases significantly, putting more stringent requirements on the...
HEP data-processing frameworks are essential ingredients in getting from raw data to physics results. But they are often tricky to use well, and they present a significant learning barrier for the beginning HEP physicist. In addition, existing frameworks typically support rigid, collider-based data models, which do not map well to neutrino-physics experiments like DUNE. Neutrino physicists...
Reliably simulating detector response to hadrons is crucial for almost all physics programs at the Large Hadron Collider. The core component of such simulation is the modeling of hadronic interactions. Unfortunately, there is no first-principle theory guidance. The current state-of-the-art simulation tool, Geant4, exploits phenomenology-inspired parametric models, each simulating a specific...
The data-taking conditions expected in Run 3 pose unprecedented challenges for the DAQ systems of the LHCb experiment at the LHC. The LHCb collaboration is pioneering the adoption of a fully-software trigger to cope with the expected increase in luminosity and, thus, event rate. The upgraded trigger system has required advances in the use of hardware architectures, software and algorithms....
In the HEP community, the prediction of Data Popularity is a topic that has been approached for many years. Nonetheless, while facing increasing data storage challenges, especially in the HL-LHC era, we still need better predictive models to answer the question of whether particular data should be kept, replicated, or deleted.
The usage of caches proved to be a convenient...
In mainstream machine learning, transformers are gaining widespread usage. As Vision Transformers rise in popularity in computer vision, they now aim to tackle a wide variety of machine learning applications. In particular, transformers for High Energy Physics (HEP) experiments continue to be investigated for tasks including jet tagging, particle reconstruction, and pile-up mitigation.
In a...
The full simulation of particle colliders incurs a significant computational cost. Among the most resource-intensive steps are detector simulations. It is expected that future developments, such as higher collider luminosities and highly granular calorimeters, will increase the computational resource requirement for simulation beyond availability. One possible solution is generative neural...
The increased footprint foreseen for Run-3 and HL-LHC data will soon expose the limits of currently available storage and CPU resources. Data formats are already optimized according to the processing chain for which they are designed. ATLAS events are stored in ROOT-based reconstruction output files called Analysis Object Data (AOD), which are then processed within the derivation...
ATLAS is one of the main experiments at the Large Hadron Collider, with a diverse physics program covering precision measurements as well as new physics searches in countless final states, carried out by more than 2600 active authors. The High Luminosity LHC (HL-LHC) era brings unprecedented computing challenges that call for novel approaches to reduce the amount of data and MC that is stored,...
The Belle II software was developed as closed source. As several HEP experiments released their code to the public, the topic of open source software was also discussed within the Belle II collaboration. A task force analyzed advantages and disadvantages and proposed a policy which was adopted by the collaboration in 2020. The Belle II offline software was then released under an open source...
The Vera C. Rubin Observatory, currently in construction in Chile, will start performing the Legacy Survey of Space and Time (LSST) in late 2024, for 10 years. Its 8.4-meter telescope will survey the southern sky in less than 4 nights in six optical bands, and repeatedly generate about 2000 exposures per night, corresponding to a data volume of about 20 TB every night. Three data facilities are...
The centralised Elasticsearch service has already been running at CERN for over 6 years, providing the search and analytics engine for numerous CERN users. The service has been based on the open-source version of Elasticsearch, surrounded by a set of external open-source plugins offering security, multi-tenancy, extra visualization types and more. The evaluation of OpenDistro for...
Current ATLAS analyses always involve a step that produces an analysis-specific n-tuple incorporating the final calibrations as well as systematic variations. The goal for Run 4 is to make these analysis-specific corrections fast enough that they can be applied "on-the-fly" without the need for an intermediate n-tuple. The main complications for this are that some of these...
A new era of hadron collisions will start around 2029 with the High-Luminosity LHC, which will allow ten times more data to be collected than during the first 10 years of LHC operation. This will be achieved by a higher instantaneous luminosity, at the price of a higher number of collisions per bunch crossing.
In order to withstand the high expected radiation doses and the harsher...
Detector studies for future experiments rely on advanced software tools to estimate performance and optimize their design and technology choices. The Key4hep project provides a flexible turnkey solution for the full experiment life-cycle based on established community tools such as ROOT, Geant4, DD4hep, Gaudi, podio and spack. Members of the CEPC, CLIC, EIC, FCC, and ILC communities have...
Complete and reliable monitoring of the WLCG data transfers is an important condition for effective computing operations of the LHC experiments. WLCG data challenges organised in 2021 and 2022 highlighted the need for improvements in the monitoring of data traffic on the WLCG infrastructure. In particular, it concerns the implementation of the monitoring of the remote data access via the...
For High Energy Physics (HEP) experiments, the calorimeter is a key detector to measure the energy of particles. Particles interact with the material of the calorimeter, creating cascades of secondary particles, the so-called showers. Description of the showering process relies on simulation methods that precisely describe all particle interactions with matter. Constrained by the complexity of...
The High-Luminosity LHC upgrade of the CMS experiment will utilise a large number of Machine Learning (ML) based algorithms in its hardware-based trigger. These ML algorithms will facilitate the selection of potentially interesting events for storage and offline analysis. Strict latency and resource requirements limit the size and complexity of these models due to their use in a high-speed...
A new theoretical framework in Quantum Machine Learning (QML) makes it possible to compare the performance of quantum and classical ML models on supervised learning tasks. We assess the performance of a quantum and a classical support vector machine on a High Energy Physics dataset, the Higgs $t\bar{t}H(b\bar{b})$ decay dataset, grounding our study in a new theoretical framework based on three metrics: the geometry...
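Conceptually, the quantum/classical comparison amounts to exchanging the kernel (Gram) matrix supplied to an otherwise identical support vector machine; the sketch below uses synthetic data and a classical RBF kernel, and marks where a quantum fidelity kernel would be substituted (illustrative only, not the authors' code):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

# Synthetic stand-in for a HEP dataset (features X, labels y).
rng = np.random.default_rng(1)
X_train, X_test = rng.normal(size=(200, 4)), rng.normal(size=(50, 4))
y_train = (X_train[:, 0] * X_train[:, 1] > 0).astype(int)

# Classical kernel: an RBF Gram matrix.
K_train = rbf_kernel(X_train, X_train)
K_test = rbf_kernel(X_test, X_train)

# A quantum-kernel study would replace K_train/K_test by state-fidelity
# matrices evaluated on a quantum device or simulator; the SVM is unchanged.
clf = SVC(kernel="precomputed").fit(K_train, y_train)
print(clf.predict(K_test)[:10])
```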
With the increased data volumes expected to be delivered by the HL-LHC, it becomes critical for the ATLAS experiment to maximize the utilization of available computing resources ranging from conventional GRID clusters to supercomputers and cloud computing platforms. To be able to run its data processing applications on these resources, the ATLAS software framework must be capable of...
Due to the increased network traffic expected during the HL-LHC era, the T2 sites in the USA will be required to have 400 Gbps of available bandwidth to their storage solution.
With this in mind, we are pursuing a scale test of the XRootD software when used to perform Third Party Copy transfers using the HTTP protocol. Our main objective is to understand the possible limitations in...
Since March 2019 the Belle II detector has collected data from e+ e- collisions at the SuperKEKB collider. For Belle II analyses to be competitive it is crucial that calibration constants are calculated promptly so that the reconstructed datasets can be provided to analysts. A subset of the calibration constants also benefits from being re-derived during yearly recalibration campaigns to give...
Free energy-based reinforcement learning (FERL) with clamped quantum Boltzmann machines (QBM) was shown to significantly improve the learning efficiency compared to classical Q-learning with the restriction, however, to discrete state-action space environments. We extended FERL to continuous state-action space environments by developing a hybrid actor-critic scheme combining a classical...
For the Belle II experiment, the electromagnetic calorimeter (ECL) plays a crucial role in both the trigger decisions and the offline analysis.
The performance of existing clustering algorithms degrades with rising backgrounds that are expected for the increasing luminosity in Belle II. In offline analyses, this mostly impacts the energy resolution for low-energy photons; for the trigger,...
The Open Data Detector (ODD) is a detector for algorithm research and development. The tracking system is an evolution of the detector used in the successful Tracking Machine Learning Challenge. It offers a more realistic design, with a support structure, cables, and cooling pipes. The ODD has been extended with granular calorimetry and can be completed in the future with a muon system. The magnetic field...
In a decade from now, the Upgrade II of the LHCb experiment will face an instantaneous luminosity ten times higher than in the current Run 3 conditions. This will bring LHCb to a new era, with huge event sizes and typically several signal heavy-hadron decays per event. The trigger scope will shift from selecting interesting events to selecting interesting parts of multi-signal events. To allow for an...
With the emergence of the research field of Quantum Machine Learning, interest in finding advantageous real-world applications is growing as well.
However, challenges concerning the number of available qubits on Noisy Intermediate-Scale Quantum (NISQ) devices and accuracy losses due to hardware imperfections remain and limit the applicability of such approaches in real-world scenarios.
For...
In preparation for the second runs of the ProtoDUNE detectors at CERN (NP02 and NP04), DUNE has established a new data pipeline for bringing the data from the EHN-1 experimental hall at CERN to primary tape storage at Fermilab and CERN, and then spreading it out to a distributed disk data store at many locations around the world. This system includes a new Ingest Daemon and a new Declaration...
The volume and complexity of data produced at HEP and NP research facilities have grown exponentially. There is an increased demand for new approaches to process data in near-real-time. In addition, existing data processing architectures need to be reevaluated to see if they are adaptable to new technologies. A unified programming model for event processing and distribution that can exploit...
In the last two decades, there have been major breakthroughs in Quantum Chromodynamics (QCD), the theory of the strong interaction of quarks and gluons, as well as major advances in the accelerator and detector technologies that allow us to map the spatial distribution and motion of the quarks and gluons in terms of quantum correlation functions (QCF). This field of research, broadly known as...
The search for exotic long-lived particles (LLP) is a key area of the current LHC physics programme and is expected to remain so into the High-Luminosity (HL)-LHC era. As in many areas of the LHC physics programme Machine Learning algorithms play a crucial role in this area, in particular Deep Neural Networks (DNN), which are able to use large numbers of low-level features to achieve enhanced...
The ALICE Grid is designed to perform real-time, comprehensive monitoring of both jobs and execution nodes in order to maintain a continuous and consistent view of the status of the Grid infrastructure. An extensive database of historical data is available and is periodically analyzed to tune the workflow and data management to optimal performance levels. This data, when evaluated in real time, has the...
The LArSoft/art framework is used at Fermilab’s liquid argon time projection chamber experiments such as ICARUS to run traditional production workflows in a grid environment. It has become increasingly important to utilize HPC facilities for experimental data processing tasks. As part of the SciDAC-4 HEP Data Analytics on HPC and HEP Event Reconstruction with Cutting Edge Computing...
Particle tracking is among the most sophisticated and complex parts of the full event reconstruction chain. A number of reconstruction algorithms work in sequence to build these trajectories from detector hits. Each of these algorithms uses many configuration parameters that need to be fine-tuned to properly account for the detector/experimental setup, the available CPU budget and the desired...
At the CMS experiment, a growing reliance on the fast Monte Carlo application (FastSim) will accompany the high luminosity and detector granularity expected in Phase 2. The FastSim chain is roughly 10 times faster than the application based on the GEANT4 detector simulation and full reconstruction referred to as FullSim. However, this advantage comes at the price of decreased accuracy in some...
In an ideal world, we describe our models with recognizable mathematical expressions and directly fit those models to large data samples with high performance. It turns out that this can be done with a Computer Algebra System (CAS), using its symbolic expression trees as templates for computational back-ends like JAX. The CAS can in fact further simplify the expression tree, which can result in speed-ups in the numerical...
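A minimal sketch of this idea, assuming SymPy as the CAS and a SymPy version with JAX code-generation support (the abstract does not commit to these specific packages): the symbolic tree is simplified once and then compiled into a JIT-compiled numerical function.

```python
import sympy as sp
import jax
import jax.numpy as jnp

# Symbolic model: a Gaussian plus a linear background, built in the CAS.
x, mu, sigma, a = sp.symbols("x mu sigma a")
model = sp.exp(-(x - mu) ** 2 / (2 * sigma**2)) / (sp.sqrt(2 * sp.pi) * sigma) + a * x
model = sp.simplify(model)  # the CAS can simplify the tree before code generation

# Turn the expression tree into a JAX function and JIT-compile it.
f = sp.lambdify((x, mu, sigma, a), model, modules="jax")
f_jit = jax.jit(f)

data = jnp.linspace(-3.0, 3.0, 10_000)
print(f_jit(data, 0.0, 1.0, 0.01)[:3])  # fast, differentiable evaluation
```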
In the search for advantage in Quantum Machine Learning, appropriate encoding of the underlying classical data into a quantum state is a critical step. Our previous work [1] implemented an encoding method inspired by underlying physics and achieved AUC scores higher than that of classical techniques for a reduced dataset of B Meson Continuum Suppression data. A particular problem faced by...
To improve the potential for discoveries at the LHC, a significant luminosity increase of the accelerator (HL-LHC) is foreseen in the late 2020s, to achieve a peak luminosity of $7.5\times10^{34}$ cm$^{-2}$s$^{-1}$ in the ultimate performance scenario. HL-LHC is expected to run with a bunch-crossing separation of 25 ns and a maximum average of 200 events (pile-up) per crossing. To maintain or even improve...
The nature and origin of dark matter are among the most compelling mysteries of contemporary science. There is strong evidence for dark matter from its role in shaping the galaxies and galaxy clusters that we observe in the universe. Still, physicists have tried to detect dark matter particles for over three decades with little success.
This talk will describe the leading effort in that...
The increasing volumes of data produced at light sources such as the Linac Coherent Light Source (LCLS) enable the direct observation of materials and molecular assemblies at the length and timescales of molecular and atomic motion. This exponential increase in the scale and speed of data production is prohibitive to traditional analysis workflows that rely on scientists tuning parameters...
The astronomy world is moving towards exascale observatories and experiments, with data distribution and access challenges that are comparable with - and on similar timescales to - those of HL-LHC. I will present some of these use cases and show progress towards prototyping work in the Square Kilometre Array (SKA) Regional Centre Network, and present the conceptual architectural view of the...
Timepix4 is a hybrid pixel detector readout ASIC developed by the Medipix4 Collaboration. It consists of a matrix of about 230 k pixels with 55 micron pitch, each equipped with an amplifier, a discriminator and a time-to-digital converter with 195 ps bin size that allows measurement of time-of-arrival and time-over-threshold. It is equipped with two different types of links: slow control links, for the...
Delphes is a C++ framework to perform a fast multipurpose detector response simulation. The Circular Electron Positron Collider (CEPC) experiment runs fast simulation with a modified Delphes based on its own scientific objectives. The CEPC fast simulation with Delphes is a High Throughput Computing (HTC) application with small input and output files. Besides, to compile and run Delphes, only...
Slurm REST APIs have been available since version 20.02. With these REST APIs one can interact with the slurmctld and slurmdbd daemons in a RESTful way. As a result, job submission and cluster status queries can be performed from a web system. To take advantage of the Slurm REST APIs, a web workbench system has been developed for the Slurm cluster at IHEP.
The workbench system consists of four subsystems:...
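For illustration, a web backend can query the cluster through slurmrestd over plain HTTP; the host, token and API version path below are placeholders and depend on the local Slurm release:

```python
import requests

# Placeholders: slurmrestd endpoint and a JWT issued for the user.
BASE = "http://slurm-rest.example.org:6820"
HEADERS = {
    "X-SLURM-USER-NAME": "alice",
    "X-SLURM-USER-TOKEN": "<jwt-token>",
}

# Query the current job list (the API version path depends on the Slurm release).
resp = requests.get(f"{BASE}/slurm/v0.0.38/jobs", headers=HEADERS, timeout=10)
resp.raise_for_status()

for job in resp.json().get("jobs", []):
    print(job.get("job_id"), job.get("job_state"), job.get("user_name"))
```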
Documentation on all things computing is vital for an evolving collaboration of scientists, technicians, and students. Using the MediaWiki software as the framework, a searchable knowledge base is provided via a secure web interface to the large cadre of colleagues working on the DUNE far and near detectors. Organization information, links relevant to working groups, operations and...
WLCG relies on the network as a critical part of its infrastructure and therefore needs to guarantee effective network usage and prompt detection and resolution of any network issues, including connection failures, congestion and traffic routing. The OSG Networking Area, in partnership with the WLCG Throughput working group, has created a monitoring infrastructure that gathers metrics from the...
Particle identification is an important ingredient of particle physics experiments. Distinguishing the charged hadrons (pions, kaons, protons and their antiparticles) is often crucial, in particular for hadronic decays, which require efficient particle identification to obtain a desirable signal-to-background ratio. An optimal particle identification performance in a large...
Modern large distributed computing systems produce large amounts of monitoring data. In order for these systems to operate smoothly, under-performing or failing components have to be identified quickly, and preferably automatically, enabling the system managers to react accordingly.
In this contribution, we analyze job and data transfer data collected in the running of the LHC computing...
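One generic way to automate such flagging (shown here only as an illustration, not necessarily the method used in this contribution) is an unsupervised outlier detector trained on per-component monitoring metrics:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical per-site metrics: failure rate, mean transfer time, queue depth.
rng = np.random.default_rng(7)
normal = rng.normal(loc=[0.02, 120.0, 50.0], scale=[0.01, 20.0, 10.0], size=(500, 3))
degraded = rng.normal(loc=[0.30, 600.0, 400.0], scale=[0.05, 50.0, 50.0], size=(5, 3))
metrics = np.vstack([normal, degraded])

# Fit on the bulk of the data; anomalous rows get a prediction of -1.
detector = IsolationForest(contamination=0.02, random_state=0).fit(metrics)
flags = detector.predict(metrics)
print("flagged rows:", np.where(flags == -1)[0])
```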
Many large-scale physics experiments, such as ATLAS at the Large Hadron Collider, the Deep Underground Neutrino Experiment and sPHENIX at the Relativistic Heavy Ion Collider, rely on accurate simulations to inform data analysis and derive scientific results. The inevitable inaccuracies of these simulations may be detected and corrected using heuristics in a conventional analysis workflow.
However, residual errors...
The ATLAS Spanish Tier-1 and Tier-2s have more than 18 years of experience in the deployment and development of LHC computing components and their successful operations. The sites are actively participating in, and even coordinating, R&D computing activities in the LHC Run3 and developing the computing models needed in HL-LHC period.
In this contribution, we present details on the...
Most common CPU architectures provide simultaneous multithreading (SMT). The operating system thereby sees two logical cores per physical core and can schedule two processes to one physical CPU core. This overbooking of physical cores enables better usage of the parallel pipelines and duplicated components within a CPU core. On systems with several applications running in parallel, such as...
ML/DL techniques have shown their power in improving several studies and tasks in HEP, especially in physics analysis. Our approach has been to take a number of ML/DL tools provided by several open-source platforms and apply them to several classification problems, for instance the extraction of ttbar resonances in an LHC experiment.
A comparison has been made...
In High Energy Physics, detailed and time-consuming simulations are used for particle interactions with detectors. To bypass these simulations with a generative model, the generation of large point clouds in a short time is required, while the complex dependencies between the particles must be correctly modeled. Particle showers are inherently tree-based processes, as each particle is produced...
ATLAS Metadata Interface (AMI) is a generic ecosystem for metadata aggregation, transformation and cataloging. Benefiting from more than 20 years of feedback in the LHC context, the second major version was released in 2018. This poster describes how a renewed architecture and integration with modern technologies ease the usage and deployment of a complete AMI stack. It describes how to deploy...
Liquid argon time projection chambers (TPCs) are widely used in particle detection. High quality physics simulators have been developed for such detectors in a variety of experiments, and the resulting simulations are used to aid in reconstruction and analysis of collected data. However, the degree to which these simulations are reflective of real data is limited by the knowledge of the...
As a conventional iterative ptychography reconstruction method, the Difference Map (DM) is suitable for processing large datasets of X-ray diffraction patterns on the multi-GPU heterogeneous system at HEPS (High Energy Photon Source). However, it assumes that the GPU memory is large enough to hold all the patterns transferred from RAM and to apply CUDA FFT/IFFT to them. Meanwhile, the intermediate data during...
Providing a high-performance and reliable tape storage system is GridKa's top priority. The GridKa tape storage system was recently migrated from IBM SP to the High Performance Storage System (HPSS) for LHC and non-LHC HEP experiments. These are two different tape backends, and each has its own design and specifics that need to be studied and understood very deeply. Taking into account the features...
We present a case for ARM chips as an alternative to standard x86 CPUs at WLCG sites, as we observed better performance and lower power consumption in a number of benchmarks based on actual HEP workloads, from ATLAS simulation and reconstruction tasks to the most recent HEP-Score containerised jobs.
To support our case, we present novel measurements on the performance and energy...
Starting this year, the upgraded LHCb detector is collecting data with a pure software trigger. In its first stage, reducing the rate from 30 MHz to about 1 MHz, GPUs are used to reconstruct and trigger on B and D meson topologies and high-pT objects in the event. In its second stage, a CPU farm is used to reconstruct the full event and perform candidate selections, which are persisted for...
Increases in data volumes are forcing high-energy and nuclear physics experiments to store more frequently accessed data on tape. Extracting the maximum performance from tape drives is critical to make this viable from a data availability and system cost standpoint. The nature of data ingest and retrieval in an experimental physics environment make achieving high access performance difficult...
TFPWA is a general framework for partial wave analysis developed on top of TensorFlow2. Partial wave analysis is a powerful method to determine the 4-momentum distributions of multi-body decay final states and to extract the internal information of interest. Based on a simple topological representation, TF-PWA can handle most steps of a partial wave analysis automatically and...
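The computational pattern behind such a framework can be sketched generically (this is not TF-PWA's actual API): the likelihood is expressed in TensorFlow operations so that the gradients driving the fit come from automatic differentiation.

```python
import tensorflow as tf

# Toy model: an unbinned fit of a "magnitude" and a "phase" parameter to
# pseudo-data, standing in for the amplitude parameters of a partial wave analysis.
data = tf.random.normal([10_000], mean=0.3, stddev=1.0)

mag = tf.Variable(1.0)
phase = tf.Variable(0.0)

def nll():
    # Placeholder probability model; a real PWA would build the coherent
    # sum of amplitudes from the decay topology here.
    mean = mag * tf.cos(phase)
    return tf.reduce_sum(0.5 * (data - mean) ** 2)

opt = tf.keras.optimizers.Adam(learning_rate=0.05)
for _ in range(200):
    with tf.GradientTape() as tape:
        loss = nll()
    grads = tape.gradient(loss, [mag, phase])
    opt.apply_gradients(zip(grads, [mag, phase]))

print(float(mag * tf.cos(phase)))  # fitted mean, obtained purely via autodiff
```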
Decay products of long-lived particles are an important signature in dark sector searches in collider experiments. The current Belle II tracking algorithm is optimized for tracks originating from the interaction point at the cost of a lower track finding efficiency for displaced tracks. This is especially the case for low-momentum displaced tracks that are crucial for dark sector searches in...
Multi-messenger astrophysics provides valuable insights into the properties of the physical Universe. These insights arise from the complementary information carried by photons, gravitational waves, neutrinos and cosmic rays about individual cosmic sources and source populations.
When a gravitational wave (GW) candidate is identified by the LIGO, Virgo and KAGRA (LVK) observatory network, an...
The ATLAS EventIndex is the global catalogue of all ATLAS real and simulated events. During the LHC long shutdown between Run 2 (2015-2018) and Run 3 (2022-2025) its components were substantially revised, and a new system has been deployed for the start of Run 3 in Spring 2022. The new core storage system is based on HBase tables with a Phoenix interface. It allows faster data ingestion rates...
The computing resource needs of LHC experiments, such as CMS, are expected to continue growing significantly over the next decade, during the Run 3 and especially the HL-LHC era. Additionally, the landscape of available resources will evolve, as HPC (and Cloud) resources will provide a comparable, or even dominant, fraction of the total capacity, in contrast with the current situation,...
The IceCube Neutrino Observatory is a cubic kilometer neutrino telescope located at the geographic South Pole. Understanding detector systematic effects is a continuous process. This requires the Monte Carlo simulation to be updated periodically to quantify potential changes and improvements in science results with more detailed modeling of the systematic effects. IceCube’s largest systematic...
The IceCube Neutrino Observatory is a cubic kilometer neutrino telescope located at the geographic South Pole. IceCube recently completed a multi-year user management migration from a manual LDAP-based process to an automated and integrated system built on Keycloak. A custom interface allows lead scientists to register and edit their institution’s user accounts, including group memberships...
The complexity of the ATLAS detector and its infrastructure require an excellent understanding of the interdependencies between the components when identifying points of failure (POF). The ATLAS Technical Coordination Expert System features a graph-based inference engine that identifies a POF provided a list of faulty elements or Detector Safety System (DSS) alarms. However, the current...
DIRAC is a widely used “framework for distributed computing”. It works by building a layer between the users and the resources offering a common interface to a number of heterogeneous providers. DIRAC, like many other workload management systems, uses pilot jobs to check and configure the worker-node environment before fetching a user payload. The pilot also records a number of different...
This contribution introduces the job optimizer service for the next-generation ALICE grid middleware, JAliEn (Java Alice Environment). It is a continuous service running on central machines and is essentially responsible for splitting jobs into subjobs, to then be distributed and executed on the ALICE grid. There are several ways of creating subjobs based on various strategies relevant...
The Super Tau-Charm Facility (STCF) is a future electron-positron collider proposed in China. It has a peak luminosity above $0.5\times10^{35}$ cm$^{-2}$s$^{-1}$ and a center-of-mass energy ranging from 2 to 7 GeV. A time-of-flight detector based on the detection of internally reflected Cherenkov light (DTOF) is proposed for endcap particle identification (PID) at the STCF. In this contribution, we...
In this work, we report the status of a neural network regression model trained to extract new physics (NP) parameters in Monte Carlo (MC) data. We utilize a new EvtGen NP MC generator to generate $B \rightarrow K^{*} \ell^{+} \ell^{-}$ events according to the deviation of the Wilson Coefficient $C_{9}$ from its SM value, $\delta C_{9}$. We train a convolutional neural network regression...
The ROOT RNTuple I/O subsystem has been designed to address performance bottlenecks and shortcomings of ROOT's current state-of-the-art TTree I/O subsystem. RNTuple provides a backwards-incompatible redesign of the TTree binary format and API that evolves the ROOT event data I/O for the challenges of the upcoming decades. It has been engineered for high performance on modern storage hardware, a...
Reweighting Monte Carlo (MC) events for alternate benchmarks of beyond standard model (BSM) physics is an effective way to reduce the computational cost of physics searches. However, applicability of reweighting is often constrained by technical limitations. We demonstrate how pre-trained neural networks can be used to obtain fast and reliable reweighting without relying on the full MC...
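The general idea can be illustrated with a classifier-based reweighting sketch (synthetic data; the contribution relies on its own pre-trained networks): a classifier separating the nominal sample from the target benchmark yields per-event weights via the likelihood-ratio trick.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Toy stand-ins for event features generated under two BSM benchmarks.
rng = np.random.default_rng(3)
nominal = rng.normal(loc=0.0, size=(5000, 4))
target = rng.normal(loc=0.4, size=(5000, 4))

X = np.vstack([nominal, target])
y = np.concatenate([np.zeros(len(nominal)), np.ones(len(target))])

clf = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300).fit(X, y)

# Likelihood-ratio trick: w(x) = p(target|x) / p(nominal|x) reweights the
# nominal sample so that it approximates the target benchmark.
p = clf.predict_proba(nominal)[:, 1]
weights = p / (1.0 - p)
print(weights.mean(), weights.std())
```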
dCache at BNL has been in production for almost two decades. For years, dCache used the default driver included with the dCache software to interface with HPSS tape storage systems. Due to the synchronous nature of this approach and the high resource demands resulting from periodic script invocations, scalability was significantly limited. During the WLCG tape challenges, bottlenecks in dCache...
Particle accelerators, such as the Spallation Neutron Source (SNS), require high beam availability in order to maximize scientific discovery. Recently, researchers have made significant progress utilizing machine learning (ML) models to identify anomalies, prevent damage, reduce beam loss, and tune accelerator parameters in real time. In this work, we study the use of uncertainty aware...
OpenMP is a directive-based shared-memory parallel programming model traditionally used for multicore CPUs. In its recent versions, OpenMP was extended to enable GPU computing via its “target offloading” model. The architecture-agnostic compiler directives can in principle offload to multiple types of GPUs and FPGAs, and its compiler support is under active development.
In this work, we...
The Wide Field-of-view Cherenkov Telescope Array (WFCTA) is an important component of the Large High Altitude Air Shower Observatory (LHAASO), which aims to measure the individual energy spectra of cosmic rays from ~30 TeV to a couple of EeV. Since the experiment started running in 2020, WFCTA simulation jobs have been running on the Intel x86 cluster; last year only 25% of the first stage...
During the first WLCG Network Data Challenge in fall of 2021 some shortcomings in the monitoring that impeded the ability to fully understand the results collected during the data challenge were identified. One of the simplest missing components was site specific network information, especially information about traffic going into and out of any of the participating sites. Without this...
Since 2009 Port d’Informació Científica (PIC), in Barcelona, has hosted the Major Atmospheric Gamma Imaging Cherenkov (MAGIC) Data Center. At the Observatorio del Roque de Los Muchachos (ORM) in the Canary Island of La Palma (Spain), data produced from observations by the 17m diameter MAGIC telescopes are transferred to PIC on a daily basis. More than 200 TB per year are being transferred to...
In this paper we describe the development of a streamlined framework for large-scale ATLAS pMSSM reinterpretations of LHC Run 2 analyses using containerised computational workflows. The project is looking to assess the global coverage of BSM physics and requires running O(5k) computational workflows representing pMSSM model points. Following ATLAS Analysis Preservation policies, many analyses...
Cutting edge research has driven scientists in many fields into the world of big data. While data storage technologies continue to evolve, the costs remain high for rapid data access on such scales and are a major factor in planning and operations. As a joint effort spanning experiment scientists, developers for ROOT, and industrial leaders in data compression, we sought to address this...
The European energy crisis during 2022 prompted many computing facilities to take urgent electricity-saving measures. These were in part voluntary, in response to EU and national appeals, but were also necessary to keep within a flat energy budget as the electricity price rose. We review some of these measures and, as the situation normalises, take a longer view on how the flexibility of high-throughput computing can...
With over 2000 active members from 174 institutes in 41 countries around the world, the ALICE experiment is one of the 4 large experiments at CERN. With such numerous interactions, the experiment management needs a way to record members' participation history and their current status, such as employment, institutes, appointments, clusters and funding agencies, as well as to automatically...
The ATLAS experiment has 18+ years of experience using workload management systems to deploy and develop workflows to process and to simulate data on the distributed computing infrastructure. Simulation, processing and analysis of LHC experiment data require the coordinated work of heterogeneous computing resources. In particular, the ATLAS experiment utilizes the resources of 250 computing...
Every sub-detector in the ATLAS experiment at the LHC, including the ATLAS Calorimeter, writes conditions and calibration data into an ORACLE Database (DB). In order to provide an interface for reliable interactions with both the Conditions Online and Offline DBs, a unique semi-automatic web-based application, TileCalibWeb Robot, has been developed. TileCalibWeb is being used in LHC Run 3 by...
A new methodology to improve the sensitivity to new physics contributions to Standard Model processes at the LHC is presented.
A Variational AutoEncoder trained on Standard Model processes is used to identify Effective Field Theory contributions as anomalies. While the output of the model is supposed to be very similar to the inputs for Standard Model events, it is expected to deviate...
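A minimal sketch of the anomaly-scoring step, with a plain autoencoder standing in for the Variational AutoEncoder and purely synthetic inputs: the per-event reconstruction error serves as the test statistic, with large values indicating EFT-like departures from the SM training sample.

```python
import torch
import torch.nn as nn

# Toy SM-like training events: a handful of high-level kinematic features.
torch.manual_seed(0)
sm_events = torch.randn(2000, 8)

# Plain autoencoder as a stand-in for the VAE described in the abstract.
model = nn.Sequential(
    nn.Linear(8, 4), nn.ReLU(),
    nn.Linear(4, 2), nn.ReLU(),   # bottleneck
    nn.Linear(2, 4), nn.ReLU(),
    nn.Linear(4, 8),
)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(200):  # train the network to reproduce SM events
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(sm_events), sm_events)
    loss.backward()
    opt.step()

# Anomaly score: per-event reconstruction error; EFT-like events should score high.
test_events = torch.randn(10, 8) + 2.0  # hypothetical "anomalous" events
scores = ((model(test_events) - test_events) ** 2).mean(dim=1)
print(scores)
```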
ATLAS Metadata Interface (AMI) is a generic ecosystem for metadata aggregation, transformation and cataloging. Benefiting from more than 20 years of feedback in the LHC context, the second major version was released in 2018. Each sub-system of the stack has recently been improved in order to acquire messaging/telemetry capabilities. This poster describes the whole stack monitoring with the...
The Worldwide Large Hadron Collider Computing Grid (WLCG) actively pursues the migration from IPv4 to IPv6. For this purpose, the HEPiX-IPv6 working group was founded during the fall HEPiX Conference in 2010. One of the first goals was to categorize the applications running in the WLCG into different groups: the first group was easy to define, because it comprised all...