Conveners
Track 5 - Sustainable and Collaborative Software Engineering: Sustainable Languages and Architectures
- Elizabeth Sexton (FNAL)
- Andre Sailer (CERN)
Track 5 - Sustainable and Collaborative Software Engineering: Sustainable CI and Build Infrastructure
- Eduardo Rodrigues (University of Liverpool)
- Andre Sailer (CERN)
Track 5 - Sustainable and Collaborative Software Engineering: Glance and Web based applications
- Elizabeth Sexton (FNAL)
- Giulio Eulisse (CERN)
Track 5 - Sustainable and Collaborative Software Engineering: Sustainable Analysis
- Elizabeth Sexton (FNAL)
- Eduardo Rodrigues (University of Liverpool)
Track 5 - Sustainable and Collaborative Software Engineering: MISC: Monte Carlos, Infrastructure and Simulation
- Giulio Eulisse (CERN)
- Andre Sailer (CERN)
Track 5 - Sustainable and Collaborative Software Engineering: Sustainable Frameworks
- Eduardo Rodrigues (University of Liverpool)
- Giulio Eulisse (CERN)
The Julia programming language was created 10 years ago and is now a mature and stable language with a large ecosystem of more than 8,000 third-party packages. It was designed for scientific programming to be as high-level and dynamic as Python while achieving runtime performance comparable to C/C++, or even faster. With this, we ask ourselves if the Julia language and its...
The evaluation of new computing languages for a large community, like HEP, involves comparison of many aspects of the languages' behaviour, ecosystem and interactions with other languages. In this paper we compare a number of languages using a common, yet non-trivial, HEP algorithm: the tiled $N^2$ clustering algorithm used for jet finding. Specifically, we compare the algorithm as implemented in...
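For orientation, the sketch below shows a plain (untiled) $N^2$ sequential-recombination clustering in Python using anti-kt distances and a simplified pt-weighted recombination scheme. It is an illustration of the algorithm's structure only, not any of the implementations compared in the paper, and the tiling optimisation is omitted.

```python
import math

# Illustrative plain N^2 sequential-recombination (anti-kt) clustering.
# Pseudojets are (pt, rapidity, phi) tuples; a simple pt-weighted
# recombination is used instead of the full E-scheme.

def delta_r2(a, b):
    """Squared angular distance dR^2 = dy^2 + dphi^2 (phi wrapped)."""
    dphi = abs(a[2] - b[2])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    dy = a[1] - b[1]
    return dy * dy + dphi * dphi

def antikt_cluster(pseudojets, R=0.4):
    """Return the final jets for a list of (pt, y, phi) pseudojets."""
    active = list(pseudojets)
    jets = []
    while active:
        best_kind, best_i, best_j = "beam", 0, None
        best_d = 1.0 / active[0][0] ** 2            # d_iB = 1/pt^2
        for i, a in enumerate(active):
            diB = 1.0 / a[0] ** 2
            if diB < best_d:
                best_kind, best_i, best_j, best_d = "beam", i, None, diB
            for j in range(i + 1, len(active)):
                b = active[j]
                # d_ij = min(1/pt_i^2, 1/pt_j^2) * dR_ij^2 / R^2
                dij = min(1.0 / a[0] ** 2, 1.0 / b[0] ** 2) * delta_r2(a, b) / (R * R)
                if dij < best_d:
                    best_kind, best_i, best_j, best_d = "pair", i, j, dij
        if best_kind == "beam":
            jets.append(active.pop(best_i))          # promote to final jet
        else:
            b = active.pop(best_j)                   # pop larger index first
            a = active.pop(best_i)
            pt = a[0] + b[0]
            y = (a[0] * a[1] + b[0] * b[1]) / pt
            phi = (a[0] * a[2] + b[0] * b[2]) / pt   # ignores phi wraparound
            active.append((pt, y, phi))
    return jets

if __name__ == "__main__":
    particles = [(50.0, 0.1, 0.2), (45.0, 0.15, 0.25), (5.0, -2.0, 3.0)]
    print(antikt_cluster(particles, R=0.4))
```

The tiled variant accelerates the same recombination by restricting the pairwise distance search to pseudojets in neighbouring rapidity-phi tiles, which is what makes it a useful, non-trivial benchmark across languages.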
With the increased dataset obtained during Run 3 of the LHC at CERN, and an expected further increase of more than one order of magnitude for the HL-LHC, the ATLAS experiment is reaching the limits of its current data processing model in terms of traditional CPU resources based on x86_64 architectures, and an extensive program of software upgrades towards the HL-LHC has...
High Energy Physics software has been a victim of the necessity to choose a single implementation language, as no truly usable multi-language environment existed. Even the co-existence of two languages in the same framework (typically C++ and Python) imposes a heavy burden on the system. The role of different languages was generally limited to well-encapsulated domains (like Web applications,...
Software and computing are an integral part of our research. According to the survey for the “Future Trends in Nuclear Physics Computing” workshop in September 2020, students and postdocs spent 80% of their time on the software and computing aspects of their research. For the Electron-Ion Collider, we are looking for ways to make software (and computing) "easier" to use. All scientists of all...
The HSF/IRIS-HEP Software Training group teaches software skills to new researchers in High Energy Physics (HEP) and related communities. These skills are essential for producing the high-quality and sustainable software needed to do the research. Given the thousands of users in the community, sustainability, though challenging, is the centerpiece of its approach. The training modules are...
CERN hosts more than 1200 websites essential to the mission of the Organization, to internal and external collaboration and communication, and to public outreach. CERN’s online presence is very diverse in complexity and scale, with some websites, such as https://home.cern/, accommodating more than one million unique visitors in a day.
However, regardless of their diversity, all...
The Jiangmen Underground Neutrino Observatory (JUNO), under construction in South China, primarily aims to determine the neutrino mass hierarchy and to precisely measure oscillation parameters. Data-taking is expected to start in 2024, and the experiment plans to run for more than 20 years. The development of the JUNO offline software (JUNOSW) started in 2012, and it is quite challenging to maintain the JUNOSW...
The ATLAS Continuous Integration (CI) System is the major component of the ATLAS software development infrastructure, synchronizing the efforts of several hundred software developers working around the world and around the clock. Powered by 700 fast processors, it is based on the ATLAS GitLab code management service and the Jenkins CI server, and it performs up to 100 ATLAS software builds daily, probing...
The ALICE experiment at CERN uses a cluster consisting of virtual and bare-metal machines to build and test proposed changes to the ALICE Online-Offline (O2) software in addition to building and publishing regular software releases.
Nomad is a free and open-source job scheduler for containerised and non-containerised applications, developed by HashiCorp. It is integrated into an...
GitLab has been running at CERN since 2012. It is a self-service code hosting application based on Git that provides collaboration and code review features, becoming one of the key infrastructures at CERN. It is being widely used at CERN, with more than 17 000 active users, hosting more than 120 000 projects and triggering more than 5 000 jobs per hour.
In its initial stage, a custom-made...
The ATLAS experiment involves almost 6000 members from approximately 300 institutes spread all over the globe and publishes more than 100 papers every year. This dynamic environment brings challenges such as how to meet publication deadlines, maintain communication between the groups involved, and ensure the continuity of workflows. The solution found for these challenges was automation, which was...
As the largest particle physics laboratory in the world, CERN has more than 17000 collaborators spread around the globe. ATLAS, one of CERN’s experiments, has around 6000 active members and 300 associate institutes, all of which must go through the standard registration and updating procedures within CERN’s HR (Foundation) database. Simultaneously, the ATLAS Glance project, among other...
The LHCb experiment is one of the 4 LHC experiments at CERN. With more than 1500 members and tens of thousands of assets, the Collaboration requires systems that allow the extraction of data from many databases according to some very specific criteria. In LHCb there are 4 production web applications responsible for managing members and institutes, tracking assets and their current status,...
The Glance project is responsible for over 20 systems across three CERN experiments: ALICE, ATLAS and LHCb. Students, engineers, physicists and technicians have been using systems designed and managed by Glance on a daily basis for over 20 years. In order to produce quality products continuously, considering internal stakeholders' ever-evolving requests, there is a need for standardization...
The recent major upgrade of the ALICE Experiment at CERN’s Large Hadron Collider has been coupled with the development of a new Online-Offline computing system capable of interacting with a sustained input throughput of 3.5TB/s. To facilitate the control of the experiment, new web applications have been developed and deployed to be used 24 hours a day, 365 days a year in the control room and...
CERN, like many large organizations, relies on multiple means of communication for different use cases and teams.
Email and mailing lists are the most popular, but more modern communication systems such as Mattermost and push notifications are gaining traction.
At one end of the spectrum we have communication teams writing individual emails to users on a daily basis, whose target audiences may be small, or...
The recent release of Awkward Array 2.0 significantly changes the way that lazy evaluation and task-graph building are handled in columnar analysis. The Dask parallel processing library now provides these pieces of functionality for Awkward Array, and this change affords new ways of optimizing columnar analysis and distributing it on clusters. In particular, this allows optimization of a task...
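To make the change concrete, here is a minimal sketch contrasting eager Awkward Array selection with the same expression built lazily through dask-awkward. It assumes the dask-awkward package and its `from_awkward`/`compute` entry points; the toy event fields (`muon_pt`, `muon_eta`) are invented for illustration.

```python
import awkward as ak
import dask_awkward as dak  # assumes the dask-awkward package is installed

# Toy jagged "events" with a variable number of muons per event.
events = ak.Array(
    [
        {"muon_pt": [42.0, 11.5], "muon_eta": [0.3, -1.2]},
        {"muon_pt": [], "muon_eta": []},
        {"muon_pt": [23.4, 7.8, 55.1], "muon_eta": [1.9, 0.1, -0.4]},
    ]
)

# Eager columnar selection with Awkward Array 2.x.
good = events.muon_pt > 20.0
eager_pts = ak.flatten(events.muon_pt[good])

# The same expression built lazily as a Dask task graph; nothing is
# evaluated until .compute() is called, which is where task-graph
# optimization can act (e.g. touching only the columns actually used).
lazy_events = dak.from_awkward(events, npartitions=2)
lazy_pts = lazy_events.muon_pt[lazy_events.muon_pt > 20.0]
result = lazy_pts.compute()

print(eager_pts.tolist())
print(ak.flatten(result).tolist())
```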
In particle physics, workflow management systems are primarily used as tailored solutions in dedicated areas such as Monte Carlo production. However, physicists performing data analyses are usually required to steer their individual, complex workflows manually, frequently involving job submission in several stages and interaction with distributed storage systems by hand. This process is not...
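As an illustration of what declarative workflow steering looks like, the hypothetical two-stage sketch below uses a generic Python scheduler (luigi, chosen purely for illustration and not necessarily the tool discussed in the paper). Each stage declares its dependencies and outputs, so the scheduler, not the physicist, decides what still needs to run.

```python
import json

import luigi  # generic workflow library, used here purely for illustration


class Selection(luigi.Task):
    """Hypothetical first stage: select events for one dataset."""

    dataset = luigi.Parameter()

    def output(self):
        return luigi.LocalTarget(f"selection_{self.dataset}.json")

    def run(self):
        with self.output().open("w") as f:
            json.dump({"dataset": self.dataset, "n_selected": 1234}, f)


class Histograms(luigi.Task):
    """Hypothetical second stage: aggregate the selections of all datasets."""

    datasets = luigi.ListParameter(default=["data", "ttbar"])

    def requires(self):
        # Declared dependencies replace manual, stage-by-stage job steering:
        # missing Selection tasks are run automatically before this one.
        return [Selection(dataset=d) for d in self.datasets]

    def output(self):
        return luigi.LocalTarget("histograms.json")

    def run(self):
        counts = {}
        for target in self.input():
            with target.open("r") as f:
                sel = json.load(f)
            counts[sel["dataset"]] = sel["n_selected"]
        with self.output().open("w") as f:
            json.dump(counts, f)


if __name__ == "__main__":
    luigi.build([Histograms()], local_scheduler=True)
```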
Data analysis in particle physics is socially distributed: unlike centrally developed and executed reconstruction pipelines, the analysis work performed after Analysis Object Descriptions (AODs) are made and before the final paper review—which includes particle and event selection, systematic error handling, decay chain reconstruction, histogram aggregation, fitting, statistical models, and...
In the LHCb experiment, a wide variety of Monte Carlo simulated samples need to be produced for the experiment’s physics programme. LHCb has a centralised production system for simulating, reconstructing and processing collision data, which runs on the DIRAC backend on the WLCG.
To cope with a large set of different types of sample, requests for simulation production are based on a concept of...
With the construction and operation of fourth-generation light sources such as the European Synchrotron Radiation Facility Extremely Brilliant Source (ESRF-EBS), the Advanced Photon Source Upgrade (APS-U), the Advanced Light Source Upgrade (ALS-U) and the High Energy Photon Source (HEPS), several advanced biological macromolecule crystallography (MX) beamlines have been or will be built, and thereby the huge amount...
Monte Carlo simulations are a key tool for the physics programmes of High Energy Physics experiments. Their accuracy and reliability are of the utmost importance. A full suite of verifications is in place for the LHCb Simulation software to ensure the quality of the simulated samples produced.
In this contribution we will give a short overview of the procedure and the tests in place, which exploit the...
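As a purely illustrative example of the kind of check such a verification suite can run (not the actual LHCb test code), the sketch below compares a kinematic distribution from a newly produced sample against a reference histogram with a chi-squared test and flags significant deviations.

```python
import numpy as np
from scipy import stats

# Generic sketch: compare a distribution from a new simulated sample
# against a reference histogram and flag significant deviations.
rng = np.random.default_rng(seed=1)
bins = np.linspace(0.0, 10.0, 21)

reference, _ = np.histogram(rng.exponential(2.0, size=100_000), bins=bins)
candidate, _ = np.histogram(rng.exponential(2.0, size=100_000), bins=bins)

# Scale the reference to the candidate's statistics and compute a chi2
# over bins with enough entries to trust the Gaussian approximation.
scale = candidate.sum() / reference.sum()
mask = reference * scale > 10
chi2 = np.sum((candidate[mask] - scale * reference[mask]) ** 2
              / (candidate[mask] + scale**2 * reference[mask]))
ndf = int(mask.sum())
p_value = stats.chi2.sf(chi2, ndf)

flagged = p_value < 0.01  # threshold chosen arbitrarily for this sketch
print(f"chi2/ndf = {chi2:.1f}/{ndf}, p = {p_value:.3f}, flagged = {flagged}")
```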
EPOS 4 is the latest version of the high-energy collision event generator EPOS, released publicly in 2022. It comes with improvements in several respects, whether in the theoretical bases on which it relies, in how they are handled technically, or in the user interface and data compatibility.
This last point is especially important, as part of a commitment to provide the widest...
A mechanism to store in databases all the parameters needed to simulate the detector response to physics interactions is presented. This includes geometry, materials, magnetic field and electronics.
GEMC includes a Python API to populate the databases, and the software to run the Monte Carlo simulation. The engine is written in C++ and uses Geant4 for the passage of particles through...
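The sketch below illustrates the general idea of keeping simulation parameters in a database and selecting them by variation and run range. It uses plain sqlite3 with hypothetical table and column names and is not the GEMC Python API.

```python
import sqlite3

# Generic illustration of database-resident simulation parameters, in the
# spirit of the approach described above; table and column names are
# hypothetical, and this is NOT the GEMC Python API.
conn = sqlite3.connect("detector_parameters.db")
conn.execute(
    """CREATE TABLE IF NOT EXISTS geometry (
           system    TEXT,    -- detector system, e.g. 'drift_chamber'
           volume    TEXT,    -- volume name
           material  TEXT,    -- material name
           pos_x_mm  REAL, pos_y_mm REAL, pos_z_mm REAL,
           variation TEXT,    -- geometry variation tag
           run_min   INTEGER, run_max INTEGER
       )"""
)
conn.execute(
    "INSERT INTO geometry VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("drift_chamber", "sector1_layer1", "ArCO2", 0.0, 0.0, 2310.5,
     "default", 0, 99999),
)
conn.commit()

# A simulation job would then select the parameters valid for its run
# number and variation before building the Geant4 geometry.
row = conn.execute(
    "SELECT volume, material, pos_z_mm FROM geometry "
    "WHERE variation = ? AND ? BETWEEN run_min AND run_max",
    ("default", 4150),
).fetchone()
print(row)
conn.close()
```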
The LHCb software has undergone a major upgrade in view of data taking with higher luminosity in Run 3 of the LHC at CERN.
The LHCb simulation framework, Gauss, had to be adapted to follow the changes in modern technologies of the underlying experiment core software and to introduce new simulation techniques to cope with the increase of the required amount of simulated data. Additional...
Gaussino is a new experiment-independent simulation framework based on the Gaudi data processing framework. It provides generic core components and interfaces to build a complete simulation application: generation, detector simulation, geometry, monitoring, and saving of the simulated data. Thanks to its highly configurable and extendable components, Gaussino can be used both as a toolkit and a...
LCIO is a persistency framework and event data model originally developed to foster closer collaboration among the international groups conducting simulation studies for future linear colliders. In the twenty years since its introduction at CHEP 2003 it has formed the backbone for ILC and CLIC physics and detector studies. It has also been successfully employed to study and develop other...
Modern HEP workflows must manage increasingly large and complex data collections. HPC facilities may be employed to help meet these workflows' growing data processing needs. However, a better understanding of the I/O patterns and underlying bottlenecks of these workflows is necessary to meet the performance expectations of HPC systems.
Darshan is a lightweight I/O characterization tool that...
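For orientation, a minimal sketch of inspecting a Darshan log from Python is shown below. It assumes the PyDarshan package and its `DarshanReport` interface; the exact accessor names (`modules`, `records`, `to_df`) are assumptions and should be checked against the PyDarshan documentation for the installed version.

```python
import darshan  # PyDarshan; accessor names below are assumptions

# Open an existing Darshan log produced by an instrumented workflow job
# (the path is a placeholder) and list which I/O modules saw activity.
report = darshan.DarshanReport("example_workflow.darshan", read_all=True)
print(list(report.modules.keys()))   # e.g. POSIX, MPI-IO, STDIO

# Per-module counters can then be inspected to look for I/O bottlenecks,
# such as many small reads or heavy metadata traffic.
posix = report.records["POSIX"].to_df()
print(posix["counters"].head())
```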
HEP data-processing frameworks are essential ingredients in getting from raw data to physics results. But they are often tricky to use well, and they present a significant learning barrier for the beginning HEP physicist. In addition, existing frameworks typically support rigid, collider-based data models, which do not map well to neutrino-physics experiments like DUNE. Neutrino physicists...
The Belle II software was developed as closed source. As several HEP experiments released their code to the public, the topic of open source software was also discussed within the Belle II collaboration. A task force analyzed advantages and disadvantages and proposed a policy which was adopted by the collaboration in 2020. The Belle II offline software was then released under an open source...
Detector studies for future experiments rely on advanced software tools to estimate performance and optimize their design and technology choices. The Key4hep project provides a flexible turnkey solution for the full experiment life-cycle based on established community tools such as ROOT, Geant4, DD4hep, Gaudi, podio and spack. Members of the CEPC, CLIC, EIC, FCC, and ILC communities have...
The Open Data Detector (ODD) is a detector model for algorithm research and development. Its tracking system is an evolution of the detector used in the successful Tracking Machine Learning Challenge, offering a more realistic design with a support structure, cables, and cooling pipes. The ODD has been extended with granular calorimetry and can be completed in the future with a muon system. The magnetic field...