Conveners
Track 1 - Data and Metadata Organization, Management and Access: Storage
- Mario Lassnig (CERN)
- Martin Barisits (CERN)
Track 1 - Data and Metadata Organization, Management and Access: Networks
- Mario Lassnig (CERN)
- Diego Davila (University of California, San Diego)
Track 1 - Data and Metadata Organization, Management and Access: Clouds & Caches
- Martin Barisits (CERN)
- Michael Kirby (FNAL)
Track 1 - Data and Metadata Organization, Management and Access: Tapes
- Diego Davila (University of California, San Diego)
- Mario Lassnig (CERN)
Track 1 - Data and Metadata Organization, Management and Access: Databases & Metadata
- Mario Lassnig (CERN)
- Martin Barisits (CERN)
Track 1 - Data and Metadata Organization, Management and Access: Data Management
- Michael Kirby (FNAL)
- Martin Barisits (CERN)
Track 1 - Data and Metadata Organization, Management and Access: Analytics & Benchmarks
- Diego Davila (University of California, San Diego)
- Michael Kirby (FNAL)
The dCache project provides open-source software deployed internationally to satisfy
ever more demanding storage requirements. Its multifaceted approach provides an integrated
way of supporting different use-cases with the same storage, from high-throughput data
ingest, data sharing over wide area networks, efficient access from HPC clusters and
long-term data persistence on a tertiary...
XRootD implemented a client-side erasure coding (EC) algorithm utilizing the Intel Intelligent Storage Acceleration Library. At SLAC, a prototype of XRootD EC storage was set up for evaluation. The architecture and configuration of the prototype are almost identical to those of a traditional non-EC XRootD storage behind a firewall: a backend XRootD storage cluster in its simplest form, and an...
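As a rough illustration of the client-side striping idea described in the abstract above, the sketch below splits a block into k data stripes plus a single XOR parity stripe and rebuilds one lost stripe. This is a deliberate simplification: the actual XRootD EC implementation uses Reed-Solomon coding from Intel ISA-L, and all function names here are illustrative assumptions.

```python
# Simplified sketch of client-side striping with one XOR parity stripe.
# The real XRootD EC client uses Reed-Solomon coding (Intel ISA-L); this
# example only shows how a block is split into k data stripes plus parity
# before the stripes are written to separate servers.

def encode_block(block: bytes, k: int):
    """Split a block into k equal data stripes plus one XOR parity stripe."""
    stripe_len = -(-len(block) // k)              # ceiling division
    padded = block.ljust(k * stripe_len, b"\0")   # pad the final stripe
    stripes = [padded[i * stripe_len:(i + 1) * stripe_len] for i in range(k)]
    parity = bytearray(stripe_len)
    for stripe in stripes:
        for i, b in enumerate(stripe):
            parity[i] ^= b
    return stripes, bytes(parity)

def recover_missing(stripes, parity):
    """Rebuild a single missing data stripe (marked as None) from the rest."""
    missing = [i for i, s in enumerate(stripes) if s is None]
    if len(missing) > 1:
        raise ValueError("single XOR parity can repair at most one lost stripe")
    if missing:
        rebuilt = bytearray(parity)
        for s in stripes:
            if s is not None:
                for i, b in enumerate(s):
                    rebuilt[i] ^= b
        stripes[missing[0]] = bytes(rebuilt)
    return stripes
```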
INFN-CNAF is one of the Worldwide LHC Computing Grid (WLCG) Tier-1 data centers, providing computing, networking and storage resources and services to a wide variety of scientific collaborations, ranging from physics to bioinformatics and industrial engineering.
Recently, several collaborations working with our data center have developed computing and data management...
The Storage Group in the CERN IT Department operates several Ceph storage clusters with an overall capacity exceeding 100 PB. Ceph is a crucial component of the infrastructure delivering IT services to all the users of the Organization as it provides: i) Block storage for the OpenStack infrastructure, ii) CephFS used as persistent storage by containers (OpenShift and Kubernetes) and as shared...
Data access at the UK Tier-1 facility at RAL is provided through its ECHO storage, serving the requirements of the WLCG and an increasing number of other HEP and astronomy-related communities.
ECHO is a Ceph-backed erasure-coded object store, currently providing in excess of 40 PB of usable space, with frontend access to data provided via XRootD or GridFTP, using the libradosstriper library of...
EOS has been the main storage system at CERN for more than a decade, continuously improving in order to meet the ever-evolving requirements of the LHC experiments and the whole physics user community. In order to satisfy the demands of LHC Run-3, in terms of storage performance and tradeoff between cost and capacity, EOS was enhanced with a set of new functionalities and features that we will...
The Large Hadron Collider (LHC) experiments distribute data by leveraging a diverse array of National Research and Education Networks (NRENs), where experiment data management systems treat networks as a “black box” resource. After the High Luminosity upgrade, the Compact Muon Solenoid (CMS) experiment alone will produce roughly 0.5 exabytes of data per year. NRENs are a critical part...
We present an NDN-based Open Storage System (OSS) plugin for XRootD instrumented with an accelerated packet forwarder, built for data access in the CMS and other experiments at the LHC, together with its current status, performance as compared to other tools and applications, and plans for ongoing developments.
Named Data Networking (NDN) is a leading Future Internet Architecture where data...
There is increasing demand for the efficiency and flexibility of data transport systems supporting data-intensive sciences. With growing data volume, it is essential that the transport system of a data-intensive science project fully utilize all available transport resources (e.g., network bandwidth); to achieve statistical multiplexing gain, there is an increasing trend that multiple projects...
In 2029 the LHC will start the High-Luminosity LHC (HL-LHC) program, with a boost in the integrated luminosity resulting in an unprecedented amount of experimental and simulated data samples to be transferred, processed and stored in disk and tape systems across the Worldwide LHC Computing Grid (WLCG). Content delivery network (CDN) solutions are being explored with the purposes of improving...
The High-Energy Physics (HEP) and Worldwide LHC Computing Grid (WLCG) communities have faced significant challenges in understanding their global network flows across the world’s research and education (R&E) networks. When critical links, such as transatlantic or transpacific connections, experience high traffic or saturation, it is very challenging to clearly identify which collaborations...
The capture and curation of all primary instrument data is a potentially valuable source of added insight into experiments or diagnostics in laboratory experiments. The data can, when properly curated, enable analysis beyond the current practice that uses just a subset of the as-measured data. Complete curated data can also be input for machine learning and other data exploration tools....
The XRootD S3 Gateway is a universal high-performance proxy service that can be used to access S3 portals using existing HEP credentials (e.g. JSON Web Tokens and X.509). This eliminates one of the biggest roadblocks to using public cloud storage resources. This paper describes how the S3 Gateway leverages existing HEP software (e.g. Davix and XRootD) to provide a familiar scalable service that...
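A minimal sketch of the client-side pattern such a gateway enables: presenting a HEP bearer token (e.g. a JSON Web Token) over HTTPS to retrieve an object that ultimately lives in S3. The gateway URL, object path and token file below are placeholders for illustration, not actual S3 Gateway endpoints or options.

```python
# Hedged sketch: fetching an object through an HTTPS proxy in front of S3
# using a WLCG-style bearer token. Endpoint and paths are placeholders.
import requests

def read_object(gateway_url: str, object_path: str, token_file: str) -> bytes:
    token = open(token_file).read().strip()
    resp = requests.get(
        f"{gateway_url}/{object_path.lstrip('/')}",
        headers={"Authorization": f"Bearer {token}"},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.content

# Example (hypothetical endpoint, bucket/key and token location):
# data = read_object("https://s3gw.example.org:1094",
#                    "mybucket/run123/file.root", "/tmp/bt_u1000")
```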
There has been a significant increase in data volume from various large scientific projects, including the Large Hadron Collider (LHC) experiment. The High Energy Physics (HEP) community will need to move ever larger data volumes over the network, as it expects its annual data volume to grow almost thirty-fold between 2018 and 2028 [1]. To mitigate the repetitive data access issue and network...
Current and future distributed HENP data analysis infrastructures rely increasingly on object stores in addition to regular remote file systems. Such file-less storage systems are popular as a means to escape the inherent scalability limits of the POSIX file system API. Cloud storage is already dominated by S3-like object stores, and HPC sites are starting to take advantage of object stores...
At Brookhaven National Lab, the dCache storage management system is used as a disk cache for large high-energy physics (HEP) datasets, primarily from the ATLAS experiment [1]. Storage space on dCache is considerably smaller than the full ATLAS data collection. Therefore, a policy is needed to determine which data files to keep in the cache and which files to evict. A good policy is to keep...
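For illustration only, the sketch below shows a least-recently-used (LRU) eviction policy of the kind a disk cache smaller than the full data collection might apply; the file names, sizes and capacity handling are assumptions, not the policy actually deployed at BNL.

```python
# Toy LRU eviction policy for a fixed-capacity disk cache.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity_bytes: int):
        self.capacity = capacity_bytes
        self.used = 0
        self.files = OrderedDict()          # path -> size, oldest first

    def access(self, path: str, size: int) -> list:
        """Record an access; return the files evicted to make room."""
        evicted = []
        if path in self.files:
            self.files.move_to_end(path)    # refresh recency on a cache hit
            return evicted
        while self.used + size > self.capacity and self.files:
            old_path, old_size = self.files.popitem(last=False)
            self.used -= old_size
            evicted.append(old_path)
        self.files[path] = size
        self.used += size
        return evicted
```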
In this talk, we present a novel data format design that obviates the need for data tiers by storing individual event data products in column objects. The objects are stored and retrieved through Ceph S3 technology, and a companion metadata system handles tracking of the object lifecycle. Performance benchmarks of data storage and retrieval will be presented, along with scaling tests of the...
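As a hedged sketch of the general approach, the example below writes individual event-data products as column objects into an S3-compatible store and records them in a companion metadata structure. The endpoint, bucket, key layout and metadata fields are assumptions for illustration, not the system described in the abstract.

```python
# Hedged sketch: per-product column objects in an S3-compatible store
# (e.g. Ceph RGW), tracked by a companion metadata record.
import json
import boto3

s3 = boto3.client("s3", endpoint_url="https://ceph-rgw.example.org")  # placeholder

def put_column(bucket: str, dataset: str, event_range: str,
               product: str, payload: bytes, catalogue: dict) -> str:
    key = f"{dataset}/{product}/{event_range}"
    s3.put_object(Bucket=bucket, Key=key, Body=payload)
    # Track the object's lifecycle in the companion metadata record.
    catalogue.setdefault(dataset, []).append(
        {"product": product, "events": event_range, "key": key, "state": "closed"})
    return key

catalogue = {}
put_column("events", "run2023A", "000001-000500", "tracks",
           b"<serialized column data>", catalogue)
print(json.dumps(catalogue, indent=2))
```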
Rucio is a software framework that provides scientific collaborations with the ability to organise, manage and access large volumes of data using customisable policies. The data can be spread across globally distributed locations and across heterogeneous data centres, uniting different storage and network technologies as a single federated entity. Rucio offers advanced features such as...
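A brief, hedged example of the policy-driven replication Rucio exposes through its Python client; the scope, dataset name, RSE expression and lifetime below are placeholders, and the call assumes an already configured Rucio client environment (consult the Rucio documentation for authoritative signatures).

```python
# Hedged sketch: declaring a replication rule with the Rucio Python client.
from rucio.client import Client

client = Client()

# Ask Rucio to keep two copies of a dataset on any Tier-1 disk endpoint,
# expiring the rule after 30 days (lifetime is given in seconds).
client.add_replication_rule(
    dids=[{"scope": "user.jdoe", "name": "analysis_output_2024"}],  # placeholder DID
    copies=2,
    rse_expression="tier=1&type=DISK",                              # placeholder RSEs
    lifetime=30 * 24 * 3600,
)
```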
The goal of the “HTTP REST API for Tape” project is to provide a simple, minimalistic and uniform interface to manage data transfers between Storage Endpoints (SEs) where the source file is on tape. The project is a collaboration between the developers of WLCG storage systems (EOS+CTA, dCache, StoRM) and data transfer clients (gfal2, FTS). For some years, HTTP has been growing in popularity as...
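The sketch below illustrates, under stated assumptions, how a client might issue a bulk stage request to such an interface over HTTP; the endpoint path, JSON fields and token handling are illustrative, and the authoritative definition is the specification produced by the project.

```python
# Hedged sketch of a bulk stage (recall-from-tape) request over HTTP.
import requests

def request_stage(endpoint: str, paths: list, token: str) -> str:
    body = {"files": [{"path": p} for p in paths]}        # assumed payload shape
    resp = requests.post(
        f"{endpoint}/stage",                               # assumed endpoint path
        json=body,
        headers={"Authorization": f"Bearer {token}"},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json().get("requestId", "")

# request_id = request_stage("https://se.example.org:8444/api/v1",
#                            ["/archive/run123/file1.root"], token="...")
```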
CDS (Custodial Disk Storage), a disk-based custodial storage system powered by the CERN EOS storage system, has been operating for the ALICE experiment at the KISTI Tier-1 Centre since November 2021. The CDS replaced the existing tape storage operated for almost a decade, after its stable demonstration in the WLCG Tape Challenges in October 2021. We tried to challenge the economy of tape storage in the...
The High Luminosity upgrade to the LHC (HL-LHC) is expected to deliver scientific data at the multi-exabyte scale. In order to address this unprecedented data storage challenge, the ATLAS experiment launched the Data Carousel project in 2018. Data Carousel is a tape-driven workflow whereby bulk production campaigns with input data resident on tape are executed by staging and promptly...
The CERN IT Department is responsible for ensuring the integrity and security of data stored in the IT Storage Services. General storage backends such as EOSHOME/PROJECT/MEDIA and CEPHFS are used to store data for a wide range of use cases for all stakeholders at CERN, including experiment project spaces and user home directories.
In recent years a backup system, CBACK, was developed based...
The CERN Tape Archive (CTA) was conceived as the successor to CASTOR and as the tape back-end to EOS, designed for the archival storage of data from LHC Run-3 and other experimental programmes at CERN. In the wider WLCG, the tape software landscape is quite heterogeneous, but we are now entering a period of consolidation. This has led to a number of sites in WLCG (and beyond) reevaluating their...
The development of an LHC physics analysis involves numerous investigations that require the repeated processing of terabytes of measured and simulated data. Thus, a rapid processing turnaround is beneficial to the scientific process. We identified two bottlenecks in analysis-independent algorithms and developed the following solutions.
First, inputs are now cached on individual SSD caches of...
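A minimal sketch, assuming a node-local SSD mount and XRootD-accessible inputs, of the cache-or-fetch pattern such an SSD cache implies: reuse a local copy if present, otherwise fetch it once with xrdcp. The cache directory and URL handling are illustrative assumptions.

```python
# Hedged sketch of a node-local SSD cache for analysis inputs.
import hashlib
import pathlib
import subprocess

CACHE_DIR = pathlib.Path("/scratch/ssd-cache")   # assumed node-local SSD mount

def cached_input(remote_url: str) -> pathlib.Path:
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    local = CACHE_DIR / hashlib.sha1(remote_url.encode()).hexdigest()
    if not local.exists():
        # Fetch the input once; subsequent reads hit the local SSD copy.
        subprocess.run(["xrdcp", remote_url, str(local)], check=True)
    return local
```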
Rucio is data management software that has become a de facto standard in the HEP community and beyond. It allows the management of large volumes of data over their full lifecycle. The Belle II experiment, located at KEK (Japan), recently moved to Rucio to manage its data over the coming decade (O(10) PB/year). In addition to its data management functionalities, Rucio also provides support for...
The ATLAS experiment is preparing a major change in the conditions data infrastructure in view of Run 4. In this presentation we will present the main motivations for the new design (called CREST, for Conditions-REST), the ongoing changes in the DB architecture, and the developments for the deployment of the new system. The main goal is to set up a parallel infrastructure for full-scale...
The ALICE experiment at CERN has undergone a substantial detector, readout and software upgrade for LHC Run 3. A signature part of the upgrade is the triggerless detector readout, which necessitates real-time lossy data compression from 1.1 TB/s to 100 GB/s, performed on a GPU/CPU cluster of 250 nodes. To perform this compression, a significant part of the software, which traditionally is...
The HSF Conditions Databases activity is a forum for cross-experiment discussions hoping for as broad a participation as possible. It grew out of the HSF Community White Paper work to study conditions data access, where experts from ATLAS, Belle II, and CMS converged on a common language and proposed a schema that represents best practice. The focus of the HSF work is the most difficult use...
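As an illustration of the kind of layout such best-practice discussions converge on (global tags grouping per-subsystem tags, each tag mapping intervals of validity to immutable payloads addressed by hash), the sketch below builds a toy relational version; the table and column names are assumptions, not the HSF schema itself.

```python
# Toy conditions-data layout: global tags -> tags -> IOVs -> payloads by hash.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE payload    (hash TEXT PRIMARY KEY, data BLOB);
CREATE TABLE tag        (name TEXT PRIMARY KEY, time_type TEXT);
CREATE TABLE iov        (tag_name TEXT REFERENCES tag(name),
                         since INTEGER,
                         payload_hash TEXT REFERENCES payload(hash),
                         PRIMARY KEY (tag_name, since));
CREATE TABLE global_tag (name TEXT, tag_name TEXT REFERENCES tag(name),
                         PRIMARY KEY (name, tag_name));
""")

conn.execute("INSERT INTO payload VALUES (?, ?)", ("abc123", b"\x00\x01"))
conn.execute("INSERT INTO tag VALUES (?, ?)", ("PixelAlignment-v1", "run-lumi"))
conn.execute("INSERT INTO iov VALUES (?, ?, ?)", ("PixelAlignment-v1", 0, "abc123"))

def payload_at(tag_name: str, time: int):
    """Resolve the payload valid at a given time for one tag."""
    row = conn.execute(
        """SELECT p.data FROM iov i JOIN payload p ON p.hash = i.payload_hash
           WHERE i.tag_name = ? AND i.since <= ?
           ORDER BY i.since DESC LIMIT 1""",
        (tag_name, time)).fetchone()
    return row[0] if row else None

print(payload_at("PixelAlignment-v1", 42))
```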
The ATLAS EventIndex is a global catalogue of the events collected, processed or generated by the ATLAS experiment. The system was upgraded in advance of LHC Run 3, with a migration of the Run 1 and Run 2 data from HDFS MapFiles to HBase tables with a Phoenix interface. The frameworks for testing functionality and performance of the new system have been developed. There are two types of tests...
The Data Lake concept has promised increased value to science and more efficient operations for storage compared to the traditional isolated storage deployments. Building on the established distributed dCache serving as the Nordic Tier-1 storage for LHC data, we have also integrated tier-2 pledged storage in Slovenia, Sweden, and Switzerland, resulting in a coherent storage space well above...
China’s High Energy Photon Source (HEPS), the first national high-energy synchrotron radiation light source and soon one of the world’s brightest fourth-generation synchrotron radiation facilities, is under intense construction in Beijing’s Huairou District and will be completed in 2025.
To make sure that the huge amount of data collected at HEPS is accurate, available and...
The Vera C. Rubin Observatory is preparing for execution of the most ambitious astronomical survey ever attempted, the Legacy Survey of Space and Time (LSST). Currently in its final phase of construction in the Andes mountains in Chile and due to start operations in late 2024 for 10 years, its 8.4-meter telescope will nightly scan the southern sky and collect images of the entire visible sky...
ALICE is one of the four large experiments at the CERN LHC designed to study the structure and origins of matter in collisions of heavy ions (and protons) at ultra-relativistic energies. The experiment measures the particles produced as a result of collisions in its center so that it can reconstruct and study the evolution of the system produced during these collisions. To perform these...
The File Transfer System (FTS) is a software system responsible for queuing, scheduling, dispatching and retrying file transfer requests. It is used by three of the LHC experiments, namely ATLAS, CMS and LHCb, as well as by non-LHC experiments including AMS, DUNE and NA62. FTS is critical to the success of many experiments and the service must remain available and performant during the entire...
HPC systems are increasingly used to address various challenges in high-energy physics, but the data infrastructures used in this field are often not well integrated with infrastructures that include HPC resources. Here we will focus on a specific infrastructure, namely Fenix, which is based on a consortium of six leading European supercomputing centres. The Fenix sites are...
The rapid growth of scientific data and the computational needs of BNL-supported science programs will bring the Scientific Data and Computing Center (SDCC) to the Exabyte scale in the next few years. The SDCC Storage team is responsible for the symbiotic development and operations of storage services for all BNL experiment data, in particular for the data generated by the ATLAS experiment...
In the HEP community, the prediction of data popularity is a topic that has been approached for many years. Nonetheless, while facing increasing data storage challenges, especially in the HL-LHC era, we are still in need of better predictive models to answer the questions of whether particular data should be kept, replicated, or deleted.
The usage of caches proved to be a convenient...
Complete and reliable monitoring of the WLCG data transfers is an important condition for effective computing operations of the LHC experiments. WLCG data challenges organised in 2021 and 2022 highlighted the need for improvements in the monitoring of data traffic on the WLCG infrastructure. In particular, it concerns the implementation of the monitoring of the remote data access via the...
Due to the increased network traffic demand expected during the HL-LHC era, the T2 sites in the USA will be required to have 400 Gbps of available bandwidth to their storage solution.
With this in mind, we are pursuing a scale test of the XRootD software when used to perform Third Party Copy transfers using the HTTP protocol. Our main objective is to understand the possible limitations in...
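A hedged sketch of what an HTTP third-party-copy request looks like at the protocol level in push mode: a COPY request is sent to the source endpoint with a Destination header, and a credential for the remote side is forwarded in a transfer header. The URLs, tokens and exact header set below are illustrative; the HTTP-TPC specification is authoritative.

```python
# Hedged sketch: initiating an HTTP third-party copy (push mode).
import requests

def third_party_copy(source_url: str, dest_url: str,
                     source_token: str, dest_token: str) -> int:
    resp = requests.request(
        "COPY",
        source_url,
        headers={
            "Destination": dest_url,
            "Authorization": f"Bearer {source_token}",
            # Forwarded by the source server to authenticate at the destination:
            "TransferHeaderAuthorization": f"Bearer {dest_token}",
        },
        timeout=300,
    )
    return resp.status_code

# status = third_party_copy("https://src.example.org:1094/store/file.root",
#                           "https://dst.example.org:1094/store/file.root",
#                           source_token="...", dest_token="...")
```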
In preparation for the second runs of the ProtoDUNE detectors at CERN (NP02 and NP04), DUNE has established a new data pipeline for bringing the data from the EHN-1 experimental hall at CERN to primary tape storage at Fermilab and CERN, and then spreading it out to a distributed disk data store at many locations around the world. This system includes a new Ingest Daemon and a new Declaration...
The LArSoft/art framework is used at Fermilab’s liquid argon time projection chamber experiments such as ICARUS to run traditional production workflows in a grid environment. It has become increasingly important to utilize HPC facilities for experimental data processing tasks. As part of the SciDAC-4 HEP Data Analytics on HPC and HEP Event Reconstruction with Cutting Edge Computing...