May 8 – 12, 2023
Norfolk Waterside Marriott
US/Eastern timezone

Federated Access to Distributed Storage in the EIC Computing Era

May 11, 2023, 11:15 AM
15m
Marriott Ballroom II-III (Norfolk Waterside Marriott)
235 East Main Street, Norfolk, VA 23510

Oral Track 4 - Distributed Computing

Speaker

Poat, Michael (BNL)

Description

The Electron-Ion Collider (EIC) collaboration and its future experiment form a unique scientific ecosystem within Nuclear Physics, as the experiment begins from the outset as a cross-collaboration between Brookhaven National Laboratory (BNL) and Jefferson Lab (JLab). As a result, this multi-lab computing model strives to provide services accessible from anywhere by anyone in the collaboration. While the computing model for the EIC is not finalized, it is anticipated that computational and storage resources will be made accessible to a wide range of collaborators across the world. The use of Federated ID appears to be a critical element of the strategy for providing such services, allowing seamless access to each lab's computing resources. However, providing federated access to federated storage is not a trivial matter and comes with its share of technical challenges.

In this contribution, we will focus on the steps we took toward deploying a distributed object storage system that combines an Amazon S3 interface with Federated ID. We will first describe the first-stage storage solution provided to the EIC during the detector design phase: an initial test deployment consisting of Lustre storage fronted by MinIO, which provides an S3 interface. High-availability load balancers were added later to supply the scalability the system initially lacked, and we will show the performance of that setup. While this embryonic solution worked well, it had many limitations. Looking ahead, Ceph object storage is considered a top-of-the-line solution in the storage community; since the Ceph Object Gateway is compatible with the Amazon S3 API out of the box, our next phase will use native S3 storage. Our Ceph deployment will consist of erasure-coded storage nodes to maximize usable capacity, along with multiple Ceph Object Gateways for redundant access. We will compare the performance of this next-stage implementation. Finally, we will present how to leverage OpenID Connect with the Ceph Object Gateway to enable Federated ID access.
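To illustrate the S3 compatibility described above, the short Python sketch below shows how a collaborator could read and write objects through either backend (MinIO or the Ceph Object Gateway) using the same Amazon S3 client code; the endpoint URL, bucket name, object keys, and credentials are hypothetical placeholders, not the actual EIC deployment.

    import boto3

    # The same client code works against either backend, since both MinIO and
    # the Ceph Object Gateway expose the Amazon S3 API.
    s3 = boto3.client(
        "s3",
        endpoint_url="https://s3.eic.example.org:9000",  # hypothetical endpoint
        aws_access_key_id="EXAMPLE_ACCESS_KEY",          # placeholder credentials
        aws_secret_access_key="EXAMPLE_SECRET_KEY",
    )

    # Upload a file and list the bucket contents (bucket and key names are made up).
    s3.upload_file("run001.sim.root", "eic-simulations", "campaigns/2023/run001.sim.root")
    for obj in s3.list_objects_v2(Bucket="eic-simulations").get("Contents", []):
        print(obj["Key"], obj["Size"])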

We hope this contribution will serve the community's needs as we move forward with cross-lab collaborations and the growing need for Federated ID access to distributed computing facilities.
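As a rough sketch of the OpenID Connect flow mentioned above, the example below exchanges an OIDC token for temporary S3 credentials through the Ceph Object Gateway's STS interface (AssumeRoleWithWebIdentity) and then uses them against the same endpoint; the endpoint, role name, and token source are assumptions made for illustration, not the actual EIC configuration.

    import boto3

    RGW_ENDPOINT = "https://rgw.eic.example.org"  # hypothetical Ceph Object Gateway endpoint

    # Token obtained from the federated identity provider (retrieval is site specific).
    with open("/tmp/oidc_token") as f:
        oidc_token = f.read().strip()

    # Exchange the OIDC token for temporary S3 credentials via the RGW STS API.
    # The call is authorized by the web-identity token itself, so the client
    # keys here are dummy placeholders.
    sts = boto3.client(
        "sts",
        endpoint_url=RGW_ENDPOINT,
        aws_access_key_id="placeholder",
        aws_secret_access_key="placeholder",
    )
    response = sts.assume_role_with_web_identity(
        RoleArn="arn:aws:iam:::role/EICFederatedUser",  # assumed role name, for illustration
        RoleSessionName="eic-analysis-session",
        WebIdentityToken=oidc_token,
        DurationSeconds=3600,
    )
    creds = response["Credentials"]

    # Use the temporary credentials against the same S3 endpoint.
    s3 = boto3.client(
        "s3",
        endpoint_url=RGW_ENDPOINT,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    print([bucket["Name"] for bucket in s3.list_buckets()["Buckets"]])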

Consider for long presentation: No

Primary authors

Poat, Michael (BNL); Lauret, Jerome (Brookhaven Science Associates); Rao, Tejas (Brookhaven National Laboratory)

Presentation materials
