May 8 – 12, 2023
Norfolk Waterside Marriott
US/Eastern timezone

Enabling Storage Business Continuity and Disaster Recovery with Ceph distributed storage

May 8, 2023, 11:45 AM
15m
Norfolk Ballroom III-V (Norfolk Waterside Marriott)

235 East Main Street Norfolk, VA 23510
Oral: Track 1 - Data and Metadata Organization, Management and Access

Speaker

Bocchi, Enrico (CERN)

Description

The Storage Group in the CERN IT Department operates several Ceph storage clusters with an overall capacity exceeding 100 PB. Ceph is a crucial component of the infrastructure that delivers IT services to all users of the Organization, as it provides: i) block storage for the OpenStack infrastructure; ii) CephFS, used as persistent storage by containers (OpenShift and Kubernetes) and as a shared filesystem by HPC clusters; and iii) S3 object storage for cloud-native applications, monitoring, and software distribution across the WLCG.

The Ceph infrastructure at CERN has been rationalized and restructured to offer storage solutions for high(er) availability and Business Continuity / Disaster Recovery (BC/DR). In this contribution, we give an overview of how we transitioned from a single RBD zone to multiple zones, enabling storage availability zones, and of how the RBD mirroring functionalities available in upstream Ceph have been hardened. We also illustrate future plans for storage BC/DR, including backups via restic to S3 and tape, replication of objects across multiple storage zones, and the instantiation of clusters spanning different computing centres.

Consider for long presentation: No
