Speaker
Description
The Vera C. Rubin Observatory, currently under construction in Chile, is scheduled to begin the Legacy Survey of Space and Time (LSST) in late 2024 and to run it for 10 years. Its 8.4-meter telescope will repeatedly survey the southern sky in six optical bands, covering it in less than 4 nights, and will generate about 2000 exposures per night, corresponding to a data volume of about 20 TB every night. Three data facilities are preparing to contribute to the production of the annual data releases: the US Data Facility (USDF) will process 25% of the raw data, the UK Data Facility (UKDF) will process another 25%, and the French Data Facility (FrDF), operated by CC-IN2P3, will locally process the remaining 50%.
In the context of Data Preview 0.2 (DP0.2), the Data Release Production (DRP) pipelines were executed on the DC2 simulated dataset, generated by the Dark Energy Science Collaboration (DESC). This dataset includes 20 000 simulated exposures, representing 300 square degrees of Rubin images at a depth typical of 5 years of observations.
DP0.2 ran at the Interim Data Facility (hosted on Google Cloud), and the full exercise was replicated at CC-IN2P3. During this exercise, 3 PiB of data and more than 200 million files were produced. In this contribution we present a detailed description of the system we set up to perform this processing campaign using CC-IN2P3's computing and storage infrastructure. Several topics will be addressed: workflow generation and execution, batch job submission, memory and I/O requirements, and operations. We will focus on the issues that arose during this campaign and how they were addressed, and will present the lessons learnt from this exercise.
Consider for long presentation: No