May 8 – 12, 2023
Norfolk Waterside Marriott
US/Eastern timezone

Lightweight Distributed Computing System Oriented to LHAASO Data Processing

May 11, 2023, 2:30 PM
15m
Marriott Ballroom II-III (Norfolk Waterside Marriott)

235 East Main Street Norfolk, VA 23510
Oral, Track 4 - Distributed Computing

Speaker

CHENG, Yaodong (IHEP, CAS)

Description

The Large High Altitude Air Shower Observatory (LHAASO) is a large-scale astrophysics experiment led by China. Its offline data processing was highly dependent on the local cluster and local file system at the Institute of High Energy Physics (IHEP).
The computing resources of the LHAASO collaboration groups are geographically distributed, and most of them are limited in scale, low in stability, and short of support staff, so they are difficult to integrate via the Grid. We designed and developed a lightweight distributed computing system for LHAASO offline data processing. Unlike the grid model, the system keeps the IHEP cluster as the main cluster and extends it to the worker nodes of the remote site. LHAASO jobs are submitted to the IHEP cluster and dispatched to the remote worker nodes in the system.
Tokens provide authentication and authorization across the whole cluster. LHAASO computing tasks are classified into several types, and each type of job is wrapped by a dedicated script so that the job has no direct access to the IHEP file system. The system draws on the “startd automatic cluster joining” idea of GlideinWMS but abandons grid certificate authentication.
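As a purely illustrative sketch of the "one wrapper script per job type" idea, the snippet below maps a job type to a wrapper and builds the command a remote worker might run. It is not the LHAASO system's code: the job type names, the wrapper script names, and the `Job` and `wrap_command` helpers are all hypothetical, since the abstract does not list them.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical job types and wrapper script names (not given in the abstract).
WRAPPERS = {
    "simulation": "wrap_simulation.sh",
    "reconstruction": "wrap_reconstruction.sh",
}


@dataclass
class Job:
    job_type: str
    executable: str
    arguments: list[str] = field(default_factory=list)


def wrap_command(job: Job) -> list[str]:
    """Return the command line for a job, prefixed by its type's wrapper.

    Jobs of an unknown type are rejected, so every job that runs
    goes through a wrapper.
    """
    try:
        wrapper = WRAPPERS[job.job_type]
    except KeyError:
        raise ValueError(f"unsupported job type: {job.job_type}") from None
    return [wrapper, job.executable, *job.arguments]
```

Rejecting unclassified job types keeps the wrapper as the only entry point. In a real deployment the wrapper would presumably also handle data access on the job's behalf.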
About 125 worker nodes with 4k CPU cores at the remote site have joined the IHEP LHAASO cluster through the distributed computing system, and the LHAASO jobs running on them produced 700 TB of simulation data in 6 months.
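For a rough sense of scale, the quoted figures can be turned into averages. This is a back-of-the-envelope sketch only: it assumes "4k" means 4000 cores, decimal terabytes, and 30-day months, none of which the abstract specifies, and the inputs themselves are approximate.

```python
# Figures quoted in the abstract (approximate).
nodes = 125
cores = 4000          # assumption: "4k" read as 4000
data_tb = 700
months = 6

days = months * 30    # assumption: 30-day months
seconds = days * 86400

cores_per_node = cores / nodes              # average cores per worker node
tb_per_day = data_tb / days                 # average daily output in TB
mb_per_second = data_tb * 1e6 / seconds     # average sustained rate in MB/s

print(f"{cores_per_node:.0f} cores per node")
print(f"{tb_per_day:.2f} TB per day")
print(f"{mb_per_second:.1f} MB/s sustained")
```

Under these assumptions the averages come to about 32 cores per node, a little under 4 TB per day, and roughly 45 MB/s of sustained output.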

Consider for long presentation: Yes

Primary authors

Shi, Jingyan
Mr Jiang, Xiaowei (Institute of High Energy Physics)
Mr Guo, Chaoqi (Institute of High Energy Physics)
Dr Du, Ran (Institute of High Energy Physics)
CHENG, Yaodong (IHEP, CAS)