26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY & NUCLEAR PHYSICS (CHEP2023)

Name: 26TH INTERNATIONAL CONFERENCE ON COMPUTING IN HIGH ENERGY & NUCLEAR PHYSICS (CHEP2023)
Start: 2023-05-08T08:00:00-04:00
End: 2023-05-12T16:00:00-04:00
Location: Norfolk Waterside Marriott

US/Eastern timezone

Conference Secretariat

chep2023-secretariat@jlab.org

Distributed Machine Learning with PanDA and iDDS in LHC ATLAS

May 8, 2023, 11:00 AM

15m

Marriott Ballroom II-III (Norfolk Waterside Marriott)

235 East Main Street Norfolk, VA 23510

Oral, Track 4 - Distributed Computing

Weber, Christian (Brookhaven National Laboratory)

Machine learning has become an important tool for High Energy Physics analysis. As datasets at the Large Hadron Collider (LHC) grow and search spaces expand to exploit the full physics potential, machine learning tasks require ever more computing resources. In addition, advanced machine learning workflows are being developed in which one task may depend on the results of previous tasks. How to use the vast distributed CPU and GPU resources of the WLCG (Worldwide LHC Computing Grid) for these large, complex machine learning tasks has become an active area of work. In this presentation, we describe our efforts on distributed machine learning with PanDA and iDDS (intelligent Data Delivery Service). We first address the difficulties of running machine learning tasks on distributed WLCG resources. We then present our implementation in iDDS, which uses DAGs (Directed Acyclic Graphs) and sliced parameters to distribute machine learning tasks across computing resources and execute them in parallel through PanDA. Next, we demonstrate use cases we have implemented, such as Hyperparameter Optimization, Monte Carlo Toy confidence limit calculation, and Active Learning. Finally, we outline directions for future work.
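The idea of combining a DAG with sliced parameters can be illustrated with a small, purely local sketch. It does not use the iDDS or PanDA APIs: every name below (slice_parameters, run_dag, build_hpo_workflow, toy_loss) is hypothetical, a thread pool stands in for distributed resources, and a toy loss function stands in for a real training job. The sketch splits a hyperparameter grid into slices, evaluates the slices as independent tasks in parallel, and runs a final merge task only after all of them have finished.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # Python 3.9+
from itertools import product


def slice_parameters(grid, n_slices):
    """Expand a parameter grid and split its points into n_slices round-robin chunks."""
    names = sorted(grid)
    points = [dict(zip(names, values))
              for values in product(*(grid[name] for name in names))]
    return [points[i::n_slices] for i in range(n_slices)]


def toy_loss(point):
    """Hypothetical objective standing in for a real training job."""
    return (point["lr"] - 0.01) ** 2 + (point["depth"] - 4) ** 2


def evaluate_slice(points):
    """One 'job': score every point in a slice, return the best (loss, point) pair."""
    return min(((toy_loss(p), p) for p in points), key=lambda pair: pair[0])


def run_dag(tasks, dependencies, max_workers=4):
    """Run tasks in dependency order; tasks that are ready together run concurrently.

    tasks:        name -> callable taking a dict of upstream results
    dependencies: name -> set of upstream task names
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the graph is not acyclic
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            futures = {
                name: pool.submit(
                    tasks[name], {dep: results[dep] for dep in dependencies[name]}
                )
                for name in sorter.get_ready()
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


def build_hpo_workflow(grid, n_slices):
    """Build one task per parameter slice plus a merge task that depends on all of them."""
    tasks, dependencies = {}, {}
    slices = slice_parameters(grid, n_slices)
    for i, points in enumerate(slices):
        name = f"slice_{i}"
        # Bind the slice as a default argument so each task keeps its own points.
        tasks[name] = lambda upstream, points=points: evaluate_slice(points)
        dependencies[name] = set()
    tasks["merge"] = lambda upstream: min(upstream.values(), key=lambda pair: pair[0])
    dependencies["merge"] = {f"slice_{i}" for i in range(len(slices))}
    return tasks, dependencies


grid = {"lr": [0.001, 0.01, 0.1], "depth": [2, 4, 8]}
tasks, dependencies = build_hpo_workflow(grid, n_slices=3)
results = run_dag(tasks, dependencies)
best_loss, best_point = results["merge"]
print(best_point)
```

In this sketch the slice tasks have no dependencies and are therefore submitted together, while the merge task becomes ready only once every slice result is available. A distributed system would replace the thread pool with job submission to remote resources, which is beyond what this illustration attempts.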

Consider for long presentation: No

De, Kaushik (University of Texas at Arlington); Guan, Wen (Brookhaven National Laboratory); Karavakis, Edward (Brookhaven National Laboratory); Klimentov, Alexei (Brookhaven National Laboratory); Lin, Fa-Hui (University of Texas at Arlington); Maeno, Tadashi (Brookhaven National Laboratory (US)); Megino, Fernando Harald Barreiro (The University of Texas at Arlington); Nilsson, Paul (Brookhaven National Laboratory); Weber, Christian (Brookhaven National Laboratory); Wenaus, Torre (BNL); Yang, Zhaoyu (Brookhaven National Laboratory); Zhang, Rui; Zhao, Xin (Brookhaven National Laboratory (US))

dml_idds_20230405 - annotated draft 2.pptx

iDDS_ML_CHEP23_v2.pdf
