May 8 – 12, 2023
Norfolk Waterside Marriott
US/Eastern timezone

Results from Large-scale HPC deployment of Scalable CyberInfrastructure for Artificial Intelligence and Likelihood Free Inference (SCAILFIN)

May 11, 2023, 12:00 PM
15m
Marriott Ballroom IV (Norfolk Waterside Marriott)

235 East Main Street Norfolk, VA 23510
Oral, Track 7 - Facilities and Virtualization

Speaker

Dr Hurtado Anampa, Kenyi (University of Notre Dame)

Description

The NSF-funded Scalable CyberInfrastructure for Artificial Intelligence and Likelihood Free Inference (SCAILFIN) project has developed and deployed artificial intelligence (AI) and likelihood-free inference (LFI) techniques and software using scalable cyberinfrastructure (CI) built on top of existing CI elements. Specifically, the project has extended the CERN-based REANA framework, a cloud-based data analysis platform deployed on top of Kubernetes clusters that was originally designed to enable analysis reusability and reproducibility. REANA can orchestrate extremely complicated multi-step workflows, and uses Kubernetes both to schedule and distribute container-based workloads across a cluster of available machines and to instantiate and monitor the concrete workloads themselves. This work describes the components that were developed to enable large-scale deployment on High Performance Computing (HPC) resources. Scaling and performance results using large-scale MadMiner AI/LFI training workflows on a variety of large HPC sites will be presented.
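
As a rough illustration of what orchestrating a multi-step workflow involves, the sketch below groups the steps of a small dependency graph into batches that could run concurrently. It is not REANA's API or the SCAILFIN implementation: the step names and the `schedule_batches` helper are hypothetical, and only Python's standard `graphlib` module is used.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
# The names are illustrative only and are not taken from a real MadMiner workflow.
WORKFLOW = {
    "generate_signal": set(),
    "generate_background": set(),
    "simulate_detector": {"generate_signal", "generate_background"},
    "train_model": {"simulate_detector"},
    "evaluate_model": {"train_model"},
}


def schedule_batches(workflow):
    """Group steps into batches; steps in one batch do not depend on each other."""
    sorter = TopologicalSorter(workflow)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        # Every step whose dependencies are all done is ready to run now.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        # Mark the whole batch as finished so dependent steps become ready.
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for index, batch in enumerate(schedule_batches(WORKFLOW), start=1):
        print(f"batch {index}: {', '.join(batch)}")
```

In a real deployment each ready step would be handed to a backend (a Kubernetes job or an HPC batch job) rather than marked done immediately; the sketch covers only the ordering logic.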

Consider for long presentation: No

Primary authors

Hildreth, Michael (University of Notre Dame (US))
Dr Hurtado Anampa, Kenyi (University of Notre Dame)
