May 8 – 12, 2023
Norfolk Waterside Marriott
US/Eastern timezone

Deploying a machine learning model catalog at CERN

Not scheduled
1h
Hampton Roads Ballroom and Foyer Area (Norfolk Waterside Marriott)

235 East Main Street Norfolk, VA 23510
Poster (Poster Session)

Speaker

Mr Da Costa Cardoso, Renato Paulo (CERN)

Description

As machine learning models see wider use, increasingly complex algorithms are being studied. On the one hand, development and optimisation become more challenging; on the other hand, studies of model generalisation and re-usability become more interesting. In this context, efficient and flexible ways to track continuous changes during development, together with relevant performance metrics recorded for and during training, become essential to help future users use or modify the model. The Oracle Accelerated Data Science SDK (ADS) is a Python library, part of the Oracle Cloud Infrastructure (OCI) Data Science service, that helps users extend their work on common data science tasks in an intuitive way. Oracle ADS simplifies access to the OCI Data Science Service Model Catalog to store, evaluate and train a machine learning model, while offering key features throughout the training pipeline. As part of a CERN initiative dedicated to exploring and understanding the limitations and strengths of different state-of-the-art machine learning frameworks, in order to design and optimise ML-dedicated infrastructure, we explored the functionalities of Oracle ADS and OCI Data Science using a complex hybrid autoencoder/GAN model. We investigate their integrated notebook training environment with custom Conda packages, supporting both CPUs and GPUs, and extend this functionality to save the model in the model catalog and further deploy it for continuous development. We compare its functionalities to those of other commonly used (Kubeflow-based) tools, looking into OCI Data Science Jobs and Pipelines, which extend the functionalities of the training process. Finally, we discuss the challenges of integrating the Oracle ADS Model Catalog within the CERN computing infrastructure.
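
To make the idea of a model catalog concrete, the following is a minimal, purely illustrative sketch in standard-library Python. It is not the Oracle ADS or OCI Data Science API: the names ModelEntry, ModelCatalog, save, latest and best are hypothetical, and the metric values are made up for the example. It only shows the kind of bookkeeping the abstract refers to, namely versioned entries that carry training metrics and free-form tags so that later users can find and reuse a model.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelEntry:
    """One saved version of a model (hypothetical structure)."""
    name: str
    version: int
    metrics: Dict[str, float]
    tags: Dict[str, str] = field(default_factory=dict)


class ModelCatalog:
    """In-memory stand-in for a model catalog (illustration only)."""

    def __init__(self) -> None:
        self._entries: Dict[str, List[ModelEntry]] = {}

    def save(self, name: str, metrics: Dict[str, float],
             tags: Optional[Dict[str, str]] = None) -> ModelEntry:
        # Each save creates a new version; versions start at 1.
        versions = self._entries.setdefault(name, [])
        entry = ModelEntry(name, len(versions) + 1, dict(metrics), dict(tags or {}))
        versions.append(entry)
        return entry

    def latest(self, name: str) -> ModelEntry:
        # Most recently saved version of the named model.
        return self._entries[name][-1]

    def best(self, name: str, metric: str,
             higher_is_better: bool = True) -> ModelEntry:
        # Select the version with the best recorded value of `metric`.
        candidates = [e for e in self._entries[name] if metric in e.metrics]
        if not candidates:
            raise KeyError(f"no version of {name!r} records metric {metric!r}")
        pick = max if higher_is_better else min
        return pick(candidates, key=lambda e: e.metrics[metric])


if __name__ == "__main__":
    catalog = ModelCatalog()
    # Made-up metric values, for illustration only.
    catalog.save("ae_gan", {"val_loss": 0.42}, {"device": "cpu"})
    catalog.save("ae_gan", {"val_loss": 0.31}, {"device": "gpu"})
    catalog.save("ae_gan", {"val_loss": 0.35}, {"device": "gpu"})
    best = catalog.best("ae_gan", "val_loss", higher_is_better=False)
    print(best.version, best.metrics)
```

A real catalog service would additionally persist the model artifact and the description of its environment (for example the Conda packages used), which this sketch leaves out.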

Consider for long presentation: No
