May 8 – 12, 2023
Norfolk Waterside Marriott
US/Eastern timezone

Demand-driven provisioning of Kubernetes-like resources in OSG

May 8, 2023, 12:00 PM
15m
Marriott Ballroom IV (Norfolk Waterside Marriott)

235 East Main Street Norfolk, VA 23510
Oral
Track 7 - Facilities and Virtualization

Speakers

Andrijauskas, Fabio
Schultz, David (IceCube, University of Wisconsin-Madison)

Description

The OSG-operated Open Science Pool is an HTCondor-based virtual cluster that aggregates resources from compute clusters provided by several organizations. A user can submit batch jobs to the OSG-maintained scheduler, and they will eventually run on a combination of supported compute clusters without any further user action. Most of the resources are not owned by, or even dedicated to, OSG, so demand-based dynamic provisioning is important for maximizing usage without incurring excessive waste.
OSG has long relied on GlideinWMS for most of its resource provisioning needs, but GlideinWMS is limited to resources that provide a Grid-compliant Compute Entrypoint. To work around this limitation, the OSG software team developed a pilot container that resource providers can use to contribute directly to the OSPool. The problem with that approach is that it is not demand-driven, which relegates it to backfill scenarios only.
To address this limitation, a demand-driven direct provisioner of Kubernetes resources has been developed and successfully used on the PRP. The setup still relies on the OSG-maintained backfill container images; it only automates the provisioning matchmaking and successive requests. The provisioner has also recently been extended to support Lancium, a green computing cloud provider with a Kubernetes-like proprietary interface. The provisioner logic was intentionally kept very simple, which made this extension a low-cost project.
Both PRP and Lancium resources have been provisioned exclusively using this mechanism for almost a year with great results.
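
The demand-driven idea described above can be illustrated with a minimal sketch. This is not the actual provisioner code: the names `Job`, `PodShape`, `matching_jobs`, and `pods_to_request` are hypothetical, and the sketch assumes one job per worker pod, which is a simplification. It only shows how the matchmaking step (which idle jobs could run on this resource) and the successive-request step (how many more pods to ask for in this cycle) might fit together.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """An idle job waiting in the scheduler queue (hypothetical model)."""
    cpus: int
    memory_mb: int


@dataclass(frozen=True)
class PodShape:
    """Resources offered by one worker pod (hypothetical model)."""
    cpus: int
    memory_mb: int


def matching_jobs(idle_jobs, shape):
    """Matchmaking: keep only the idle jobs that would fit in one pod."""
    return [
        job for job in idle_jobs
        if job.cpus <= shape.cpus and job.memory_mb <= shape.memory_mb
    ]


def pods_to_request(idle_jobs, shape, pending_pods, running_pods, max_pods):
    """Return how many additional pods to request in this cycle.

    Demand is the number of matching idle jobs (assumption: one job per pod).
    Pods already requested but not yet running are subtracted so that
    repeated cycles do not over-provision, and the total number of pods
    is capped at max_pods.
    """
    demand = len(matching_jobs(idle_jobs, shape))
    unmet = max(0, demand - pending_pods)
    headroom = max(0, max_pods - pending_pods - running_pods)
    return min(unmet, headroom)
```

In a real deployment the list of idle jobs would come from the HTCondor queue and the pod counts from the Kubernetes (or Kubernetes-like) interface; both are left out here so the sketch stays self-contained. When no matching jobs are idle, the function returns 0, so no new resources are requested.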

Consider for long presentation: No
