Speaker
Description
WLCG relies on the network as a critical part of its infrastructure and therefore must ensure effective network usage and the prompt detection and resolution of network issues, including connection failures, congestion and routing problems. The OSG Networking Area, in partnership with the WLCG Throughput working group, has created a monitoring infrastructure that gathers metrics from the global WLCG perfSONAR deployment and stores the measurements centrally in Elasticsearch.
In this presentation, we will describe the ongoing work to proactively analyze, correlate and alert on various network and infrastructure issues. We will discuss the methods and techniques applied, the systems developed, and the challenges in the measurements that make it difficult to identify problems or to attribute them to the appropriate location(s). Lastly, we will describe our future plans to incorporate AI/ML by appropriately annotating data based on the types of issues discovered.
Consider for long presentation: No