May 8 – 12, 2023
Norfolk Waterside Marriott
US/Eastern timezone

Triggerless data acquisition pipeline for Machine Learning based statistical anomaly detection

May 9, 2023, 3:15 PM
Marriott Ballroom V-VI (Norfolk Waterside Marriott)

235 East Main Street Norfolk, VA 23510
Oral, Track 2 - Online Computing


Migliorini, Matteo (Padova University & INFN)


The sensitivity of modern HEP experiments to New Physics (NP) is limited by the hardware-level triggers used to select data online, which bias the collected data. Deploying efficient data acquisition systems integrated with online processing pipelines is instrumental in increasing the experiments' sensitivity to anomalies or possible signals of NP. In designing such systems, a combination of heterogeneous processing elements, including FPGAs and GPUs, is foreseen to sustain the large throughput of raw data from the detectors.
In this work, we present the first implementation of an end-to-end infrastructure that continuously acquires data from an experimental setup and processes it online, looking for statistical anomalies with a Machine Learning (ML) technique. The infrastructure is deployed at the INFN Legnaro National Laboratory (LNL) and reads out data from a reduced-size version of the drift tube muon detector of the CMS experiment at CERN. The data stream is first processed by an FPGA, which clusters the signals associated with the passage of a muon through the detector and produces candidate stubs. Candidate events are then reconstructed, and all muon hits and reconstructed muon stubs are analyzed online by an algorithm deployed on a GPU to perform unbiased data exploration and statistical anomaly detection. The New Physics Learning Machine (NPLM) technique is used to evaluate the compatibility between incoming batches of experimental data and a reference sample representing the normal behavior of the data. In the specific case of the LNL test stand, the NPLM algorithm uses as its reference sample a dataset gathered in nominal detector conditions; deviations from the normal behavior, if detected, are characterized and then mapped to known sources of detector malfunction with some degree of confidence. Unexpected behaviors that might signal the presence of New Physics can be singled out if the observed discrepancy does not match any of the expected anomalies. The system currently handles only the limited throughput of the cosmic muon flux; nevertheless, all components of the readout chain are designed to scale up and eventually be employed in experiments at the LHC.
In this contribution, we describe the technical implementation of the online processing pipeline and assess the performance of its most critical components.
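
The batch-versus-reference compatibility test described above can be illustrated with a deliberately simplified stand-in. The sketch below is not the NPLM algorithm and does not reproduce the authors' implementation: it replaces the learned likelihood ratio with a binned log-likelihood-ratio statistic on a single synthetic feature, and calibrates it with toy batches resampled from the reference. All names (`llr_statistic`, `batch_p_value`), the Gaussian toy data, and the parameter values are assumptions made for illustration only.

```python
import numpy as np


def llr_statistic(ref_counts, obs_counts):
    """Binned log-likelihood-ratio statistic of a batch against a reference shape."""
    # Expected counts: reference shape normalised to the batch size.
    expected = ref_counts / ref_counts.sum() * obs_counts.sum()
    mask = obs_counts > 0  # empty bins contribute nothing to the sum
    return 2.0 * np.sum(obs_counts[mask] * np.log(obs_counts[mask] / expected[mask]))


def batch_p_value(reference, batch, n_bins=20, n_toys=500, rng=None):
    """Return (t_obs, p_value) for the compatibility of `batch` with `reference`."""
    rng = np.random.default_rng() if rng is None else rng
    # Equal-population bins taken from the reference quantiles.
    edges = np.quantile(reference, np.linspace(0.0, 1.0, n_bins + 1))

    def counts(x):
        # Clip so that values outside the reference range fall in the outer bins.
        return np.histogram(np.clip(x, edges[0], edges[-1]), bins=edges)[0]

    # A 0.5 pseudo-count keeps every expected bin content strictly positive.
    ref_counts = counts(reference) + 0.5
    t_obs = llr_statistic(ref_counts, counts(batch))

    # Null distribution of the statistic: toy batches resampled from the
    # reference itself (a bootstrap approximation, adequate for a sketch).
    toys = np.array([
        llr_statistic(ref_counts, counts(rng.choice(reference, size=len(batch))))
        for _ in range(n_toys)
    ])
    p_value = (1 + np.sum(toys >= t_obs)) / (n_toys + 1)
    return t_obs, p_value


# Synthetic stand-ins for detector data (assumption: one Gaussian feature).
rng = np.random.default_rng(12345)
reference = rng.normal(0.0, 1.0, size=20_000)  # "nominal conditions" sample
nominal_batch = rng.normal(0.0, 1.0, size=1_000)  # batch drawn from the same model
anomalous_batch = rng.normal(0.5, 1.0, size=1_000)  # batch with a shifted mean

t_nom, p_nom = batch_p_value(reference, nominal_batch, rng=rng)
t_anom, p_anom = batch_p_value(reference, anomalous_batch, rng=rng)
print(f"nominal:   t = {t_nom:.1f}, p = {p_nom:.3f}")
print(f"anomalous: t = {t_anom:.1f}, p = {p_anom:.3f}")
```

In this toy setting a small p-value flags a batch as incompatible with the reference. Characterizing the discrepancy and matching it to known malfunction patterns, as the abstract describes, would require additional steps not sketched here.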

Consider for long presentation: No

Primary authors

Grosso, Gaia (Universita e INFN, Padova (IT))
Lai, Nicolò (Padova University)
Migliorini, Matteo (Padova University & INFN)
Pazzini, Jacopo (Padova University & INFN)
Triossi, Andrea (Padova University & INFN)
Zanetti, Marco (University of Padova)
Zucchetta, Alberto (Padova INFN)
