May 8 – 12, 2023
Norfolk Waterside Marriott
US/Eastern timezone

Machine learning based compression for scientific data

May 8, 2023, 2:15 PM
Hampton Roads VII (Norfolk Waterside Marriott)

235 East Main Street Norfolk, VA 23510
Oral
Track 9 - Artificial Intelligence and Machine Learning


Gallén, Axel (Lund University (SE))
Ekman, Alexander (Lund University (SE))


A common problem across very different fields of research and industry is the ever-increasing need for data storage. As experiments record more complex data at higher rates, data volumes are quickly outgrowing storage capabilities. The issue is especially prominent in LHC experiments such as ATLAS, where in five years the resources needed are expected to be many times larger than the storage available (assuming a flat budget model and current technology trends) [1]. Since the data formats in use are already highly compressed, storage constraints could require more drastic measures such as lossy compression, in which some data accuracy is lost during the compression process.

In our work, which follows from a number of undergraduate projects [2,3,4,5,6,7], we have developed an interdisciplinary open-source tool for machine-learning-based lossy compression. The tool uses an autoencoder neural network, which is trained to compress and decompress data by exploiting correlations between the variables in the dataset. The process is lossy, meaning that the original data values and distributions cannot be reconstructed exactly. However, for variables and observables where the precision loss is tolerable, the high compression ratio allows more data to be stored, yielding greater statistical power.
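
The idea of compressing through correlations between variables can be illustrated with a minimal sketch. This is not the tool described above. It is a toy linear autoencoder written in plain NumPy, trained on synthetic data, and every name in it (`compress`, `decompress`, `n_latent`, the data itself) is a hypothetical chosen for illustration. A real autoencoder would typically add nonlinear layers and a deep-learning framework.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a dataset (an assumption, not real experiment data):
# 4 variables that are correlated because they derive from 2 hidden factors.
n_events, n_vars, n_latent = 2000, 4, 2
factors = rng.normal(size=(n_events, n_latent))
mixing = rng.normal(size=(n_latent, n_vars))
data = factors @ mixing + 0.01 * rng.normal(size=(n_events, n_vars))

# Standardize each variable so that all of them contribute equally to the loss.
mean, std = data.mean(axis=0), data.std(axis=0)
x = (data - mean) / std

# Encoder and decoder weights of a linear autoencoder (no activation functions).
w_enc = 0.1 * rng.normal(size=(n_vars, n_latent))
w_dec = 0.1 * rng.normal(size=(n_latent, n_vars))


def compress(values):
    """Map each event from n_vars values to n_latent values."""
    return ((values - mean) / std) @ w_enc


def decompress(latent):
    """Map latent values back to an approximation of the original variables."""
    return (latent @ w_dec) * std + mean


def reconstruction_loss():
    """Mean squared error between standardized inputs and their reconstruction."""
    return float(np.mean((x @ w_enc @ w_dec - x) ** 2))


initial_loss = reconstruction_loss()

# Plain full-batch gradient descent on the squared reconstruction error.
learning_rate = 0.05
for _ in range(3000):
    z = x @ w_enc                      # compressed representation
    err = (z @ w_dec - x) / n_events   # reconstruction error, averaged over events
    grad_dec = z.T @ err
    grad_enc = x.T @ (err @ w_dec.T)
    w_dec -= learning_rate * grad_dec
    w_enc -= learning_rate * grad_enc

final_loss = reconstruction_loss()

compressed = compress(data)            # this is what would be stored
restored = decompress(compressed)      # lossy approximation of the original
compression_ratio = n_vars / n_latent

print(f"compression ratio: {compression_ratio}")
print(f"loss before training: {initial_loss:.4f}, after: {final_loss:.4f}")
```

In this sketch only `compressed` (plus the small decoder weights and normalization constants) would need to be kept, and the difference between `restored` and `data` is the precision lost in exchange for the smaller size.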

[1] -
[2] -
[3] -
[4] -
[5] -
[6] -
[7] -

Consider for long presentation: No

Primary authors

Gallén, Axel (Lund University (SE))
Ekman, Alexander (Lund University (SE))


Jawahar, Pratik (University of Manchester)
Doglioni, Caterina
