Large-scale research facilities are becoming prevalent in the modern scientific landscape. One of the primary responsibilities of these facilities is to ensure that users can process and analyze measurement data for publication. To lower the barrier to these highly complex experiments, almost all beamlines require fast feedback, with data processed and visualized online, to support decisions on experimental strategy. Recently, the advent of beamlines at fourth-generation synchrotron sources and of high-resolution, high-sample-rate detectors has pushed the demand for computing resources to the edge of current workstation capabilities. On top of this, most synchrotron light sources have shifted to prolonged remote operation because of the global pandemic, which requires remote access to the online instrument systems during operation. In addition, the vast data volumes produced by certain experiments make it difficult for users to create local copies of their data. In such cases, on-site data analysis services are necessary both during and after experiments.
Some state-of-the-art experimental techniques, such as phase-contrast tomography and ptychography, will be deployed. However, this poses the critical problem of integrating the associated algorithm development into the new computing environment used in the experimental workflow. The solution requires collaboration among user research groups, instrument scientists, and computational scientists. A unified software platform that provides an integrated working environment with generic functional modules and services is necessary to meet these requirements. On such a platform, scientists can work on their ideas, implement prototypes, and check results by following a few conventions, without dealing with technical details or with migration between different HPC environments. Thus, one vital consideration is integrating extensions into the software in a flexible and configurable way. Another challenge lies in the interactions between instrument subsystems, such as the control system, data acquisition system, computing infrastructure, data management system, and data storage system, which can be quite complicated.
In this paper, we propose a platform named Daisy for integration and automation across services and tools, which ties together existing computing infrastructure and state-of-the-art algorithms. It has a modular architecture comprising loosely coupled algorithm components that communicate over a heterogeneous in-memory data store, and it scales horizontally on Kubernetes to deliver automation at scale. Applications developed on the platform for the different scientific domains of HEPS will also be introduced.
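The idea of loosely coupled components that exchange data only through a shared store, and that are assembled from a configurable list of names, can be illustrated with a minimal Python sketch. This is not Daisy's actual API: the `REGISTRY`, the `component` decorator, the step names, and the plain dictionary standing in for the in-memory data store are all hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical registry mapping a component name to a callable.
# Each component reads from and writes to a shared store (a plain dict here,
# standing in for an in-memory data store); components never call each other.
REGISTRY: Dict[str, Callable[[dict], None]] = {}


def component(name: str):
    """Register a function as a named workflow component (illustrative only)."""
    def decorator(func: Callable[[dict], None]) -> Callable[[dict], None]:
        REGISTRY[name] = func
        return func
    return decorator


@component("load")
def load(store: dict) -> None:
    # Stand-in for reading detector data; the values are made up.
    store["raw"] = [3.0, 5.0, 4.0, 8.0]


@component("subtract_background")
def subtract_background(store: dict) -> None:
    # Use the minimum value as a crude background estimate.
    background = min(store["raw"])
    store["corrected"] = [v - background for v in store["raw"]]


@component("normalize")
def normalize(store: dict) -> None:
    # Scale the corrected values so that the peak equals 1.
    peak = max(store["corrected"])
    store["normalized"] = [v / peak for v in store["corrected"]]


def run_workflow(steps: List[str], store: Optional[dict] = None) -> dict:
    """Run the named components in order against one shared store."""
    store = {} if store is None else store
    for name in steps:
        REGISTRY[name](store)
    return store


# The workflow is just a list of names, so it could come from a config file.
result = run_workflow(["load", "subtract_background", "normalize"])
print(result["normalized"])
```

Because each step depends only on keys in the store, a new algorithm can be registered or swapped in by editing the list of step names, without changing the other components. A real deployment would presumably replace the dictionary with a networked store and run components as separate services.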
Consider for long presentation: Yes