The original high-level trigger (HLT) framework used in the Belle II experiment was previously upgraded by replacing the old IPC-based ring buffer with ZeroMQ data transport to overcome an unexpected IPC locking problem. The new framework has been running stably in beam runs so far, but it cannot recover from a processing fault without stopping ongoing data taking. In addition, the compatibility with the offline framework (basf2) that the original framework maintained was lost.
To solve these problems, an improved core processing framework based on basf2 has been developed. It runs on each worker server, while the existing ZeroMQ data transport between the servers is kept unchanged. The new core framework implements lock-free 1-to-N and N-to-1 data transport using ZeroMQ over IPC sockets, so that it keeps 100% compatibility with the original ring-buffer-based offline framework. When a processing fault occurs, the event being processed at that moment is salvaged from the input buffer and sent directly to the output using ZeroMQ broadcast. The terminated process is automatically restarted without stopping data taking.
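The salvage-and-restart idea can be illustrated with a small bookkeeping sketch. This is not the HLT implementation: it uses no ZeroMQ and no real worker processes, and every name in it (`Distributor`, `dispatch`, `on_worker_death`, `process`, `run`) is hypothetical. It only assumes that the distributing side remembers which event each worker currently holds, so that an event caught in a fault can be forwarded unprocessed to the output while the worker is restarted.

```python
class Distributor:
    """Toy 1-to-N distributor that remembers which event each worker holds.

    Hypothetical illustration only: plain Python objects stand in for the
    input buffer, the worker processes, and the N-to-1 output stream.
    """

    def __init__(self, n_workers):
        # worker id -> event currently being processed (None when idle)
        self.in_flight = {w: None for w in range(n_workers)}
        self.output = []   # stands in for the collected output stream
        self.restarts = 0  # number of simulated worker restarts

    def dispatch(self, worker, event):
        # Keep a copy of the event until the worker reports a result.
        self.in_flight[worker] = event

    def on_result(self, worker, result):
        # Normal path: forward the result and release the stored copy.
        self.output.append(result)
        self.in_flight[worker] = None

    def on_worker_death(self, worker):
        # Fault path: salvage the stored event, send it to the output
        # unprocessed, and count a restart of the worker.
        event = self.in_flight[worker]
        if event is not None:
            self.output.append(("salvaged", event))
            self.in_flight[worker] = None
        self.restarts += 1


def process(event):
    # Hypothetical worker function; event 3 simulates a processing fault.
    if event == 3:
        raise RuntimeError("simulated processing fault")
    return ("processed", event)


def run(events, n_workers=2):
    distributor = Distributor(n_workers)
    for i, event in enumerate(events):
        worker = i % n_workers  # round-robin assignment
        distributor.dispatch(worker, event)
        try:
            distributor.on_result(worker, process(event))
        except RuntimeError:
            distributor.on_worker_death(worker)
    return distributor


if __name__ == "__main__":
    d = run(range(6))
    print(d.output)
    print("restarts:", d.restarts)
```

In this sketch no event is dropped: the faulty event reaches the output tagged as salvaged, and the loop continues with the remaining events, which mirrors the stated goal of recovering without stopping data taking.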
This contribution describes the details of the improved Belle II HLT framework, together with the results of performance tests in the real Belle II DAQ data flow.
Consider for long presentation: No