Making the large datasets collected at the LHC accessible to the public is a considerable challenge given the complexity and volume of the data. Yet to harness the full scientific potential of the facility, it is essential to enable meaningful access to the data for the broadest possible physics community. Here we present an application, the LHCb Ntuple Wizard, which leverages the existing computing infrastructure available to the LHCb collaboration to enable third-party users to request derived data samples in the same format used in LHCb physics analyses. An intuitive user interface built with the React-JS framework allows for the discovery of available particle or decay-channel datasets through a flexible search engine, and guides the user through the request for producing Ntuples: collections of N particle or decay candidates, each candidate corresponding to a tuple cataloguing measured quantities chosen by the user. The necessary documentation and metadata are rendered in the appropriate context to guide the user through the application's two core components: dataset discovery and Ntuple configuration. In the Ntuple configuration step, decays are represented by an interactive directed acyclic graph in which the nodes depict (intermediate) particles and the edges indicate mother-daughter relationships, each graph corresponding to the configuration of a single Ntuple. Standard tools used at LHCb for saving measured or derived quantities to Ntuples can be applied to specific nodes, or collections of nodes, allowing the user to customize the information saved about the various objects used to build the physics candidate (e.g. the individual particles in a decay). Ntuples in this context are saved as simply structured ROOT files containing the catalogued quantities, requiring no external usage of the LHCb software stack.
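To illustrate the decay representation described above, the following is a minimal sketch (not the Wizard's actual data model; class and method names here are assumptions for illustration) of a decay such as B0 → (J/psi(1S) → mu+ mu-) (K*(892)0 → K+ pi-) as a directed acyclic graph, with nodes as particles and edges as mother-daughter relationships:

```python
# Illustrative sketch only: a decay chain as a directed acyclic graph.
# Nodes are particles; an edge mother -> daughter records the decay.
from collections import defaultdict


class DecayGraph:
    def __init__(self):
        # Maps a mother particle to its list of daughters.
        self.children = defaultdict(list)

    def add_decay(self, mother, daughters):
        self.children[mother].extend(daughters)

    def leaves(self, node):
        """Return the final-state particles reachable from `node`."""
        if node not in self.children:
            return [node]
        out = []
        for daughter in self.children[node]:
            out.extend(self.leaves(daughter))
        return out


g = DecayGraph()
g.add_decay("B0", ["J/psi(1S)", "K*(892)0"])
g.add_decay("J/psi(1S)", ["mu+", "mu-"])
g.add_decay("K*(892)0", ["K+", "pi-"])
print(g.leaves("B0"))  # -> ['mu+', 'mu-', 'K+', 'pi-']
```

In the Wizard, tools for saving quantities would be attached to individual nodes (e.g. only the muons) or to collections of nodes, which is what the per-node customization above enables.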
Issues of computer security and access control that arise from offering this service are addressed by keeping the configuration output of the Ntuple Wizard in a pure data-structure format (YAML), which is interpreted by internal parsers. The parsers produce the necessary Python scripts for steering the Ntuple-production job, whose output will be delivered to the CERN Open Data Portal.
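The security model can be sketched as follows. This is a purely hypothetical illustration, with the configuration shown as a Python dict standing in for the YAML file; the keys (`decay`, `tuple_name`, `branches`) and the generated API names (`NtupleJob`, `add_branch_quantity`) are assumptions, not the real LHCb interfaces. The point is that the user supplies only data, and a trusted internal parser renders the executable steering script:

```python
# Hypothetical sketch: the user-facing output is a pure data structure
# (in practice a YAML file); only a trusted internal parser turns it
# into executable Python, so no user-supplied code is ever run.
config = {
    "decay": "B0 -> (J/psi(1S) -> mu+ mu-) (K*(892)0 -> K+ pi-)",
    "tuple_name": "B0_JpsiKst",
    "branches": {
        "B0": ["mass", "momentum"],  # quantities to save per particle
        "mu+": ["pid"],
    },
}


def build_steering_script(cfg):
    """Render an (entirely illustrative) steering script from the config."""
    lines = [f'ntuple = NtupleJob("{cfg["tuple_name"]}", decay="{cfg["decay"]}")']
    for particle, quantities in cfg["branches"].items():
        for q in quantities:
            lines.append(f'ntuple.add_branch_quantity("{particle}", "{q}")')
    return "\n".join(lines)


script = build_steering_script(config)
print(script)
```

Because the configuration is inert data, it can be validated and sandboxed before any job is launched, which is the access-control property the Wizard relies on.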