Field Programmable Gate Arrays (FPGAs) are playing an increasingly important role in the sampling and data processing industry due to their intrinsically highly parallel architecture, low power consumption, and flexibility to execute custom algorithms. In particular, the use of FPGAs to perform machine learning inference is increasingly growing thanks to the development of high-level synthesis projects that abstract the complexity of HDL programming.
In this presentation we will describe our experience extending KServe predictors, an emerging standard for ML (Machine Learning) model inference as a service on kubernetes. This project will support a custom workflow capable of loading and serving models on-demand on top of FPGAs. A key aspect is that the proposed approach makes the firmware generation transparent, often an obstacle to a widespread FPGA adoption. We will detail how the proposed system automates both the synthesis of the HDL code and the generation of the firmware, starting from a high-level language and user-friendly machine learning libraries. The ecosystem is then completed with the adoption of a common language for sharing user models and firmwares, that is based on a dedicated Open Container Initiative artifact definition, thus leveraging all the well established practices on managing resources on a container registry.
|Consider for long presentation||No|