# Streaming Readout Development for CODA @ JLab

Streaming Readout X – May 17-19, 2022 Hybrid Meeting

**David Abbott** 

Ben Raydo

**FEDAQ Group** 

Jefferson Lab – Physics Division



#### **Data Acquisition at JLab**

- At JLab we have 4 Experimental Halls, all running with different detectors and physics priorities.
  - All of them are placing increased demands on the DAQ.
- Experiments are increasingly reliant on custom electronics to interface detectors and digitize signals.
  - ASICs and FPGAs are becoming the norm (and the future) for the front-end.
  - But older hardware is still relevant and useful (particularly for tight budgets).
- Our goal is to support both the traditional Triggered model and the Streaming model within one integrated DAQ framework.
  - Leverage existing hardware to implement streaming
  - Add support for new electronics
  - Try to make it as seamless and user friendly as possible



### The CODA Data Acquisition Toolkit



## VXS Standard (VITA 41)



- JLab standardized on this technology for the 12GeV Upgrade
  - Originally used for the L1 trigger data path
- Dual Star switched serial backplane (along with original VME)
- Up to 20 Gb/s (4 lanes) from each Payload to the 2 Switch slots (A, B)
- Up to 18 Payload slots are available
- Easy distribution of Trigger, Sync and low jitter clock to all modules in the crate.



#### VXS Trigger Processor (VTP)

- Relieve the ROC of all of the "Readout" tasks and implement them in the FPGAs.
- Triggered or Streaming readout from all payload modules in parallel
- This requires the payload modules to have some intelligence/programmability and serial link capability (e.g. FPGA-based).
- The Software ROC is now primarily responsible only for configuring, controlling and monitoring the VTP-based DAQ.



#### JLAB – VTP Board

- Linux OS on the Zynq-7030 SoC (2-core ARM 7L, 1GB DDR3), which runs the CODA ROC
- 10/40Gbps Ethernet option
- Xilinx Virtex 7 FPGA
- Serial lanes from both the VXS backplane and the front panel
- 4GB DDR3 RAM







#### JLAB Clock and Trigger Distribution System



### **Timing System Components**





### JLAB FADC – Streaming mode

A 250 MHz FADC generates a 12-bit sample every 4 ns. That's 3 Gb/s for one channel, and 48 Gb/s for 16 channels. Currently, we identify a threshold crossing (hit), integrate the charge over a region of interest (ROI), and send only a sum and timestamp for each hit.
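The raw-rate arithmetic above can be checked in a few lines. This is only a worked calculation; the constant and function names are illustrative and not part of any CODA or FADC library.

```python
# Raw (unsuppressed) FADC data rate, using the figures quoted on this slide.
SAMPLE_BITS = 12              # bits per sample
SAMPLE_RATE_HZ = 250_000_000  # 250 MHz, i.e. one sample every 4 ns
CHANNELS_PER_BOARD = 16


def raw_rate_gbps(channels: int) -> float:
    """Raw sample bandwidth in Gb/s for the given number of channels."""
    return SAMPLE_BITS * SAMPLE_RATE_HZ * channels / 1e9


per_channel = raw_rate_gbps(1)
per_board = raw_rate_gbps(CHANNELS_PER_BOARD)
print(f"one channel: {per_channel} Gb/s, one board: {per_board} Gb/s")
```

This is the rate that hit finding and charge integration have to reduce before the data leave the board.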

Available bandwidth will allow for 1 hit every 32 ns from all channels. A data frame (Time Slice) containing all available hits is generated in the VTP every 65µs.
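As a rough sketch of what these numbers imply, the calculation below takes the frame length as 65536 ns (the current value quoted elsewhere in this talk) and assumes the 32 ns limit applies to each channel individually. That per-channel reading is an assumption; the variable names are illustrative.

```python
# Frame (Time Slice) bookkeeping from the quoted streaming parameters.
CLOCK_NS = 4               # one FADC sample every 4 ns
FRAME_NS = 65536           # current frame length ("every 65 us")
MIN_HIT_SPACING_NS = 32    # assumed per-channel limit: 1 hit every 32 ns

frames_per_second = 1e9 / FRAME_NS
samples_per_frame = FRAME_NS // CLOCK_NS
max_hits_per_channel_per_frame = FRAME_NS // MIN_HIT_SPACING_NS

print(f"{frames_per_second:.0f} frames/s")
print(f"{samples_per_frame} samples per channel per frame")
print(f"at most {max_hits_per_channel_per_frame} hits per channel per frame")
```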





The next revision of the firmware will have an option to stream full ROI waveforms, but in that mode hits may be dropped because of bandwidth limitations.

The FADC can still simultaneously operate in triggered mode with an 8µs pipeline and 2µs readout window.



### **FADCs - Triggered vs Streaming**



PL: Programmed Lookback; PTW: Time window

Data we get on a trigger:

- FADC waveform values for the ROI
- Threshold Sample # (hit time)
- Trigger absolute time stamp
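A minimal software sketch of the triggered readout window: samples go into a fixed-depth pipeline, and on a trigger the window that starts PL before the trigger and lasts PTW is read back. The class and method names are hypothetical, the real logic lives in FADC firmware, and the 8µs depth and 2µs window are the values quoted in this talk.

```python
from collections import deque

SAMPLE_NS = 4  # 250 MHz sampling


class Pipeline:
    """Illustrative ring buffer standing in for the FADC sample pipeline."""

    def __init__(self, depth_ns: int = 8000):
        self.buf = deque(maxlen=depth_ns // SAMPLE_NS)
        self.n_written = 0  # total samples pushed so far

    def push(self, sample: int) -> None:
        self.buf.append(sample)
        self.n_written += 1

    def read_window(self, trigger_sample: int, lookback_ns: int, window_ns: int):
        """Return the samples in [trigger - PL, trigger - PL + PTW)."""
        start = trigger_sample - lookback_ns // SAMPLE_NS
        stop = start + window_ns // SAMPLE_NS
        oldest = self.n_written - len(self.buf)  # index of oldest kept sample
        if start < oldest or stop > self.n_written:
            raise ValueError("requested window is outside the pipeline")
        return list(self.buf)[start - oldest:stop - oldest]


# Usage: fill with a ramp so each sample value equals its index.
pipe = Pipeline()
for n in range(3000):
    pipe.push(n)
window = pipe.read_window(trigger_sample=2900, lookback_ns=2000, window_ns=2000)
print(len(window), window[0], window[-1])
```

A trigger that arrives later than the pipeline depth allows raises an error here, which mirrors the constraint that trigger latency must fit inside the pipeline.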





1 Frame = N Clocks (up to 16 bits, currently 65536 ns)

#### Data we get for a Frame:

- Pedestal subtracted sums over an ROI for every hit over threshold
- Threshold sample # fine time stamp for each hit
- Frame # and absolute time stamp for the frame
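The per-frame data listed above can be illustrated with a small hit finder. This is a sketch only, not the firmware algorithm: the `Hit` type, `find_hits`, the ROI extents, and the rule that a hit's absolute time is the frame start plus the threshold sample number times 4 ns are all assumptions made for illustration. Hits that straddle a frame boundary are ignored.

```python
from dataclasses import dataclass

SAMPLE_NS = 4      # 250 MHz sampling
FRAME_NS = 65536   # current frame length


@dataclass
class Hit:
    frame: int    # frame number
    sample: int   # threshold-crossing sample within the frame (fine time)
    charge: int   # pedestal-subtracted sum over the ROI

    @property
    def time_ns(self) -> int:
        """Absolute hit time, assuming frames are contiguous from t = 0."""
        return self.frame * FRAME_NS + self.sample * SAMPLE_NS


def find_hits(samples, frame, pedestal, threshold, roi_before=2, roi_after=6):
    """Find upward threshold crossings and integrate an ROI around each."""
    hits = []
    i = 1
    while i < len(samples):
        above = samples[i] - pedestal > threshold
        was_below = samples[i - 1] - pedestal <= threshold
        if above and was_below:
            lo = max(0, i - roi_before)
            hi = min(len(samples), i + roi_after)
            charge = sum(s - pedestal for s in samples[lo:hi])
            hits.append(Hit(frame, i, charge))
            i = hi  # skip the rest of this ROI
        else:
            i += 1
    return hits


# Usage: one pulse on a flat pedestal of 100 ADC counts.
trace = [100] * 11 + [150, 300, 200, 120] + [100] * 11
hits = find_hits(trace, frame=3, pedestal=100, threshold=10)
print(hits, hits[0].time_ns)
```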



#### **Stream Aggregation – Data formats**




#### **Beam Tests**

- Recent beam tests with a calorimeter prototype at DESY (*Thanks to Doug Hasell for coordinating this opportunity*)
- 5x5 PbWO4 Crystal Array (2 cm<sup>2</sup> face) with 2-5GeV electron test beam
- JLab 250 MHz FADC boards
  - Triggered data are waveforms read out over VME bus.
  - Stream data are integrated sums and times of all hits over a threshold in the calorimeter regardless of the trigger status.






#### Simple Hybrid CODA System



#### **Beam Tests**

Calorimeter spectra – Streaming Data (Electron beam centered on the central crystal)

#### **Time Slice Frame containing a Trigger**



#### Beam Tests cont...

- Rates were relatively low for these tests
- The electron beam current varied depending on whether the PETRA synchrotron was being filled.
- Note the respective data rates on Run Control.
  - Streaming frame and data rates are relatively constant (15.2kHz, ~1.3MB/s)
  - Triggered rates rise and fall as the electron beam comes and goes (reading out waveforms, even in triggered mode, generates a lot of data)
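A quick cross-check of the streaming numbers quoted above, assuming 1 MB = 10^6 bytes and treating both figures as approximate. The variable names are illustrative.

```python
# Average streamed frame size implied by the Run Control figures.
frame_rate_hz = 15.2e3          # ~15.2 kHz streaming frame rate
data_rate_bytes_per_s = 1.3e6   # ~1.3 MB/s streaming data rate

avg_frame_bytes = data_rate_bytes_per_s / frame_rate_hz
frame_period_us = 1e6 / frame_rate_hz

print(f"average frame size: {avg_frame_bytes:.1f} bytes")
print(f"frame period: {frame_period_us:.1f} us")
```

The average comes out only slightly above the 72-byte minimum frame quoted in the zero suppression discussion, which fits the low rates in these tests: most frames carried little or no hit data.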




### "Zero" Suppression

- In the streaming environment we have to manage two extremes
  - Empty time frames
  - Too much data for available bandwidth
- For the current Streaming ROC format a minimum of 72 bytes/frame is sent, so for 65µs frames that comes to an "empty" data rate of ~1.1MB/s.
- Payloads with no data are omitted from the frame; the AIS entries establish that they were empty.
- EVIO Header (32 bytes) is just for transport to the Aggregator then stripped.
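The empty-frame rate above follows directly from the frame size and frame period. The sketch below uses the 65536 ns frame length and 1 MB = 10^6 bytes. Reading the 72-byte minimum as the 32-byte EVIO header plus a 40-byte minimum ROC frame is an interpretation of the figures in this talk, not a format specification.

```python
FRAME_NS = 65536         # current frame length
EVIO_HEADER_BYTES = 32   # transport header, stripped at the Aggregator
MIN_FRAME_BYTES = 72     # minimum bytes sent per frame by the ROC


def empty_rate_mb_per_s(frame_bytes: int, frame_ns: int = FRAME_NS) -> float:
    """Data rate in MB/s when every frame carries only frame_bytes."""
    frames_per_second = 1e9 / frame_ns
    return frame_bytes * frames_per_second / 1e6


on_the_wire = empty_rate_mb_per_s(MIN_FRAME_BYTES)
after_strip = empty_rate_mb_per_s(MIN_FRAME_BYTES - EVIO_HEADER_BYTES)

print(f"ROC to Aggregator: {on_the_wire:.2f} MB/s")
print(f"after header strip (assumed 40 bytes/frame): {after_strip:.2f} MB/s")
```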

#### Empty Frames for the Beam Tests

| TET (threshold over pedestal) | Empty Frames | Frames with Triggers |
|-------------------------------|--------------|----------------------|
| 10                            | 68.9%        | 0.084%               |
| 50                            | 97.0%        | 0.64%                |

Frame layout (figure labels): EVIO Header; Length (words); ROC ID DT SS; TSS; AIS; PP 1. Minimum 40 bytes for up to 2 payloads; total 68 bytes for 16 payloads.



### **Congestion Management**

- Too much data is handled both locally and globally
- Globally, "Sync" is used to start and stop all the streams at their source (FADCs)
  - A "Busy" generated by any ROC can feed back and inhibit all streams for all ROCs
- Locally, PPs stream hits to a VTP DRAM buffer. The Frame Builders have a frame "fifo"
- A Frame builder will only aggregate hits into a ROC frame if the fifo is not full. Otherwise the frame is dropped.
- Built ROC frames at the Zynq will always get sent (eventually).
- Up to 4 independent network streams can be defined. Each PP maps to a specific output stream.
- Limited Zynq resources only provide 2 high performance TCP connections (or 4 UDP streams)
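The local drop policy described above can be sketched in software as a bounded FIFO: a frame is built only if there is room, otherwise it is dropped, and anything already built is eventually sent. This is an illustration of the policy, not the VTP firmware; `FrameBuilder`, its methods, and the FIFO depth are hypothetical.

```python
from collections import deque


class FrameBuilder:
    """Illustrative frame builder with a bounded output FIFO."""

    def __init__(self, fifo_depth: int):
        self.fifo = deque()
        self.fifo_depth = fifo_depth
        self.dropped = 0  # frames discarded because the FIFO was full

    def build(self, frame_number: int, hits) -> bool:
        """Aggregate hits into a ROC frame unless the FIFO is full."""
        if len(self.fifo) >= self.fifo_depth:
            self.dropped += 1
            return False
        self.fifo.append((frame_number, list(hits)))
        return True

    def send_next(self):
        """Pop the oldest built frame for transmission (None if empty)."""
        return self.fifo.popleft() if self.fifo else None


# Usage: a FIFO of depth 2 with a slow consumer drops the third frame.
fb = FrameBuilder(fifo_depth=2)
results = [fb.build(n, hits=[n * 10]) for n in range(3)]
print(results, fb.dropped)
print(fb.send_next())
```

Dropping whole frames rather than individual hits keeps every frame that does arrive complete, at the cost of gaps in the frame sequence.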



#### Note:

**TCP performance**

- ~7-8 Gbps per link without frame drops

**UDP performance (9000 MTU)**

- More than 9.5 Gbps per link without frame drops
- ~50% CPU utilization for a single stream

(These tests were done with both the VTP and a server connected through a single switch)



#### Some general observations...

- The design of our Clock/Trigger/Sync/Busy distribution system is critical to the flexibility and functionality of the CODA Hybrid DAQ.
- The VXS platform works for JLAB, but it is impractical and/or too costly for small university groups. The same could be said for other solutions like the FELIX/DAM architecture for EIC.
- It seems there is a need for an affordable COTS or alternative "lightweight" solution which supports Streaming that can be made available to the community for development and test systems.

The critical component of just about any system is a System on a Chip with enough resources to support at least a few Front End electronics serial link protocols, perform 1<sup>st</sup> stage hardware stream aggregation, and present the data to the next stage in a standardized way.





#### Summary

- We have successfully started integration of Streaming support within the CODA software framework and supported hardware.
- The Hybrid DAQ system gives us a lot of flexibility to support older hardware within a Triggered system as well as newer hardware that can conform to the Streaming requirements.
- UDP data transport from the VTP is proving to be reliable and the most efficient method for getting data to backend processing.
- Upgrades to JLAB FADC firmware this Summer will allow for more streaming options including waveforms.
- The ability to take both Triggered and Streaming data simultaneously should provide useful data for Online processing to better define efficient event identification algorithms as part of a high level trigger.
- Integration of other ASIC-based front-end electronics within the CODA streaming environment still needs to be developed.

