



# Simulation and Modeling Topic on EIC Data Streaming/Computing Infrastructure

Kuan-Chieh Hsu

Computing and Data Sciences Directorate, BNL, 10/8/2025



#### **Outline**

- EIC as a modeling target
- Toolbox explained: modeling as a tool
- The current modeling design
- Preliminary modeling results
- Conclusion and plans



## EIC as the Modeling Target

- Echelon-0 and 1 computing infrastructure
- Focus on communication and computation aspects of EIC
  - Communication example: transmission flow from the Frame Buffer to the Echelon 1 sites
  - Computation example: composing TFs into an STF at the Frame Buffer node





#### **Modeling Toolbox Overview**

More accurate



|                         | High level                                                        | Medium level                                                             | Low level                                                                                                       |
|-------------------------|-------------------------------------------------------------------|--------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------|
| Computation             | # operations, # memory accesses<br>Peak compute/memory throughput | Algorithm, profiling info (# cache accesses, # FP multiplications, etc.) | Program source code and execution trace<br>Detailed hardware configuration (cache size, # function units, etc.) |
| Communication (network) | Data size<br>End-to-end bandwidth                                 | Stream size, arrival time<br>Topology, link bandwidth                    | Packet trace<br>Detailed switch/link configuration, congestion protocol                                         |

- Detail level selection depends on:
  - modeling purpose
  - specification/parameter availability
  - desired fidelity level
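
As a rough illustration of the high-level column, the sketch below estimates computation time from operation and memory-access counts against peak throughputs, and communication time from data size against end-to-end bandwidth. Taking the maximum of the compute and memory terms assumes the two overlap perfectly, which is a simplification; the function names and all numbers are illustrative assumptions, not EIC measurements.

```python
def compute_time_s(num_ops, num_mem_accesses, peak_ops_per_s, peak_mem_accesses_per_s):
    """High-level computation estimate: limited by the slower of compute and memory.

    Assumes compute and memory traffic overlap perfectly (a roofline-style bound).
    """
    return max(num_ops / peak_ops_per_s, num_mem_accesses / peak_mem_accesses_per_s)


def transfer_time_s(data_size_bytes, bandwidth_bytes_per_s):
    """High-level communication estimate: data size over end-to-end bandwidth."""
    return data_size_bytes / bandwidth_bytes_per_s


# Illustrative inputs only.
t_comp = compute_time_s(
    num_ops=2e9, num_mem_accesses=5e8,
    peak_ops_per_s=1e12, peak_mem_accesses_per_s=1e11,
)
t_comm = transfer_time_s(data_size_bytes=1e9, bandwidth_bytes_per_s=1e10)
print(f"compute: {t_comp:.4f} s, transfer: {t_comm:.4f} s")
```

In this example the memory term dominates the compute estimate, which is the kind of insight a high-level model can give before any detailed hardware configuration is known.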

## Artificial Intelligence (AI) techniques facilitate performance modeling that guides future computer system design.

- Li, Lingda, Thomas Flynn, and Adolfy Hoisie. "Learning Generalizable Program and Architecture Representations for Performance Modeling." *SC24: International Conference for High Performance Computing, Networking, Storage and Analysis*. IEEE, 2024.
- Li, Lingda, et al. "SimNet: Accurate and high-performance computer architecture simulation using deep learning." *Proceedings of the ACM on Measurement and Analysis of Computing Systems* 6.2 (2022): 1-24.
- Barker, K. J., et al. "Using performance modeling to design large-scale systems." *Computer* 42.11 (2009): 42-49.



### **Modeling Usage for EIC**

- Design Space Exploration (DSE) on E0 and E1 with modeling
  - Models can predict a wide range of performance metrics and their associated uncertainties.
  - DSE can find the optimal design with given requirements.
    - Ex: What is the optimal # of switches/links?
    - Ex: What are the smallest possible buffer sizes for the frame buffer and buffer box?
    - Ex: Balance the CPU-to-storage ratio while considering the detailed component composition.
  - DSE can guide future software optimization
    - Ex: Assess feasibility and benefits of code migration between CPU and GPU.
- Address key research questions.
  - Ex: Finding the maximal streaming bandwidth across different data patterns under specific network configurations. (beyond upper/lower bound estimation)
  - Ex: What network configuration enables streaming X% of data (TFs) efficiently?
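
A design space exploration of this kind can be sketched as a brute-force sweep over candidate designs, keeping only those a model deems feasible and picking the cheapest. The sketch below is a toy under loud assumptions: the feasibility model (buffer must absorb the excess of input rate over aggregate link bandwidth during a burst), the cost function, and every number are hypothetical and not taken from the EIC design.

```python
from itertools import product


def required_buffer_bytes(input_rate, num_links, link_bw, burst_s):
    """Bytes that pile up during a burst when input exceeds the drain rate.

    Hypothetical model: links are perfectly load balanced.
    """
    drain_rate = num_links * link_bw
    return max(0.0, (input_rate - drain_rate) * burst_s)


def explore(input_rate, link_bw, burst_s, link_options, buffer_options, cost):
    """Return the cheapest (num_links, buffer_bytes) pair that is feasible."""
    best = None
    for n, buf in product(link_options, buffer_options):
        if buf < required_buffer_bytes(input_rate, n, link_bw, burst_s):
            continue  # buffer would overflow: infeasible design point
        if best is None or cost(n, buf) < cost(*best):
            best = (n, buf)
    return best


# Illustrative sweep: 100 GB/s input, 40 GB/s links, 2 s bursts.
best = explore(
    input_rate=100e9, link_bw=40e9, burst_s=2.0,
    link_options=[1, 2, 3], buffer_options=[50e9, 100e9, 200e9],
    cost=lambda n, buf: 10.0 * n + 1e-10 * buf,  # hypothetical cost units
)
print(best)
```

With a real performance model in place of `required_buffer_bytes`, the same loop structure would answer questions such as the minimum number of links or the smallest workable buffer size.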



## The Current Event-based Modeling



Echelon 0 computing/networking (SDCC enclave)

- Memory model: TFs are memory objects only.
- Storage model: STFs will reside in storage as files.
- Network modules: Each network module is configured with its own parameters.
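
To make the event-based approach concrete, here is a minimal discrete-event loop built on a priority queue. It is a sketch under stated assumptions, not the simulator used for the results in these slides: a single link with fixed latency, no bandwidth contention, and an STF emitted after every `tfs_per_stf` received TFs plus a fixed build delay. All names are hypothetical.

```python
import heapq


def simulate(tf_send_times_s, link_latency_s, tfs_per_stf, stf_build_delay_s):
    """Process events in time order and return the times at which STFs are ready."""
    # Each event is (time, sequence number, kind); the heap orders by time first.
    events = [(t, i, "tf_sent") for i, t in enumerate(tf_send_times_s)]
    heapq.heapify(events)
    received = 0
    stf_ready_times = []
    while events:
        now, seq, kind = heapq.heappop(events)
        if kind == "tf_sent":
            # The TF arrives after the link latency.
            heapq.heappush(events, (now + link_latency_s, seq, "tf_received"))
        elif kind == "tf_received":
            received += 1
            if received % tfs_per_stf == 0:
                # Enough TFs collected: schedule completion of the STF build.
                heapq.heappush(events, (now + stf_build_delay_s, seq, "stf_built"))
        else:  # "stf_built"
            stf_ready_times.append(now)
    return stf_ready_times


stf_times = simulate([0.0, 0.1, 0.2, 0.3], link_latency_s=0.05,
                     tfs_per_stf=2, stf_build_delay_s=0.2)
print(stf_times)
```

Memory, storage, and per-module network parameters would enter such a loop as additional event kinds and delays.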



## **Example Configuration**

#### Network

- Frame Builder to Buffer Box: 1.25 TB/s, latency: 0.1 ms
- Buffer Box to BNL: 50 MB/s, latency: 0.1 ms
- Buffer Box to JLAB: 50 MB/s, latency: 11 ms

#### STF

- 1000 TFs per STF
- STF construction delay: 0.2 s
- Buffer Box to BNL: latency: 50 ms
- Buffer Box to JLAB: latency: 200 ms

#### Frame builder

- Memory model: 1 TB capacity
- Storage model: 100 TB capacity, read/write bandwidth: 5 GB/s, read/write latency: 1 ms

#### Buffer Box

- Memory model: 1 TB capacity
- Storage model: 100 TB capacity, read/write bandwidth: 10 GB/s, read/write latency: 1 ms
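
One possible way to encode this configuration is a nested dictionary, paired with a simple latency-plus-serialization estimate. The values below are copied from the lists above; the key names, the decimal unit convention (1 TB = 1e12 B), and the TF size are assumptions made for illustration, since the configuration does not specify a TF size.

```python
# Values from the example configuration; decimal units assumed.
CONFIG = {
    "network": {
        "frame_builder_to_buffer_box": {"bandwidth_Bps": 1.25e12, "latency_s": 0.1e-3},
        "buffer_box_to_bnl": {"bandwidth_Bps": 50e6, "latency_s": 0.1e-3},
        "buffer_box_to_jlab": {"bandwidth_Bps": 50e6, "latency_s": 11e-3},
    },
    "stf": {
        "tfs_per_stf": 1000,
        "construction_delay_s": 0.2,
        "latency_s": {"buffer_box_to_bnl": 50e-3, "buffer_box_to_jlab": 200e-3},
    },
    "frame_builder": {
        "memory_capacity_B": 1e12,
        "storage": {"capacity_B": 100e12, "bandwidth_Bps": 5e9, "latency_s": 1e-3},
    },
    "buffer_box": {
        "memory_capacity_B": 1e12,
        "storage": {"capacity_B": 100e12, "bandwidth_Bps": 10e9, "latency_s": 1e-3},
    },
}


def access_time_s(channel, size_bytes):
    """Latency plus serialization time for a link or a storage device."""
    return channel["latency_s"] + size_bytes / channel["bandwidth_Bps"]


TF_SIZE_B = 1e6  # hypothetical TF size, not part of the configuration
STF_SIZE_B = TF_SIZE_B * CONFIG["stf"]["tfs_per_stf"]  # assumes no STF overhead

t_jlab = access_time_s(CONFIG["network"]["buffer_box_to_jlab"], TF_SIZE_B)
t_disk = access_time_s(CONFIG["buffer_box"]["storage"], STF_SIZE_B)
print(f"TF to JLAB: {t_jlab:.4f} s, STF write at Buffer Box: {t_disk:.4f} s")
```

Keeping the parameters in one structure makes it straightforward to sweep a single value, such as a link bandwidth, while holding the rest fixed.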



The current modeling design is at the high level.

### **Example Input Pattern**





- (grey) The time deltas between consecutive TFs form an arithmetic sequence.
- (red) The actual timestamps, obtained by adding Poisson-distributed shifts.
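
The input pattern described above could be generated along the following lines. This is a sketch under assumptions: the first delta, the delta increment, the Poisson mean `lam`, and the time unit applied to each Poisson draw (`shift_unit_s`) are all hypothetical parameters, since the slides do not state them. The Poisson sampler uses Knuth's multiplication method, which is adequate for small means.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (suitable for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def tf_timestamps(n, first_delta_s, delta_step_s, shift_unit_s, lam, seed=0):
    """Return (nominal, actual) TF timestamps.

    Nominal: consecutive deltas form an arithmetic sequence.
    Actual: nominal plus a Poisson-distributed shift scaled by shift_unit_s.
    """
    rng = random.Random(seed)
    nominal, actual = [], []
    t = 0.0
    for i in range(n):
        t += first_delta_s + i * delta_step_s
        nominal.append(t)
        actual.append(t + poisson_sample(rng, lam) * shift_unit_s)
    return nominal, actual


nominal, actual = tf_timestamps(n=5, first_delta_s=1e-3, delta_step_s=1e-4,
                                shift_unit_s=1e-5, lam=3.0)
print(nominal)
print(actual)
```

Because the shifts are random, the shifted timestamps are not guaranteed to stay in order; a simulator consuming them may need to sort them first.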

#### **Simulation Results**





Arrival Times at BNL (TF & STF)

#### **Conclusion and Plans**

- Modeling as a tool enables design space exploration for EIC.
- Preliminary results of the current EIC modeling.

- Plan: explore and propose networking and computing architectural designs that deliver high performance, resilience, and optimization tailored to the specific requirements of EIC.



## Backup slides



## **Latency Summary**

| data_type | destination | count   | mean_s | p50_s  | p90_s  | p99_s  |
|-----------|-------------|---------|--------|--------|--------|--------|
| STF       | BNL         | 100     | 1.3027 | 1.3120 | 1.3145 | 1.3169 |
| STF       | JLAB        | 100     | 1.4527 | 1.4620 | 1.4645 | 1.4669 |
| TF        | BNL         | 100,000 | 0.3152 | 0.3065 | 0.5096 | 0.6011 |
| TF        | JLAB        | 100,000 | 0.3153 | 0.3065 | 0.5096 | 0.6011 |
| STF       | ALL         | 200     | 1.3777 | 1.3560 | 1.4638 | 1.4657 |
| TF        | ALL         | 200,000 | 0.3153 | 0.3065 | 0.5096 | 0.6011 |
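
A summary with these columns can be computed from raw per-item latencies as sketched below. The percentile definition used for the table is not stated in the slides; this example assumes the nearest-rank method, and the input data are synthetic.

```python
import math
from statistics import mean


def percentile(sorted_values, q):
    """Nearest-rank percentile of an ascending list, for q in (0, 100]."""
    rank = max(1, math.ceil(q * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def summarize(latencies_s):
    """Return count, mean, and p50/p90/p99 for a list of latencies in seconds."""
    ordered = sorted(latencies_s)
    return {
        "count": len(ordered),
        "mean_s": mean(ordered),
        "p50_s": percentile(ordered, 50),
        "p90_s": percentile(ordered, 90),
        "p99_s": percentile(ordered, 99),
    }


# Synthetic latencies: 0.01 s, 0.02 s, ..., 1.00 s.
summary = summarize([i / 100 for i in range(1, 101)])
print(summary)
```

Grouping the raw records by `data_type` and `destination` before calling `summarize` would yield one row per group, as in the table.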



#### **Event Prediction in Motion**





# EIC Streaming DAQ/Computing Architecture

#### **Collider Characteristics**

- 1260 bunches arriving at 98.5 MHz (10.15 ns bunch separation)
- 1.015 µs abort gaps (100 bunches)
- $\sqrt{s} \Rightarrow 20\text{–}141 \text{ GeV}$
- $\mathcal{L}_{max} \Rightarrow 10^{34} \text{ cm}^{-2} \text{ s}^{-1}$
- Electron, proton, and light nuclei beams can be polarized
- Each bunch can have different polarization states
  - DAQ must tag data to specific bunch crossings
  - Need to track luminosity for each bunch crossing
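
As a quick consistency check on the timing figures above, the bunch separation follows from the bunch frequency, and the abort gap from the number of bunch slots it spans:

$$
\Delta t_{\text{bunch}} = \frac{1}{98.5\ \text{MHz}} \approx 10.15\ \text{ns},
\qquad
t_{\text{gap}} = 100 \times 10.15\ \text{ns} = 1015\ \text{ns} = 1.015\ \mu\text{s}
$$

Both values agree with the numbers quoted in the list.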

#### **Physics Performance**

- Maximum DIS rate ~500 kHz
- Large number of channels
- Low occupancy



# Bandwidth Usage





## Memory/Storage Usage



