May 8 – 12, 2023
Norfolk Waterside Marriott
US/Eastern timezone

Fine-Grained HEP Analysis Task Graph Optimization with Coffea and Dask

May 9, 2023, 4:30 PM
Chesapeake Meeting Room (Norfolk Waterside Marriott)

235 East Main Street Norfolk, VA 23510
Oral
Track 5 - Sustainable and Collaborative Software Engineering


Gray, Lindsey (FNAL)


The recent release of Awkward Array 2.0 significantly changes how lazy evaluation and task-graph building are handled in columnar analysis. The Dask parallel processing library now provides both pieces of functionality for Awkward Array, and this change affords new ways of optimizing columnar analyses and distributing them on clusters. In particular, it allows a task graph to be optimized all the way down to the user code, possibly obviating the “processor” pattern that Coffea has relied upon up to now. Taking full advantage of this functionality required a major retooling of Coffea for the new infrastructure, which has resulted in a more extensible and more easily maintainable codebase that depends on the dask-awkward and dask-histogram packages. We will demonstrate comparative performance benchmarks between Awkward Array 1.0-based and Awkward Array 2.0-based releases of Coffea, as well as between processor-based and fully Dask-optimized compute graphs in Awkward Array 2.0.
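
As a rough illustration of why a lazily built task graph can be optimized all the way to the user code, the toy sketch below records user operations as graph nodes, inspects the finished graph to find which input columns it actually touches, and loads only those. It is plain Python and does not use Dask, dask-awkward, or Coffea. The names `Lazy`, `read`, `apply`, `required_columns`, `compute`, and `EVENTS`, along with the column names, are hypothetical and chosen only for this example. Column pruning is used here as one plausible example of such an optimization, not as a description of how the packages implement it.

```python
# Toy illustration only: plain Python, no Dask, dask-awkward, or Coffea.
# All names below (Lazy, read, apply, required_columns, compute, EVENTS)
# are hypothetical and exist only for this sketch.


class Lazy:
    """Node in a toy task graph: a column read or a function of other nodes."""

    def __init__(self, func=None, inputs=(), column=None):
        self.func = func
        self.inputs = tuple(inputs)
        self.column = column


def read(column):
    """Record a column read without touching any data yet."""
    return Lazy(column=column)


def apply(func, *inputs):
    """Record a user-level operation on other lazy nodes."""
    return Lazy(func=func, inputs=inputs)


def required_columns(node):
    """Walk the graph back from `node` and collect the columns it depends on."""
    if node.column is not None:
        return {node.column}
    needed = set()
    for parent in node.inputs:
        needed |= required_columns(parent)
    return needed


def compute(node, source, columns_read):
    """Load only the columns the graph needs, then evaluate the graph."""
    loaded = {}
    for name in sorted(required_columns(node)):
        loaded[name] = source[name]
        columns_read.append(name)  # stand-in for real I/O bookkeeping

    def evaluate(n):
        if n.column is not None:
            return loaded[n.column]
        return n.func(*(evaluate(p) for p in n.inputs))

    return evaluate(node)


# A tiny stand-in for an event file with three jagged columns.
EVENTS = {
    "muon_pt": [[32.1, 11.5], [], [48.0]],
    "muon_eta": [[0.4, -1.2], [], [2.1]],
    "jet_pt": [[55.0], [40.2, 31.7], []],
}

# "User code": leading muon pT for events that have at least one muon.
leading_pt = apply(lambda pts: [max(ev) for ev in pts if ev], read("muon_pt"))

columns_read = []
result = compute(leading_pt, EVENTS, columns_read)
print(result)        # [32.1, 48.0]
print(columns_read)  # ['muon_pt']
```

Because the whole analysis exists as a graph before anything runs, the two unused columns are never loaded. In a processor-style workflow the framework cannot see inside the user function in this way, which is the contrast the abstract draws.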

Consider for long presentation: No