Modeling Data Workflows


The physics program of the HL-LHC (2026-2040) is not that different from the physics program of Run 2 (since 2015) in some fundamental dimensions; for example, the size and physics organization of the experiment are essentially the same. What does change is the scale of the total data volume. This allows us to use current practice as a guide for modeling and “data-driven design” decisions in ways that were not realistic before Run 1 data taking started.

The modeling area of DOMA is thus dedicated to three goals: understanding current practice, understanding how to extrapolate current practice in a meaningful way to the HL-LHC, and ultimately modeling different DOMA strategies and architectures for the HL-LHC. The aim is to identify the characteristics of a cost-effective and performant HL-LHC computing model.

Work in this area ranges from “intelligent storage” systems to “regional caching strategies” and “global replication strategies”, and includes the management of network and storage assets to support processing. It thus provides the analytical underpinning of other DOMA projects.


Report on cache usage on the WLCG and potential use cases and deployment scenarios for the US LHC facilities

Report on LHC data access patterns, data uses, and intelligent caching approaches for the HL-LHC (draft)