Modeling Data Workflows
The physics program of the HL-LHC (2026-2040) does not differ much from the physics program of Run 2 (since 2015) in some fundamental dimensions; for example, the size and physics organization of the experiment are essentially the same. What does change is the scale of the total data volume. This allows us to use current practice as a guide for modeling and “data driven design” decisions in ways that were not realistic before Run 1 data taking started.
The modeling area of DOMA is thus dedicated to three goals: understanding current practice, understanding how to extrapolate current practice in a meaningful way to the HL-LHC, and ultimately modeling different DOMA strategies and architectures to identify the characteristics of a cost-effective and performant computing model for the HL-LHC.
Work in this area ranges from “intelligent storage” systems to “regional caching strategies” to “global replication strategies”, and includes the management of network and storage assets to support processing. It thus provides the analytical underpinning of other DOMA projects.
- Frank Wuerthwein
- Diego Davila
- 15 Jul 2020 - "Brainstorming Data Lake Challenge — CMS Use Cases", Frank Wuerthwein, WLCG DOMA general meeting
- 15 Jun 2020 - "Data use by the CMS experiment at the LHC", Frank Wuerthwein, ESNet Seminar
- 17 Mar 2020 - "The ECoM2x Process", Frank Wuerthwein, Joint US ATLAS - US CMS Meeting on Facility R&D
- 27 Feb 2020 - "Measurements of Data Access", Diego Davila, IRIS-HEP Poster Session
- 5 Jul 2019 - "Reducing Disk Needs with a Data Lake Concept", Frank Wuerthwein, ECoM2x meeting CMS
- 4 Jun 2019 - "Proposal for Sharing of Data on Data Access", Diego Davila, DOMA / ACCESS Meeting
- 4 Jun 2019 - "Data Lake — A site perspective", Frank Wuerthwein, DOMA Access Meeting WLCG