Data Processing Grand Challenge
During a nominal year of HL-LHC data taking, ATLAS and CMS together expect to record close to one exabyte of RAW data. Both experiments intend to process each year's worth of data as early as possible in the following year. A reasonable working assumption is thus that one exabyte of data across both experiments will have to be processed within about 100 days, i.e. roughly 10 PB/day, or about 1 Tbit/s sustained.
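As a sanity check of those figures, the following short sketch works out the sustained rate from the stated assumptions (1 EB of RAW data, a 100-day processing window; the variable names are illustrative only):

# Back-of-the-envelope check of the HL-LHC reprocessing rate.
# Assumptions: ~1 EB of RAW data across ATLAS and CMS, processed in ~100 days.
RAW_DATA_BYTES = 1e18            # ~1 exabyte
PROCESSING_WINDOW_DAYS = 100

bytes_per_day = RAW_DATA_BYTES / PROCESSING_WINDOW_DAYS
bits_per_second = bytes_per_day * 8 / 86_400   # 86,400 seconds per day

print(f"{bytes_per_day / 1e15:.1f} PB/day")     # -> 10.0 PB/day
print(f"{bits_per_second / 1e12:.2f} Tbit/s")   # -> 0.93 Tbit/s, i.e. roughly 1 Tbit/s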
The RAW data will reside on tape archives at CERN and the Tier-1s, and will be processed at Tier-1s, Tier-2s, and HPC centers. It is highly likely that the two experiments will overlap both in time and in at least some processing locations, e.g. the large DOE and NSF HPC centers, and it is virtually guaranteed that both will overlap on many network segments worldwide.
IRIS-HEP, together with the US LHC Operations programs, the global ATLAS and CMS collaborations, and the WLCG, has arrived at a series of data challenges for the next several years (2021, 2023, 2025, 2027), during which the capabilities and performance of the global infrastructure will be progressively scaled up to meet HL-LHC requirements. The plan includes three levels of challenges that interleave and build on each other. First are functionality evaluations, in which new functionality of individual infrastructure software products is tested. Second are scalability challenges of those individual products. Third are global production challenges in alternate years, during which the production systems are exercised at increasing scale. The first two types of challenges feed into the third as products providing new functionality or greater scale enter the production systems over time. IRIS-HEP is engaged in these challenges at all levels via projects in several of its focus areas.