Third Party Copy
LHC data is constantly being moved between computing and storage sites to support analysis, processing, and simulation; this is done at a scale that is currently unique within the science community. For example, the CMS experiment on the LHC manages approximately 200 PB of data and, on a daily basis, moves 1 PB between sites.
Historically, this has been done with the GridFTP protocol; as we look to the increased data volumes of HL-LHC and GridFTP becomes increasingly niche, the LHC community is looking for alternate mechanisms and protocols to move data. The IRIS-HEP DOMA area - in collaboration with the WLCG DOMA activity - is investigating the use of HTTP for bulk data transfer.
Bandwidth achieved from standalone testing
The above graph shows data movement rates (up to 24 Gbps) for a single host, achieved during standalone tests; a typical LHC site will load-balance across multiple hosts in order to saturate available network links.
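To put these figures in perspective, the short sketch below converts the 1 PB per day quoted above into an average aggregate rate and estimates how many hosts at the measured 24 Gbps would be needed to fill a link. It assumes decimal units (1 PB = 10^15 bytes) and a hypothetical 100 Gbps site link; neither the link capacity nor the helper names come from the project itself, and the daily volume is an experiment-wide total, not a per-site figure.

```python
import math

PB = 10**15                      # assumption: decimal petabyte, in bytes
SECONDS_PER_DAY = 24 * 60 * 60


def average_gbps(bytes_per_day: float) -> float:
    """Average rate in Gbps needed to move bytes_per_day in one day."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e9


def hosts_to_saturate(link_gbps: float, per_host_gbps: float) -> int:
    """Smallest number of hosts whose combined rate reaches link_gbps."""
    return math.ceil(link_gbps / per_host_gbps)


# 1 PB/day across all CMS sites, expressed as an average aggregate rate.
cms_daily_gbps = average_gbps(1 * PB)

# Hypothetical 100 Gbps link, filled by hosts at the measured 24 Gbps each.
hosts_for_100g = hosts_to_saturate(100.0, 24.0)

print(f"Average aggregate rate: {cms_daily_gbps:.1f} Gbps")   # 92.6 Gbps
print(f"Hosts to fill a 100 Gbps link: {hosts_for_100g}")      # 5
```

This is a daily average; peak rates would be higher, and the per-host rate a site achieves in production may differ from the standalone test result.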
Over the past months we have been enabling a growing number of sites to support the HTTP protocol for moving data between sites. Our initial goal was for one site to receive more than 30% of its data using the HTTP protocol.
For CMS, we picked two sites, Nebraska and UCSD, to lead the transition by using the ‘davs’ protocol for all their incoming production transfers from the many sites that support it.
Percentage of data transferred to UCSD using GridFTP and HTTP
The above shows the amount of data transferred to UCSD using the GridFTP protocol compared with HTTP during July 2020.
TPC Dashboards for Nebraska and UCSD can be found here:
On the ATLAS side, the transition has started to ramp up with 3 participating sites: AGLT2, PragueLCG2 and SLAC.
Protocol breakdown for transfers at PragueLCG2 and AGLT2
The above shows the percentage of data transferred using each of the available protocols at PragueLCG2 and AGLT2 during July 2020.
Important links about the project can be found here:
- Brian Bockelman
- Diego Davila
- 3 Mar 2021 - "HTTP Third-Party Copy: Getting rid of GridFTP", Diego Davila, OSG All-Hands Meeting 2021
- 24 Feb 2021 - "Update on the adoption of WebDAV for Third Party Copy transfers", Diego Davila, Offline and Computing Weekly meeting
- 5 Aug 2020 - "Progress on transferring with HTTP-TPC", Diego Davila, Offline and Computing Weekly meeting
- 27 Feb 2020 - "Modernizing the LHC’s transfer infrastructure", Brian Bockelman, IRIS-HEP Poster Session
- 27 Feb 2020 - "Modernizing the LHC’s transfer infrastructure", Edgar Fajardo, IRIS-HEP Poster Session
- 27 Nov 2019 - "Benchmarking xrootd HTTP tests", Edgar Fajardo, WLCG DOMA General Meeting
- 5 Nov 2019 - "Third-party transfers in WLCG using HTTP", Brian Bockelman, 24th International Conference on Computing in High Energy & Nuclear Physics
- 12 Jun 2019 - "XRootD and HTTP performance studies", Edgar Fajardo, XrootD Workshop@CC-IN2P3
- 28 May 2019 - "WLCG DOMA TPC Working Group", Brian Bockelman, US CMS Tier-2 Facilities May 2019 Meeting
- 6 Feb 2019 - "IRIS-HEP DOMA", Brian Bockelman, IRIS-HEP Steering Board Meeting
- "Third-party transfers in WLCG using HTTP", Brian Bockelman, Andrea Ceccanti, Fabrizio Furano, Paul Millar, Dmitry Litvintsev, and Alessandra Forti, arXiv:2007.03490 [cs.DC] (Submitted to CHEP 2019) (08 Nov 2019).