Intelligent Data Delivery Service
If the HL-LHC is going to process exabytes of data, it needs data access systems that can deliver at that scale. The intelligent Data Delivery Service (iDDS) is an attempt to make the workflow system more aware of data workflows so that data is processed more effectively. The initial use case was the “data carousel” for ATLAS: orchestrating the processing of data as soon as it comes out of archival systems instead of waiting for entire datasets to be staged. This minimizes the use of disk buffers, which is especially relevant for the HL-LHC because the disk buffer shrinks relative to the total dataset volume.
The iDDS work is an ongoing project within IRIS-HEP in the DOMA and Analysis Systems areas, as well as within the HEP Software Foundation event delivery group.
ATLAS Data Carousel: This use case, in production since May 2020 for the ATLAS experiment, minimizes the delay between data being read from the tape archive and its delivery to a processor.
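The effect of file-level processing on the disk buffer can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not iDDS code: the `StagedFile` type, the two functions, and the idea that a file is released from disk a fixed `process_time` after it is staged are all hypothetical simplifications introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StagedFile:
    """Hypothetical record of one file recalled from tape."""
    name: str
    size_gb: float
    staged_at: float  # time at which the file lands on the disk buffer


def peak_disk_whole_dataset(files):
    """Dataset-level staging: every file must be on disk before processing starts."""
    return sum(f.size_gb for f in files)


def peak_disk_carousel(files, process_time):
    """File-level staging: each file is processed on arrival and then released.

    Assumes a processor is always free and that a file leaves the buffer
    exactly `process_time` after it was staged.
    """
    events = []
    for f in files:
        events.append((f.staged_at, f.size_gb))                   # file arrives
        events.append((f.staged_at + process_time, -f.size_gb))   # file released
    # At equal timestamps, apply releases (negative deltas) before arrivals.
    events.sort(key=lambda event: (event[0], event[1]))

    current = peak = 0.0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


# Four 10 GB files arriving one time unit apart, each processed in 1.5 units.
files = [StagedFile(f"file{i}", 10.0, float(i)) for i in range(4)]
whole = peak_disk_whole_dataset(files)
carousel = peak_disk_carousel(files, process_time=1.5)
```

In this toy setup the dataset-level approach needs room for all four files at once, while the file-level approach never holds more than two, because earlier files are released while later ones are still arriving from tape.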
Hyperparameter Optimization (HPO): There’s a strong overlap between the data management needed for detector events and what is needed for the management of hyperparameters in training machine learning models. iDDS has developed a backend plugin for HPO and thus provides a fully automated platform for HPO on top of geographically distributed GPU resources on the grid, HPC, and clouds.
We have advertised using iDDS for HPO within the ATLAS community; however, its application is not limited to ATLAS. Currently, it is actively used for ATLAS workflows for FastCaloGAN and ToyMC.
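The general shape of an HPO loop, where candidate points are generated centrally and evaluated independently on whatever workers are available, can be sketched as follows. This is an illustration only and does not use the iDDS API: random search is swapped in as the simplest possible search strategy, a local thread pool stands in for distributed GPU resources, and `sample_point`, `toy_loss`, and `random_search` are hypothetical names.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def sample_point(rng):
    """Draw one hypothetical hyperparameter point."""
    return {
        "learning_rate": 10 ** rng.uniform(-4, -1),
        "batch_size": rng.choice([32, 64, 128]),
    }


def toy_loss(point):
    """Stand-in for a remote training job; smallest near lr=1e-2, batch_size=64."""
    lr_term = (math.log10(point["learning_rate"]) + 2) ** 2
    batch_term = abs(point["batch_size"] - 64) / 64
    return lr_term + batch_term


def random_search(n_points, seed=0, workers=4):
    """Evaluate n_points independent candidates and return the best one."""
    rng = random.Random(seed)
    points = [sample_point(rng) for _ in range(n_points)]
    # Each evaluation is independent, so it could run on any available worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        losses = list(pool.map(toy_loss, points))
    best_loss, best_point = min(zip(losses, points), key=lambda pair: pair[0])
    return best_point, best_loss


best_point, best_loss = random_search(20)
```

Because each evaluation depends only on its own hyperparameter point, the same pattern extends to workers that are geographically distributed; only the dispatch mechanism changes.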
DAG-based workflow management: To support its data delivery functionality, iDDS internally implements a high-level workflow engine that specifies a set of interdependent jobs as a directed acyclic graph (DAG). iDDS, interacting with software such as PanDA, drives workload scheduling and implements management of job chains for multi-step processing with thousands of jobs per step.
In fact, the DAG engine can be used directly for workflow management. Using the DOMA PanDA instance, iDDS is being tested by the Rubin Observatory (formerly LSST) for their data processing needs. So far, the observatory has successfully tested iDDS with DAG workflows of over 50 thousand and over 150 thousand jobs.
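The idea of releasing interdependent jobs in dependency order can be sketched with Python's standard `graphlib` module. This is a minimal illustration, not the iDDS engine: the job names and the `release_in_waves` helper are hypothetical, and a real system would submit each ready job to a workload manager rather than collect names in a list.

```python
from graphlib import TopologicalSorter


def release_in_waves(dependencies):
    """Group jobs into waves; every job in a wave has all its prerequisites done.

    `dependencies` maps each job name to the set of jobs it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs whose prerequisites are all done
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished so its successors unlock
    return waves


# Hypothetical four-step workflow with one fan-out and one fan-in.
workflow = {
    "simulate": set(),
    "reconstruct": {"simulate"},
    "calibrate": {"simulate"},
    "merge": {"reconstruct", "calibrate"},
}
waves = release_in_waves(workflow)
```

Jobs within one wave have no dependencies on each other, so they could run in parallel; here that applies to `calibrate` and `reconstruct`, which both wait only on `simulate`.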
- Brian Bockelman
- Wen Guan
- Tadashi Maeno
- Rui Zhang
- Tuan Minh Pham
- 12 Jul 2021 - "An intelligent Data Delivery Service (iDDS) for and beyond the ATLAS experiment", Wen Guan, 2021 Meeting of the Division of Particles and Fields of the American Physical Society (DPF21)
- 16 Jun 2021 - "iDDS", Wen Guan, ADC @ ATLAS Software & Computing Week
- 3 Jun 2021 - "update on iDDS", Wen Guan, ATLAS ADC WFMS meeting
- 18 May 2021 - "An intelligent Data Delivery Service for and beyond the ATLAS experiment", Wen Guan, 25th International Conference on Computing in High-Energy and Nuclear Physics (vCHEP2021)
- 29 Mar 2021 - "intelligent Data Delivery Service", Wen Guan, HL-LHC R&D topics
- 27 Jan 2021 - "iDDS active learning status and iDDS plans", Wen Guan, ADC @ ATLAS Software & Computing Week
- 21 Jan 2021 - "iDDS 2021", Wen Guan, ATLAS ADC WFMS meeting
- 5 Oct 2020 - "iDDS: new workflow structure", Wen Guan, ADC @ ATLAS Software & Computing Week
- 1 Oct 2020 - "iDDS news for machine learning", Wen Guan, Joint ATLAS Machine Learning / Workflow Management Meeting
- 9 Jul 2020 - "iDDS for machine learning", Wen Guan, Joint ATLAS Machine Learning / Workflow Management Meeting
- 15 Jun 2020 - "iDDS HyperParameter Optimization development for machine learning", Wen Guan, ATLAS Software & Computing Week
- 28 May 2020 - "iDDS integration", Wen Guan, ATLAS ADC WFMS meeting
- 11 Mar 2020 - "iDDS: A New Service with Intelligent Orchestration and Data Transformation and Delivery", Wen Guan, 3rd Rucio Community Workshop
- 27 Feb 2020 - "intelligent Data Delivery Service (iDDS) (Poster)", Wen Guan, IRIS-HEP poster session
- 7 Nov 2019 - "Event Streaming Service for ATLAS Event Processing", Wen Guan, 24th International Conference on Computing in High Energy Physics (CHEP2019)
- 30 Sep 2019 - "IDDS", Wen Guan, HSF & ATLAS Joint Event Delivery Workshop
- 24 Jun 2019 - "Delivery of columnar data to analysis systems", Marc Weinberg, ATLAS Software & Computing Week #62
- 20 Mar 2019 - "WLCG DOMA TPC Updates", Brian Bockelman, 2019 Joint HSF/OSG/WLCG Workshop (HOW2019)
- 6 Feb 2019 - "IRIS-HEP DOMA", Brian Bockelman, IRIS-HEP Steering Board Meeting
- An intelligent Data Delivery Service for and beyond the ATLAS experiment, W. Guan, T. Maeno, B. Bockelman, T. Wenaus, F. Lin, S. Padolski, R. Zhang and A. Alekseev, EPJ Web Conf. 251 02007 (2021) (28 Feb 2021).
- Towards an Intelligent Data Delivery Service, Wen Guan, Tadashi Maeno, Gancho Dimitrov, Brian Paul Bockelman, Torre Wenaus, Vakhtang Tsulaia, Nicolo Magini, CHEP2019 (14 Mar 2020).