Data Organization, Management and Access (DOMA)
The HL-LHC era will pose enormous challenges in the area of Data Organization, Management and Access (DOMA). The LHC will deliver a significantly increased number of events and increased event complexity, both of which will drive much larger data sizes. With no changes in how the LHC community functions, the total data volume may increase by a factor of 30.
Given that the LHC experiments already manage nearly an exabyte of data combined, an increase of that size would be unmanageable. New mechanisms and techniques are necessary to manage storage resources more efficiently and to deliver data to processing endpoints; the DOMA area in IRIS-HEP is working on the R&D necessary to effect such change.
The bulk data transfer technologies were designed almost 15 years ago; the DOMA team is taking a fresh look at the transfer protocols and the authentication and authorization infrastructure used by the LHC community. This is resulting in a worldwide transition to new protocols and authorization approaches. The first phase of this work is done: the LHC community has successfully transitioned to HTTP as a foundation for bulk data transfers.
It is not only data volumes that are potentially disruptive to the HL-LHC physics program; the extraordinarily large number of events (potentially 150 billion simulated and recorded events per year per experiment) presents a data management challenge for users. Along with the Analysis Systems team within IRIS-HEP, DOMA is working on improved techniques for delivering events to users. The team is not only researching new approaches for data delivery and implementing services, but also working to integrate them into a coherent analysis facility, Coffea-Casa, for users.
Contact us: doma-team@iris-hep.org
Current and Previous DOMA Fellows
DOMA Presentations
- 30 Oct 2024 - "The current status of the Rucio/SENSE integration project", Aashay Arora, CMS O&C Week
- 24 Oct 2024 - "Tuning the CMS Coffea-casa facility for 200 Gbps Challenge", Oksana Shadura, Conference on Computing in High Energy and Nuclear Physics (CHEP 2024)
- 23 Oct 2024 - "Benchmarking XRootD-HTTPS on 400Gbps Links with Variable Latencies", Aashay Arora, CHEP'24
- 22 Oct 2024 - "CMS Token Transition", Brian Bockelman, CHEP 2024 - Conference on Computing in High Energy and Nuclear Physics
- 22 Oct 2024 - "Data Movement Manager (DMM) for the SENSE-Rucio Interoperation Prototype", Aashay Arora, CHEP'24
- 21 Oct 2024 - "The 200 Gbps Challenge: Imagining HL-LHC analysis facilities", Alexander Held, CHEP 2024
- 15 Oct 2024 - "Rucio-SENSE Interoperation Prototype", Aashay Arora, CMS Rucio Meeting
- 2 Oct 2024 - "Integration between Rucio and SENSE", Diego Davila, 7th Rucio Community Workshop
- 13 Sep 2024 - "Network isolation for multi-IP exposure in XRootD", Diego Davila, XRootD and FTS Workshop
- 5 Sep 2024 - "Rucio/SENSE in SSL", Diego Davila, IRIS-HEP Institute Retreat
- 4 Sep 2024 - "AGC & IDAP / 200 Gbps", Oksana Shadura, IRIS-HEP Institute Retreat 2024
- 2 Sep 2024 - "Facilities R&D HSF highlights", Oksana Shadura, The 8th Asian Tier Center Forum
- 18 Jun 2024 - "Analysis Grand Challenge", Oksana Shadura, Analysis Facilities Workshop
- 11 Jun 2024 - "Coffea-casa and 200 Gbps challenge - experience with Kubernetes", Oksana Shadura, 2024 All-Hands Workshop of the U.S. CMS Software and Computing Operations Program
- 9 Jun 2024 - "The 200 Gbps Challenge at Nebraska", Oksana Shadura, US CMS Analysis Facility Meeting
- 16 May 2024 - "IRIS-HEP 200Gbps challenge", Brian Bockelman, WLCG/HSF Workshop 2024
- 15 May 2024 - "Cloud Data Lake Technologies", Ben Galewsky, WLCG/HSF Workshop
- 8 May 2024 - "Notes about AF users UX feedback", Oksana Shadura, Common Analysis Tools (CAT) general meeting
- 17 Apr 2024 - "Research Networking Technical Working Group Status and Plans", Shawn McKee, HEPiX Spring 2024 meeting
- 11 Apr 2024 - "Data Movement Manager for the Rucio-SENSE Interoperation Prototype", Aashay Arora, Rucio Meeting
- 27 Mar 2024 - "Rucio/SENSE during DC24", Diego Davila, S&C Blueprint Meeting
- 27 Mar 2024 - "DC24: Token Achievements", Brian Bockelman, S&C Blueprint Meeting - DC24 outcomes
- 19 Mar 2024 - "View from HSF - HSF AF White Paper Overview", Oksana Shadura, CMS Spring 2024 Offline and Computing Week
- 15 Mar 2024 - "Introduction - IRIS-HEP Data Analysis Pipeline (IDAP)", Oksana Shadura, IRIS-HEP Data Analysis Pipeline (IDAP) meeting
- 15 Mar 2024 - "Analysis running at 200Gbps as part of the Analysis Grand Challenge", Brian Bockelman, IRIS-HEP Data Analysis Pipeline (IDAP) meeting
- 11 Mar 2024 - "ServiceX, the novel data delivery system, for physics analysis", KyungEon Choi, ACAT 2024
- 11 Mar 2024 - "From Amsterdam to ACAT 2024: The Evolution and Convergence of Declarative Analysis Language Tools and Imperative Analysis Tools", Gordon Watts, ACAT 2024
- 5 Mar 2024 - "Analysis Grand Challenge (AGC)", Oksana Shadura, US CMS Analysis Facility Meeting
- 10 Jan 2024 - "AGC Deep Dive", Oksana Shadura, NSF / IRIS-HEP Meeting (January 2024)
- 5 Dec 2023 - "Updates on Coffea-Casa AF", Oksana Shadura, Common Analysis Tools (CAT) general meeting
- 14 Nov 2023 - "400Gbps benchmark of XRootD HTTP-TPC", Aashay Arora, SC'23
- 9 Nov 2023 - "perfSONAR Plans for DC24", Shawn McKee, Data Challenge 2024 Workshop
- 9 Nov 2023 - "Rucio/SENSE in DC24", Diego Davila, Data Challenge 2024 Workshop
- 11 Oct 2023 - "Rucio/SENSE overview and our plans for DC24", Diego Davila, USCMS S&C Blueprint Meeting - DC24 preparation
- 4 Oct 2023 - "The AGC with ATLAS Data", Gordon Watts, The ATLAS Software And Computing Week #76 (internal)
- 14 Sep 2023 - "Towards the HL-LHC-scale I/O: DOMA vision talk", Brian Bockelman, IRIS-HEP AGC Demonstration 2023
- 14 Sep 2023 - "AGC Team End-to-End Demo", Alexander Held, IRIS-HEP AGC Demonstration 2023
- 12 Sep 2023 - "Future Analysis Facilities R&D", Oksana Shadura, IRIS-HEP Institute Retreat
- 12 Sep 2023 - "Current Plans for AGC", Oksana Shadura, IRIS-HEP Institute Retreat
- 12 Sep 2023 - "Rucio/SENSE overview and our plans for DC24", Diego Davila, IRIS-HEP Institute Retreat
- 12 Sep 2023 - "Preparing for DC24 - technology targets for DOMA", Brian Bockelman, IRIS-HEP Institute Retreat
- 11 Sep 2023 - "Focus Area - Analysis Grand Challenge", Oksana Shadura, IRIS-HEP Institute Retreat
- 11 Sep 2023 - "DOMA Focus Area Talk", Brian Bockelman, IRIS-HEP Institute Retreat
- 27 Jul 2023 - "Analysis Grand Challenge & Coffea-Casa analysis facility as a test environment for packages and services", Oksana Shadura, PyHEP.dev 2023 - "Python in HEP" Developer's Workshop
- 24 Jul 2023 - "Analysis Grand Challenge Demo - Hands-on Demo Session", Oksana Shadura, Computational HEP Traineeship Summer School
- 12 Jul 2023 - "Rucio/SENSE implementation for CMS", Diego Davila, Throughput Computing 2023
- 11 Jul 2023 - "Analysis Grand Challenge", Oksana Shadura, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 14 Jun 2023 - "WLCG Data Challenge 2024 (DC24) Status and Plans Related to ATLAS DDM", Shawn McKee, ATLAS Software and Computing #75
- 5 Jun 2023 - "USATLAS Facility and Data Challenge 2024", Shawn McKee, USATLAS Technical Meeting June 2023
- 24 May 2023 - "Coffea-casa analysis facility", Oksana Shadura, Common Analysis Tools (CAT) general meeting (CMS internal)
- 11 May 2023 - "Data Management Package for the novel data delivery system, ServiceX, and Applications to various physics analysis workflows", KyungEon Choi, CHEP 2023 Conference
- 11 May 2023 - "400Gbps benchmark of XRootD HTTP-TPC", Aashay Arora, CHEP'23
- 9 May 2023 - "Coffea-Casa - Building composable analysis facilities for the HL-LHC", Oksana Shadura, 26th International Conference on Computing in High Energy & Nuclear Physics
- 9 May 2023 - "Managing the OSG Fabric of Services the GitOps Way", Brian Bockelman, CHEP 2023 Conference
- 9 May 2023 - "Physics analysis for the HL-LHC: concepts and pipelines in practice with the Analysis Grand Challenge", Alexander Held, CHEP 2023
- 6 May 2023 - "Analysis Facilities AAI", Brian Bockelman, WLCG-HSF Pre-CHEP Workshop 2023
- 5 May 2023 - "DOMA: workshop outcomes and action items", Brian Bockelman, IRIS-HEP AGC Workshop 2023
- 5 May 2023 - "Analysis Grand Challenge workshop closing", Alexander Held, IRIS-HEP Analysis Grand Challenge workshop 2023
- 3 May 2023 - "Analysis Grand Challenge workshop introduction", Alexander Held, IRIS-HEP Analysis Grand Challenge workshop 2023
- 23 Mar 2023 - "Coffea-casa analysis facility", Oksana Shadura, International Symposium on Grids & Clouds (ISGC) 2023 in conjunction with HEPiX Spring 2023 Workshop
- 23 Mar 2023 - "Scaling transfers for HL-LHC, DC24 plan", Diego Davila, WLCG DOMA General Meeting
- 23 Mar 2023 - "Physics analysis workflows and pipelines for the HL-LHC", Alexander Held, International Symposium on Grids & Clouds (ISGC) 2023
- 17 Mar 2023 - "Overview - Token Transition", Brian Bockelman, CMS Spring 2023 O&C Week
- 14 Mar 2023 - "Analysis Grand Challenge updates", Alexander Held, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 5 Mar 2023 - "CMS O&C CDR - Evolution of Analysis Facilities", Oksana Shadura, CMS O&C Upgrade R&D on Workflow Management and Computing Infrastructure
- 24 Jan 2023 - "Analysis Grand Challenge updates", Alexander Held, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 7 Dec 2022 - "BDT: Status update", Brian Bockelman, WLCG DOMA General Meeting - December 2022
- 2 Dec 2022 - "Open Source Program Offices in Research Universities", Carlos Maltzahn, CHPC National Conference 2022
- 18 Nov 2022 - "New Computing and Software Frontiers in Particle Physics", Gordon Watts, XIV Latin American Symposium on High Energy Physics
- 15 Nov 2022 - "Analysis Grand Challenge updates", Oksana Shadura, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 11 Nov 2022 - "Automated Network Services for Exascale Data Movement", Diego Davila, 5th Rucio Community Workshop
- 7 Nov 2022 - "Network Management Enhancements for the High Luminosity Era", Diego Davila, WLCG Workshop 2022
- 3 Nov 2022 - "IRIS-HEP AGC update", Alexander Held, HSF Analysis Facilities Forum
- 25 Oct 2022 - "First performance measurements with the Analysis Grand Challenge", Oksana Shadura, ACAT 2022
- 12 Oct 2022 - "AGC - Perspective from focus areas and projects", Oksana Shadura, IRIS-HEP Institute Retreat
- 12 Oct 2022 - "Data Grand Challenge", Brian Bockelman, IRIS-HEP Institute Retreat - October 2022
- 12 Oct 2022 - "AGC overview, status and plans", Alexander Held, IRIS-HEP Institute Retreat
- 4 Oct 2022 - "Analysis Grand Challenge updates", Oksana Shadura, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 28 Sep 2022 - "Birds of a Feather: Pathways to Enable an Open Source Ecosystem for the Skyhook Project", Carlos Maltzahn, 2022 UC Santa Cruz Open Source Symposium
- 27 Sep 2022 - "Welcome & Introductions", Carlos Maltzahn, 2022 UC Santa Cruz Open Source Symposium
- 15 Sep 2022 - "End-to-end physics analysis with Open Data: the Analysis Grand Challenge", Alexander Held, PyHEP 2022 (virtual) Workshop
- 14 Sep 2022 - "Data Management Package for the novel data delivery system, ServiceX, and its application to an ATLAS Run-2 Physics Analysis Workflow", KyungEon Choi, PyHEP 2022 Workshop
- 31 Aug 2022 - "Managed Network Services for Exascale Data Movement Across Large Global Scientific Collaborations", Diego Davila, WLCG DOMA General Meeting
- 23 Aug 2022 - "Analysis Grand Challenge updates", Alexander Held, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 16 Jul 2022 - "Analysis Facilities", Oksana Shadura, Seattle Snowmass Summer Meeting 2022
- 12 Jul 2022 - "Report from Analysis Ecosystems II Workshop", Oksana Shadura, Software & Computing Round Table (2022)
- 9 Jul 2022 - "The IRIS-HEP Analysis Grand Challenge", Alexander Held, ICHEP 2022
- 29 Jun 2022 - "Modern Python analysis ecosystem for High Energy Physics", Matthew Feickert, The Python Exchange for DOE Employees (DOEPy)
- 23 Jun 2022 - "HPC Panel at the Data Thread", Carlos Maltzahn, The Data Thread Conference
- 14 Jun 2022 - "Analysis Grand Challenge updates", Alexander Held, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 13 Jun 2022 - "Skyhook Blueprint for Computational Storage, a Case for Amplifying Research Impact via Open Source", Carlos Maltzahn, Research and Practice Colloquium at Friedrich-Alexander-Universität, Erlangen-Nürnberg, Germany
- 13 Jun 2022 - "IRIS-HEP Analysis Grand Challenge: Status & plans", Alexander Held, ATLAS Software & Computing Week
- 25 May 2022 - "Analysis Facilities - Summary", Oksana Shadura, Analysis Ecosystem Workshop II
- 17 May 2022 - "Skyhook: Towards an Arrow-Native Storage System", Jayjeet Chakraborty, CCGrid22
- 3 May 2022 - "Next steps for coffea-casa AF", Oksana Shadura, https://docs.google.com/presentation/d/1rINHkaqozp-RD9Bu-fWZtbQ1ktocw8d5fQ1sFMRjW7k/edit?usp=sharing
- 3 May 2022 - "AGC at US CMS AFs", Carl Lundstedt, IRIS-HEP Analysis Grand Challenge workshop 2023
- 26 Apr 2022 - "Research Networking Technical WG Status and Plans", Shawn McKee, Spring 2022 HEPiX Meeting
- 25 Apr 2022 - "IRIS-HEP Analysis Grand Challenge Tools Workshop", Oksana Shadura, IRIS-HEP AGC Tools 2022 Workshop
- 25 Apr 2022 - "Scale-out with coffea: coffea-casa analysis facility", Carl Lundstedt, IRIS-HEP AGC Tools (April) 2022 Workshop
- 25 Apr 2022 - "From data delivery to statistical inference with CMS Open Data", Alexander Held, IRIS-HEP AGC Tools 2022 Workshop
- 8 Apr 2022 - "CompF4 Analysis Facility - Discussion & Priorities", Oksana Shadura, Snowmass CompF4 Topical Group Workshop
- 8 Apr 2022 - "Authorization and Authentication Evolution", Brian Bockelman, Snowmass CompF4 Topical Group Workshop
- 6 Apr 2022 - "Creating an OSPO at the University of California", Carlos Maltzahn, OSPOlogy: How Academic OSPOs are Amplifying Research Impact
- 5 Apr 2022 - "Analysis Grand Challenge updates", Oksana Shadura, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 1 Apr 2022 - "Report about HSF Analysis Facilities Forum Kick-off meeting", Oksana Shadura, CMS Spring 2022 O&C Week
- 25 Mar 2022 - "Faculty and Student Session: Open Source Research Experience (OSRE)", Carlos Maltzahn, The Association of Computer Science Departments at Minority Institutions (ADMI) High Performance Computing and Gateways 2022 Symposium (ADMI 2022)
- 25 Mar 2022 - "AFs in the context of the IRIS-HEP AGC", Alexander Held, Analysis Facilities Forum Kick-off Meeting
- 17 Mar 2022 - "OSG 3.5 EOL Update", Brian Bockelman, OSG Council Meeting - March 2022
- 16 Mar 2022 - "Updates on the OSDF Monitoring System", Derek Weitzel, OSG All-Hands Meeting 2022
- 3 Mar 2022 - "Monte Carlo Toy Based Confidence Limits with iDDS (presented by Christian Weber)", Wen Guan, ADC WFMS weekly meeting
- 3 Mar 2022 - "REANA / PanDA integration for active learning", Wen Guan, ADC WFMS weekly meeting
- 2 Mar 2022 - "Scientific Network Tags Packet and Flow Marking", Shawn McKee, WLCG DOMA Bulk Data Transfer (BDT) WG
- 1 Mar 2022 - "Analysis Grand Challenge updates", Alexander Held, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 2 Feb 2022 - "Creating an OSPO at the University of California", Carlos Maltzahn, OSPO++ Academic Track: UC Santa Cruz and UVM on setting up an OSPO
- 28 Jan 2022 - "Analysis Grand Challenge updates", Oksana Shadura, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 27 Jan 2022 - "Packet and Flow Marking Technical Specification Update", Shawn McKee, Research Networking Technical Working Group Meeting
- 27 Jan 2022 - "XRootD Monitoring Flow", Derek Weitzel, WLCG Operations Coordination
- 10 Jan 2022 - "An intelligent Data Delivery Service (iDDS) for and beyond the ATLAS experiment", Wen Guan, 30th International Symposium on Lepton Photon Interactions at High Energies
- 16 Dec 2021 - "Analysis Grand Challenge updates", Oksana Shadura, IRIS-HEP Executive Board / Ops Program Grand Challenge Discussion
- 14 Dec 2021 - "XRootD Shoveler", Derek Weitzel, Tier-2 Facilities Meeting
- 30 Nov 2021 - "ServiceX - Making everything Columnar", Gordon Watts, ACAT 2021
- 30 Nov 2021 - "DOMA Update for Steering Board #12", Brian Bockelman, IRIS-HEP Steering Board Meeting #12
- 30 Nov 2021 - "Analysis Grand Challenge updates", Alexander Held, Steering Board Meeting
- 25 Nov 2021 - "Ceph Code Walkthroughs-CoDel for BlueStore", Esmaeil Mirvakili, Ceph Code Walkthroughs
- 22 Nov 2021 - "Coffea-casa news and developments", Oksana Shadura, Coffea Users Meeting
- 19 Nov 2021 - "Panel: Intersection of Large-Scale Experimental Science and High-Performance Computing", Mark Neubauer, Workshop on Extreme-Scale Experiment-in-the-Loop Computing (XLOOP 2021)
- 17 Nov 2021 - "Deep Dive - Analysis Grand Challenge", Oksana Shadura, NSF / IRIS-HEP Meeting (November 2021)
- 9 Nov 2021 - "Analysis Facilities", Oksana Shadura, CMS Operations & Computing R&D meeting
- 4 Nov 2021 - "Scale-out with coffea", Oksana Shadura, IRIS-HEP AGC Tools 2021 Workshop
- 4 Nov 2021 - "Skyhook Data Management", Carlos Maltzahn, Analysis Grand Challenge Tools 2021 Workshop
- 3 Nov 2021 - "From data delivery to statistical inference: ServiceX, coffea, cabinetry & pyhf", Alexander Held, IRIS-HEP AGC Tools 2021 Workshop
- 2 Nov 2021 - "Analysis Grand Challenge", Oksana Shadura, SwiftHep/ExcaliburHep workshop
- 2 Nov 2021 - "Towards an OSPO at the University of California", Carlos Maltzahn, Linux Foundation Membership Summit
- 26 Oct 2021 - "Ceph", Carlos Maltzahn, DPS Guest Lecture at LIACS, Leiden University
- 22 Oct 2021 - "Migration to WebDAV", Diego Davila, Fall21 Offline Software and Computing Week
- 12 Oct 2021 - "FABRIC and FAB Project Overviews and Status", Shawn McKee, Fall 2021 LHCOPN/LHCONE Meeting
- 11 Oct 2021 - "Research Network Technical WG Update", Shawn McKee, Fall 2021 LHCOPN/LHCONE Meeting
- 29 Sep 2021 - "SkyhookDM: An Arrow-Native Storage System", Jayjeet Chakraborty, SNIA 2021 Presentation
- 24 Sep 2021 - "Coffea-casa - an analysis facility prototype", Oksana Shadura, Joint AMG and WFMS Meeting on Analysis Facilities
- 19 Aug 2021 - "Virtual Meetings Workshop Summary", Mark Neubauer, International Union of Pure & Applied Physics: Particles and Fields Meeting
- 20 Jul 2021 - "Using Microsoft Azure for XRootD network benchmarking", Aashay Arora, PEARC'21 Poster Session
- 12 Jul 2021 - "An intelligent Data Delivery Service (iDDS) for and beyond the ATLAS experiment", Wen Guan, 2021 Meeting of the Division of Particles and Fields of the American Physical Society (DPF21)
- 9 Jul 2021 - "Using Python, coffea, and ServiceX to Rediscover the Higgs. Twice.", Gordon Watts, PyHEP 2021
- 30 Jun 2021 - "SkyhookDM: Towards an Arrow-Native Storage System", Jayjeet Chakraborty, IRIS-HEP Winter 2021 Fellowship Presentation
- 23 Jun 2021 - "OSG Xrootd Monitoring", Diego Davila, WLCG - xrootd monitoring discussion
- 16 Jun 2021 - "iDDS", Wen Guan, ADC @ ATLAS Software & Computing Week
- 8 Jun 2021 - "DOMA Update for Steering Board #10", Brian Bockelman, IRIS-HEP Steering Board Meeting #10
- 3 Jun 2021 - "update on iDDS", Wen Guan, ATLAS ADC WFMS meeting
- 27 May 2021 - "DOMA and Data Challenge", Brian Bockelman, IRIS-HEP 30 Month Review
- 21 May 2021 - "Dask in High-Energy Physics community (workshop)", Oksana Shadura, Dask Distributed Summit 2021
- 21 May 2021 - "Dask at U.S.CMS analysis facilities", Carl Lundstedt, Dask Distributed Summit 2021, Dask in High Energy Physics Community, Tutorials and Workshops
- 20 May 2021 - "Coffea-casa an analysis facility prototype (plenary)", Oksana Shadura, 25th International Conference on Computing in High-Energy and Nuclear Physics
- 19 May 2021 - "Challenges Designing Interactive Analysis Facilities with Dask", Oksana Shadura, Dask Distributed Summit 2021
- 19 May 2021 - "Towards Real-World Applications of ServiceX, an Analysis Data Transformation System", KyungEon Choi, CHEP 2021 Conference
- 18 May 2021 - "An intelligent Data Delivery Service for and beyond the ATLAS experiment", Wen Guan, 25th International Conference on Computing in High-Energy and Nuclear Physics (vCHEP2021)
- 18 May 2021 - "Systematic benchmarking of HTTPS third party copy on 100Gbps links using XRootD", Aashay Arora, vCHEP'21
- 12 May 2021 - "Transferring at 500Gbps with XRootD", Diego Davila, S&C Blueprint Meeting - Data Challenge
- 6 May 2021 - "Day 2 Topics and Goals", Mark Neubauer, Virtual Meetings Blueprint Workshop
- 5 May 2021 - "Welcome, IRIS-HEP Blueprint Activity & Workshop Overview", Mark Neubauer, Virtual Meetings Blueprint Workshop
- 27 Apr 2021 - "DOMA Update – IRIS-HEP Exec Board Meeting", Brian Bockelman, IRIS-HEP PI/EB Meeting
- 19 Apr 2021 - "ServiceX On-Demand Data Transformation and Delivery, and Applications", KyungEon Choi, APS 2021
- 29 Mar 2021 - "intelligent Data Delivery Service", Wen Guan, HL-LHC R&D topics
- 9 Mar 2021 - "Open Source Research Experience -- Summer 2021", Carlos Maltzahn, Launch of 2021 OSRE Program
- 3 Mar 2021 - "Moving science data: One CDN to rule them all", Derek Weitzel, OSG All Hands Meeting 2021
- 3 Mar 2021 - "HTTP Third-Party Copy: Getting rid of GridFTP", Diego Davila, OSG All-Hands Meeting 2021
- 25 Feb 2021 - "GeoIP HTTPS Redirector", Edgar Fajardo, XCache DevOps Meeting
- 24 Feb 2021 - "Update on the adoption of WebDAV for Third Party Copy transfers", Diego Davila, Offline and Computing Weekly meeting
- 3 Feb 2021 - "Future analysis facilities", Oksana Shadura, CMS Week
- 27 Jan 2021 - "iDDS active learning status and iDDS plans", Wen Guan, ADC @ ATLAS Software & Computing Week
- 21 Jan 2021 - "iDDS 2021", Wen Guan, ATLAS ADC WFMS meeting
- 15 Dec 2020 - "Summary of Future Analysis Systems and Facilities Workshop", Mark Neubauer, Workflow Management System Software Technical Interchange Meeting
- 30 Nov 2020 - "Managing Bufferbloat in Storage Systems", Carlos Maltzahn, Centre for High Performance Computing 2020 National Conference, online, South Africa
- 24 Nov 2020 - "U.S. CMS Managed Analysis Facilities", Oksana Shadura, HSF WLCG Virtual Workshop
- 24 Nov 2020 - "DOMA Update for Steering Board #8", Brian Bockelman, IRIS-HEP Steering Board Meeting
- 12 Nov 2020 - "Data-intensive IceCube Cloud Burst", Igor Sfiligoi, NRP Pilot weekly meeting
- 27 Oct 2020 - "Analysis on LHC-Managed Facilities: Coffea-Casa", Oksana Shadura, IRIS-HEP Future Analysis Systems and Facilities Blueprint Workshop
- 26 Oct 2020 - "Welcome, IRIS-HEP Blueprint Activity and Workshop Overview", Mark Neubauer, Future Analysis Systems and Facilities Blueprint Workshop
- 26 Oct 2020 - "Data Selection & Delivery for Analysis: ServiceX & funcADL", Mason Proffitt, Future Analysis Systems and Facilities
- 21 Oct 2020 - "Benchmarking TPC Transfers on 100G links", Edgar Fajardo, DOMA / TPC Meeting
- 16 Oct 2020 - "ServiceX Front End Status", Gordon Watts, ServiceX Meeting
- 5 Oct 2020 - "iDDS: new workflow structure", Wen Guan, ADC @ ATLAS Software & Computing Week
- 2 Oct 2020 - "ServiceX Front End Status", Gordon Watts, ServiceX Meeting
- 1 Oct 2020 - "iDDS news for machine learning", Wen Guan, Joint Atlas Machine Learning / Workflow Management Meeting
- 23 Sep 2020 - "Analysis facilities", Oksana Shadura, Upgrade R&D/CMP Meeting (Presented on Weekly CMS O&C Meeting slot)
- 21 Sep 2020 - "Data Engineering at the Large Hadron Collider", Ben Galewsky, Chicago Cloud Conference
- 15 Sep 2020 - "Data lake prototyping for US CMS", Edgar Fajardo, DOMA / ACCESS Meeting
- 4 Sep 2020 - "A US Data Lake", Edgar Fajardo, OSG All Hands Meeting (US ATLAS/CMS Combined session)
- 2 Sep 2020 - "Stashcache: CDN for Science", Edgar Fajardo, OSG All Hands Meeting
- 11 Aug 2020 - "Summary of CompF4", Frank Wuerthwein, Snowmass computational frontier workshop
- 10 Aug 2020 - "Discussion of CompF4 Group Mandate", Frank Wuerthwein, Snowmass computational frontier workshop
- 5 Aug 2020 - "Progress on transferring with HTTP-TPC", Diego Davila, Offline and Computing Weekly meeting
- 2 Aug 2020 - "The value of open source to universities: UC Santa Cruz tests the water", Carlos Maltzahn, Interview for a Linux Professional Institute Blog Post by Andy Oram
- 28 Jul 2020 - "Demonstrating 100 Gbps in and out of the public Clouds", Igor Sfiligoi, PEARC20
- 21 Jul 2020 - "Controlling AWS Costs Using a Data Carousel", Ben Galewsky, Earth Science Information Partners Summer Meeting
- 15 Jul 2020 - "Brainstorming Data Lake Challenge — CMS Use Cases", Frank Wuerthwein, WLCG DOMA general meeting
- 14 Jul 2020 - "ServiceX On-Demand Data Transformation and Delivery for the Present and HL-LHC Era", KyungEon Choi, PyHEP 2020 Workshop
- 9 Jul 2020 - "iDDS for machine learning", Wen Guan, Joint Atlas Machine Learning / Workflow Management Meeting
- 8 Jul 2020 - "ServiceX and Kubernetes", Ben Galewsky, WLCG Grid Deployment Board meeting
- 30 Jun 2020 - "The Ceph Project", Carlos Maltzahn, UC Berkeley Cloud Meetup 015
- 24 Jun 2020 - "Proposed New Scope of DOMA Access", Frank Wuerthwein, WLCG DOMA general meeting
- 15 Jun 2020 - "iDDS HyperParameter Optimization development for machine learning", Wen Guan, ATLAS Software & Computing Week
- 15 Jun 2020 - "Data use by the CMS experiment at the LHC", Frank Wuerthwein, ESNet Seminar
- 11 Jun 2020 - "Some lessons learned from creating and using the Ceph open source storage system", Carlos Maltzahn, BCS Open Source Specialist Group: Open source software for scientific and parallel computing
- 5 Jun 2020 - "How $2 Million Dollars Helped Build CROSS with Dr. Carlos Maltzahn", Carlos Maltzahn, Sustain Podcast
- 29 May 2020 - "Future Blueprint Topics: Year 3 Plans, 4 and 5 Year Thoughts", Mark Neubauer, IRIS-HEP Team Retreat
- 29 May 2020 - "DOMA: Year 3 Plans, Year 4 & 5 Goals", Brian Bockelman, IRIS-HEP Team Retreat
- 28 May 2020 - "iDDS integration", Wen Guan, ATLAS ADC WFMS meeting
- 28 May 2020 - "Idea for a Production Data Challenge", Frank Wuerthwein, IRIS-HEP Team Retreat
- 26 May 2020 - "Skyhook Data Management: programmable object storage for databases", Jeff LeFevre, Fujitsu Labs
- 26 May 2020 - "Parallel Sessions: Plans and Goals", Gordon Watts, IRIS-HEP Team Retreat
- 26 May 2020 - "DOMA - Years 3, and 4+5", Brian Bockelman, IRIS-HEP Team Retreat
- 15 May 2020 - "Industry-supported seeding of developer communities around university research prototypes", Carlos Maltzahn, OpenDP Community Meeting
- 5 May 2020 - "Idea for a production focused Data Challenge", Frank Wuerthwein, DOMA Access Meeting WLCG
- 30 Apr 2020 - "GPU Cloud Bursting for Multi-Messenger Astrophysics with IceCube", Frank Wuerthwein, AWS Education: Research Seminar Series
- 23 Apr 2020 - "How CMS user jobs use the caches", Edgar Fajardo, XCache DevOps SPECIAL
- 23 Apr 2020 - "OSG's use of XCache", Derek Weitzel, XCache Meeting
- 22 Apr 2020 - "XRootD Validation Plan", Derek Weitzel, USCMS S&C Blueprint Meeting
- 23 Mar 2020 - "ServiceX: A distributed, caching, columnar data delivery service", Marc Weinberg, HSF DAWG -- DOMA Access joint meeting
- 17 Mar 2020 - "The ECoM2x Process", Frank Wuerthwein, Joint US ATLAS - US CMS Meeting on Facility R&D
- 11 Mar 2020 - "iDDS: A New Service with Intelligent Orchestration and Data Transformation and Delivery", Wen Guan, 3rd Rucio Community Workshop
- 9 Mar 2020 - "IRIS-HEP and DOMA related activities", Brian Bockelman, WLCG DOMA F2F @ FNAL
- 9 Mar 2020 - "TPC: Status, Plans, Contribution to the HL-LHC review document", Brian Bockelman, WLCG DOMA F2F @ FNAL
- 27 Feb 2020 - "intelligent Data Delivery Service (iDDS) (Poster)", Wen Guan, IRIS-HEP poster session
- 27 Feb 2020 - "ServiceX", Marc Weinberg, IRIS-HEP Poster Session
- 27 Feb 2020 - "XCache", Edgar Fajardo, IRIS-HEP Poster Session
- 27 Feb 2020 - "Modernizing the LHC’s transfer infrastructure", Edgar Fajardo, IRIS-HEP Poster Session
- 27 Feb 2020 - "Measurements of Data Access", Diego Davila, IRIS-HEP Poster Session
- 27 Feb 2020 - "SkyhookDM: Programmable Storage for Datasets", Carlos Maltzahn, IRIS-HEP Poster Session
- 27 Feb 2020 - "Modernizing the LHC’s transfer infrastructure", Brian Bockelman, IRIS-HEP Poster Session
- 24 Feb 2020 - "Scaling databases and file apis with programmable ceph object storage", Carlos Maltzahn, 2020 Linux Storage and Filesystems Conference (Vault’20, co-located with FAST’20 and NSDI’20)
- 18 Feb 2020 - "DOMA Update for February 2020 Steering Board", Brian Bockelman, IRIS-HEP Steering Board Meeting
- 27 Nov 2019 - "Benchmarking xrootd HTTP tests", Edgar Fajardo, WLCG DOMA General Meeting
- 19 Nov 2019 - "Burst data retrieval after 50k GPU Cloud run", Igor Sfiligoi, SC 19 Internet2 Booth
- 19 Nov 2019 - "Panel presentation on Enabling Data Services for HPC", Carlos Maltzahn, Enabling Data Services for HPC (BoF at SC19)
- 7 Nov 2019 - "Event Streaming Service for ATLAS Event Processing", Wen Guan, 24th International Conference on Computing in High Energy Physics (CHEP2019)
- 5 Nov 2019 - "Mapping datasets to object storage", Xiaowei (Aaron) Chu, CHEP 2019
- 5 Nov 2019 - "A distributed R&D storage platform implementing quality of service", Shawn McKee, CHEP2019
- 5 Nov 2019 - "Creating a content delivery network for general science on the backbone of the Internet using xcaches.", Igor Sfiligoi, CHEP 2019
- 5 Nov 2019 - "Mapping datasets to object storage", Jeff LeFevre, CHEP 2019
- 5 Nov 2019 - "Creating a content delivery network for general science on the backbone of the Internet using xcaches.", Edgar Fajardo, CHEP 2019
- 5 Nov 2019 - "Moving the California distributed CMS xcache from bare metal into containers using Kubernetes", Edgar Fajardo, CHEP 2019
- 5 Nov 2019 - "Third-party transfers in WLCG using HTTP", Brian Bockelman, 24th International Conference on Computing in High Energy & Nuclear Physics
- 4 Nov 2019 - "Characterizing network paths in and out of the Clouds", Igor Sfiligoi, CHEP 2019
- 4 Nov 2019 - "A Distributed, Caching, Columnar Data Delivery Service(X)", Ben Galewsky, CHEP 2019 Conference
- 24 Oct 2019 - "Education, research, and technology transfer in open source software: new possibilities for universities", Carlos Maltzahn, École Polytechnique Fédérale de Lausanne (EPFL)
- 21 Oct 2019 - "Education, research, and technology transfer in open source software: new possibilities for universities", Carlos Maltzahn, Friedrich-Alexander Universität, Erlangen-Nürnberg
- 19 Oct 2019 - "Center for Research in Open Source Software", Carlos Maltzahn, Google Summer of Code Mentor Summit
- 3 Oct 2019 - "Mapping datasets to object storage", Xiaowei (Aaron) Chu, CROSS Research Symposium 2019
- 3 Oct 2019 - "Skyhook Data Management: Scaling Databases and Applications with Open Source Extensible Storage", Jeff LeFevre, CROSS Research Symposium 2019
- 30 Sep 2019 - "IDDS", Wen Guan, HSF & ATLAS Joint Event Delivery Workshop
- 30 Sep 2019 - "HSF Event Delivery WG - Introduction", Brian Bockelman, HSF & ATLAS Joint Event Delivery Workshop
- 18 Sep 2019 - "Data Lakes, Data Caching for Science and the OSIRIS Distributed Storage System", Shawn McKee, The Global Research Platform Workshop
- 12 Sep 2019 - "DOMA: Preparing for Year 2 and Year 4", Brian Bockelman, IRIS-HEP Institute Retreat
- 3 Sep 2019 - "Data Organization, Management, and Access", Brian Bockelman, IRIS-HEP Steering Board Meeting
- 15 Aug 2019 - "Update on the Center for Research in Open Source Software", Carlos Maltzahn, Seminar at New Mexico Consortium, Los Alamos
- 31 Jul 2019 - "CMS XCache Monitoring Dashboard", Diego Davila, OSG Area Coordination
- 30 Jul 2019 - "StashCache: A Distributed Caching Federation for the Open Science Grid", Derek Weitzel, Practice and Experience in Advanced Research Computing
- 17 Jul 2019 - "Delivery of columnar data to analysis systems", Marc Weinberg, Annual US ATLAS Computing, Software and Physics Support Technical Meeting
- 8 Jul 2019 - "XCache Initiatives and Experiences", Frank Wuerthwein, pre-GDB meeting on XCache
- 5 Jul 2019 - "Reducing Disk Needs with a Data Lake Concept", Frank Wuerthwein, ECoM2x meeting CMS
- 24 Jun 2019 - "Delivery of columnar data to analysis systems", Marc Weinberg, ATLAS Software & Computing Week #62
- 20 Jun 2019 - "MBWU: Benefit Quantification for Data Access Function Offloading", Carlos Maltzahn, HPC I/O in the Data Center Workshop (HPC-IODC 2019)
- 19 Jun 2019 - "ServiceX", Ben Galewsky, Analysis Systems Topical Workshop
- 12 Jun 2019 - "XRootD and HTTP performance studies", Edgar Fajardo, XRootD Workshop at CC-IN2P3
- 11 Jun 2019 - "OSG Data Federation", Derek Weitzel, XRootD Workshop at CC-IN2P3
- 4 Jun 2019 - "Data Lake — A site perspective", Frank Wuerthwein, DOMA Access Meeting WLCG
- 4 Jun 2019 - "Proposal for Sharing of Data on Data Access", Diego Davila, DOMA / ACCESS Meeting
- 28 May 2019 - "WLCG DOMA TPC Working Group", Brian Bockelman, US CMS Tier-2 Facilities May 2019 Meeting
- 24 Apr 2019 - "Skyhook: Programmable Object Storage for Analysis", Jeff LeFevre, IRIS-HEP Topical Meetings
- 20 Mar 2019 - "Data Access in DOMA", Frank Wuerthwein, HOW2019 (Joint HSF/OSG/WLCG Workshop)
- 20 Mar 2019 - "Data Access at HPC: ATLAS/CMS Perspective", Frank Wuerthwein, HOW2019 (Joint HSF/OSG/WLCG Workshop)
- 20 Mar 2019 - "WLCG DOMA TPC Updates", Brian Bockelman, 2019 Joint HSF/OSG/WLCG Workshop (HOW2019)
- 13 Mar 2019 - "How to Leverage Research Universities", Carlos Maltzahn, Linux Foundation Open Source Leadership Summit (OSLS 2019)
- 6 Mar 2019 - "Scientific Data Lifecycle: Perspectives from an LHC Physicist", Shawn McKee, MAGIC Meeting
- 26 Feb 2019 - "Skyhook: programmable storage for databases", Jeff LeFevre, Vault'19
- 26 Feb 2019 - "Skyhook: programmable storage for databases", Carlos Maltzahn, Vault'19
- 6 Feb 2019 - "IRIS-HEP DOMA", Brian Bockelman, IRIS-HEP Steering Board Meeting
- 25 Jan 2019 - "Programmable Storage Systems: For I/O that doesn’t fit under the rug", Carlos Maltzahn, Seminar at Amazon AWS
- 14 Dec 2018 - "IN53A-04: Reproducible, Automated and Portable Computational and Data Science Experimentation Pipelines with Popper (with Ivo Jimenez)", Carlos Maltzahn, IN53A: Enabling Transparency and Reproducibility in Geoscience Through Practical Provenance and Cloud-Based Workflows I (AGU Fall Meeting)
- 11 Dec 2018 - "Programmable Storage Systems: For I/O that doesn’t fit under the rug", Carlos Maltzahn, Seminar at VMware
- 2 Oct 2018 - "Current production use of caching for CMS in Southern California", Edgar Fajardo, DOMA / ACCESS Meeting
- 16 Nov 2017 - "SkyhookDB - Leveraging object storage toward database elasticity in the cloud", Jeff LeFevre, DOMA Workshop 2017 (Flatiron Institute)
DOMA Publications
- Analysis Facilities White Paper, D. Ciangottini et al., arXiv 2404.02100 (02 Apr 2024).
- Physics analysis for the HL-LHC: Concepts and pipelines in practice with the Analysis Grand Challenge, A. Held, E. Kauffman, O. Shadura and A. Wightman, EPJ Web Conf. 295 06016 (2024) (05 Jan 2024).
- Machine Learning for Columnar High Energy Physics Analysis, E. Kauffman, A. Held and O. Shadura, EPJ Web Conf. 295 08011 (2024) (03 Jan 2024).
- Data Management Package for the novel data delivery system, ServiceX, and Applications to various physics analysis workflows, K. Choi and P. Onyisi, EPJ Web Conf. 295 06008 (2024) (01 Jan 2024).
- Coffea-Casa: Building composable analysis facilities for the HL-LHC, S. Albin, G. Attebury, K. Bloom, B. Bockelman, C. Lundstedt, O. Shadura and J. Thiltges, EPJ Web Conf. 295 07009 (2024) (30 Nov 2023).
- Analyzing Transatlantic Network Traffic over Scientific Data Caches, Ziyue Deng, Alex Sim, Kesheng Wu, Chin Guok, Inder Monga, Fabio Andrijauskas, Frank Wuerthwein, and Derek Weitzel, ACM 6th International Workshop on System and Network Telemetry and Analytics (SNTA'23) (20 Jun 2023).
- First performance measurements with the Analysis Grand Challenge, Oksana Shadura, Alexander Held, arXiv:2304.05214 [hep-ex] (Submitted to ACAT 2022) (12 Apr 2023).
- Effectiveness and predictability of in-network storage cache for Scientific Workflows, Caitlin Sim, Kesheng Wu, Alex Sim, Inder Monga, Chin Guok, Frank Würthwein, Diego Davila, Harvey Newman, and Justas Balcas. 2023 (22 Feb 2023).
- The IRIS-HEP Analysis Grand Challenge, A. Held and O. Shadura, Unknown (26 Nov 2022) [4 citations].
- Managed Network Services for Exascale Data Movement Across Large Global Scientific Collaborations, F. Wurthwein, J. Guiang, A. Arora, D. Davila, J. Graham, D. Mishin, T. Hutton, I. Sfiligoi, H. Newman, J. Balcas, T. Lehman, X. Yang, and C. Guok, 2022 4th Annual Workshop on Extreme-scale Experiment-in-the-Loop Computing (XLOOP), Los Alamitos, CA, USA, pp. 16–19, IEEE Computer Society, November 2022 (14 Nov 2022).
- Processing Particle Data Flows with SmartNICs, Jianshen Liu, Carlos Maltzahn, Matthew L. Curry, Craig Ulmer, 2022 IEEE High Performance Extreme Computing Conference (HPEC), Virtual, September 19-23, 2022. Outstanding Student Paper (19 Sep 2022).
- Snowmass 2021 Computational Frontier CompF4 Topical Group Report Storage and Processing Resource Access, W. Bhimji et al., Comput.Softw.Big Sci. 7 5 (2023) (19 Sep 2022) [2 citations].
- Mapping Out the HPC Dependency Chaos, Farid Zakaria, Thomas R. W. Scogland, Todd Gamblin, and Carlos Maltzahn, SC22, Dallas, TX, November 13-18, 2022 (27 Aug 2022).
- HSF IRIS-HEP Second Analysis Ecosystem Workshop Report, 10.5281/zenodo.7003962 (17 Aug 2022).
- Integrating End-to-End Exascale SDN into the LHC Data Distribution Cyberinfrastructure, Jonathan Guiang, Aashay Arora, Diego Davila, John Graham, Dima Mishin, Igor Sfiligoi, Frank Wuerthwein, Tom Lehman, Xi Yang, Chin Guok, Harvey Newman, Justas Balcas, and Thomas Hutton, Practice and Experience in Advanced Research Computing (PEARC '22), Association for Computing Machinery, New York, NY, USA, Article 53, 1–4. https://doi.org/10.1145/3491418.3535134 (08 Jul 2022).
- Expanding the Scope of Artifact Evaluation at HPC Conferences: Experience of SC21, Tanu Malik, Anjo Vahldiek-Oberwagner, Ivo Jimenez, Carlos Maltzahn, 5th International Workshop on Practical Reproducible Evaluation of Computer Systems (P-RECS), Virtual, June 30, 2022 (30 Jun 2022).
- Skyhook: Towards an Arrow-Native Storage System, Chakraborty, J., Jimenez, I., Rodriguez, S.A., Uta, A., LeFevre, J. and Maltzahn, C., The 22nd IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid'22), Taormina (Messina), Italy, May 16-19, 2022 (16 May 2022).
- Access Trends of In-network Cache for Scientific Data, Ruize Han, Alex Sim, Kesheng Wu, Inder Monga, Chin Guok, Frank Würthwein, Diego Davila, Justas Balcas, Harvey Newman. 2022 (11 May 2022).
- Studying Scientific Data Lifecycle in On-demand Distributed Storage Caches, Julian Bellavita, Alex Sim, Kesheng Wu, Inder Monga, Chin Guok, Frank Würthwein, Diego Davila. 2022 (11 May 2022).
- Collaborative Computing Support for Analysis Facilities Exploiting Software as Infrastructure Techniques, M. Flechas, G. Attebury, K. Bloom, B. Bockelman, L. Gray, B. Holzman, C. Lundstedt, O. Shadura, N. Smith and J. Thiltges, arXiv 2203.10161 (18 Mar 2022) [2 citations].
- Analysis Facilities for HL-LHC, D. Benjamin et al., arXiv 2203.08010 (15 Mar 2022) [3 citations].
- Zero-Cost, Arrow-Enabled Data Interface for Apache Spark, Rodriguez, S.A., Chakraborty, J., Chu, A., Jimenez, I., LeFevre, J., Maltzahn, C. and Uta, A., 2021 IEEE International Conference on Big Data (IEEE BigData 2021), Virtual, December 15-18, 2021 (27 Nov 2021).
- Zero-Cost, Arrow-Enabled Data Interface for Apache Spark, Rodriguez, S.A., Chakraborty, J., Chu, A., Jimenez, I., LeFevre, J., Maltzahn, C. and Uta, A., arXiv preprint arXiv:2106.13020 (27 Nov 2021).
- SkyhookDM is now a part of Apache Arrow!, (25 Oct 2021).
- Towards Real-World Applications of ServiceX, an Analysis Data Transformation System, K. Choi, A. Eckart, B. Galewsky, R. Gardner, M. Neubauer, P. Onyisi, M. Proffitt, I. Vukotic and G. Watts (23 Aug 2021).
- Analyzing scientific data sharing patterns for in-network data caching, Elizabeth Copps, Huiyi Zhang, Alex Sim, Kesheng Wu, Inder Monga, Chin Guok, Frank Würthwein, Diego Davila, and Edgar Fajardo. 2021 (21 Jun 2021).
- Towards an Arrow-native Storage System, Chakraborty, J., Jimenez, I., Rodriguez, S.A., Uta, A., LeFevre, J. and Maltzahn, C., arXiv preprint arXiv:2105.09894 (21 May 2021).
- Systematic benchmarking of HTTPS third party copy on 100Gbps links using XRootD, Fajardo, Edgar, Aashay Arora, Diego Davila, Richard Gao, Frank Würthwein, and Brian Bockelman, arXiv:2103.12116 (2021). (Submitted to CHEP 2019) (22 Mar 2021).
- Coffea-casa: an analysis facility prototype, M. Adamec, G. Attebury, K. Bloom, B. Bockelman, C. Lundstedt, O. Shadura and J. Thiltges, EPJ Web Conf. 251 02061 (2021) (02 Mar 2021) [13 citations] [NSF PAR].
- An intelligent Data Delivery Service for and beyond the ATLAS experiment, W. Guan, T. Maeno, B. Bockelman, T. Wenaus, F. Lin, S. Padolski, R. Zhang and A. Alekseev, EPJ Web Conf. 251 02007 (2021) (28 Feb 2021) [8 citations] [NSF PAR].
- ServiceX: A Distributed, Caching, Columnar Data Delivery Service, B. Galewsky, R. Gardner, L. Gray, M. Neubauer, J. Pivarski, M. Proffitt, I. Vukotic, G. Watts and M. Weinberg, EPJ Web Conf. 245 04043 (2020), DOI: 10.1051/epjconf/202024504043 (16 Nov 2020).
- Enabling Seamless Execution of Computational and Data Science Workflows on HPC and Cloud with the Popper Container-Native Automation Engine, Chakraborty J, Maltzahn C, and Jimenez I. 2020 2nd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC (CANOPIE-HPC, co-located with SC'20) (12 Nov 2020).
- Reproducible, Scalable Benchmarks for SkyhookDM using Popper, Chakraborty J. IRIS-HEP Summer 2020 Fellowship Report (12 Oct 2020).
- Scale-out Edge Storage Systems with Embedded Storage Nodes to Get Better Availability and Cost-Efficiency At the Same Time, Jianshen Liu, Matthew Leon Curry, Carlos Maltzahn, and Philip Kufeldt, 3rd USENIX Workshop on Hot Topics in Edge Computing (HotEdge ’20), Santa Clara, CA, June 25-26 2020 (26 May 2020).
- SkyhookDM: Data Processing in Ceph with Programmable Storage, Jeff LeFevre and Carlos Maltzahn, USENIX ;login: Magazine (12 May 2020).
- Towards an Intelligent Data Delivery Service, Wen Guan, Tadashi Maeno, Gancho Dimitrov, Brian Paul Bockelman, Torre Wenaus, Vakhtang Tsulaia, Nicolo Magini, CHEP2019 (14 Mar 2020).
- Is big data performance reproducible in modern cloud networks?, Alexandru Uta, Alexandru Custura, Dmitry Duplyakin, Ivo Jimenez, Jan Rellermeyer, Carlos Maltzahn, Robert Ricci, and Alexandru Iosup, 17th USENIX Symposium on Networked Systems Design and Implementation (NSDI ’20), Santa Clara, CA, February 25-27 2020 (26 Feb 2020).
- Scaling databases and file APIs with programmable Ceph object storage, Jeff LeFevre and Carlos Maltzahn, 2020 Linux Storage and Filesystems Conference (Vault'20, co-located with FAST'20 and NSDI'20), Santa Clara, CA, February 24-25 2020 (24 Feb 2020).
- Popper 2.0: A Container-native Workflow Execution Engine For Testing Complex Applications and Validating Scientific Claims, Jayjeet Chakraborty, Ivo Jimenez, Carlos Maltzahn, Arshul Mansoori, and Quincy Wofford, Poster at 2020 Exascale Computing Project Annual Meeting, Houston, TX, February 3-7, 2020 (03 Feb 2020).
- Towards Physical Design Management in Storage Systems, Kathryn Dahlgren, Jeff LeFevre, Ashay Shirwadkar, Ken Iizawa, Aldrin Montana, Peter Alvaro, Carlos Maltzahn, 4th International Parallel Data Systems Workshop (PDSW 2019, co-located with SC’19), Denver, CO, November 18, 2019. (18 Nov 2019) [NSF PAR].
- SkyhookDM: Mapping Scientific Datasets to Programmable Storage, Aaron Chu and Jeff LeFevre and Carlos Maltzahn and Aldrin Montana and Peter Alvaro and Dana Robinson and Quincey Koziol, arXiv:2007.01789 [cs.DS] (Submitted to CHEP 2019) (08 Nov 2019).
- Third-party transfers in WLCG using HTTP, Brian Bockelman and Andrea Ceccanti and Fabrizio Furano and Paul Millar and Dmitry Litvintsev and Alessandra Forti, arXiv:2007.03490 [cs.DC] (Submitted to CHEP 2019) (08 Nov 2019).
- Reproducible Computer Network Experiments: A Case Study Using Popper, Andrea David, Mariette Souppe, Ivo Jimenez, Katia Obraczka, Sam Mansfield, Kerry Veenstra, Carlos Maltzahn, 2nd International Workshop on Practical Reproducible Evaluation of Computer Systems (P-RECS, co-located with HPDC’19), Phoenix, AZ, June 24, 2019. (24 Jun 2019).
- MBWU: Benefit Quantification for Data Access Function Offloading, Jianshen Liu, Philip Kufeldt, Carlos Maltzahn, HPC I/O in the Data Center Workshop (HPC-IODC 2019, co-located with ISC-HPC 2019), Frankfurt, Germany, June 20, 2019. (20 Jun 2019).
- Skyhook: Programmable storage for databases, Jeff LeFevre, Noah Watkins, Michael Sevilla, and Carlos Maltzahn, 2019 Linux Storage and Filesystems Conference (Vault'19, co-located with FAST'19), Santa Clara, CA, February 25-26 2019 (25 Feb 2019).
- Spotting Black Swans With Ease: The Case for a Practical Reproducibility Platform, Ivo Jimenez, Carlos Maltzahn, 1st Workshop on Reproducible, Customizable and Portable Workflows for HPC (ResCuE-HPC’18, co-located with SC’18), Dallas, TX, November 11, 2018. (11 Nov 2018).
- Taming performance variability, Aleksander Maricq, Dmitry Duplyakin, Ivo Jimenez, Carlos Maltzahn, Ryan Stutsman, and Robert Ricci, 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI’18), Carlsbad, CA, October 8-10, 2018. (08 Oct 2018).
Join us
We collaborate with groups around the world on code, data, and more. See our project pages for more.