Analysis Grand Challenge
The most up-to-date documentation about the Analysis Grand Challenge is located at this website: https://agc.readthedocs.io/en/latest/
The large increase in data volume at the HL-LHC requires rethinking how physicists interact with the data when developing and performing analyses. In addition to raw throughput, it is critical that analysis systems are flexible, easy to use, and have low latency to facilitate the design stages. The Analysis Grand Challenge was designed to span the scope of the Analysis Systems focus area, traverse a vertical slice through the tools that focus area is developing, and increase intra-Institute connections with DOMA (Data Organization, Management and Access) and SSL (Scalable Systems Laboratory). The goal is to demonstrate that the analysis system can not only cope with the increased data volume, but can also deliver enhanced functionality compared to the analysis systems used at the LHC today. The challenge is formulated as a user story with assumptions and acceptance criteria.
The Analysis Grand Challenge includes both the integration of software components for analyzing the data and the deployment of the analysis software at analysis facilities. The vertical slice implements the functionality needed for a prototypical, moderately complex analysis use case with multiple event selection requirements, observables to be histogrammed, and systematic uncertainties that must be taken into account. The image below gives an overview of the software tools that must be integrated for this vertical slice.
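To make the three ingredients of the vertical slice concrete (event selection, a histogrammed observable, and a systematic variation), here is a minimal toy sketch in plain Python. It is not the AGC implementation and does not use the AGC software stack. The event format, the function names (`select`, `ht`, `scale_jets`, `make_templates`), the cut values, and the 3% jet energy scale shift are all assumptions chosen for illustration.

```python
import random

def select(events, jet_pt_cut=30.0, min_jets=4):
    """Keep events with at least `min_jets` jets above `jet_pt_cut` (GeV)."""
    return [ev for ev in events
            if sum(pt > jet_pt_cut for pt in ev["jet_pt"]) >= min_jets]

def ht(event):
    """Toy observable: scalar sum of jet transverse momenta."""
    return sum(event["jet_pt"])

def histogram(values, edges):
    """Count values per bin [edges[i], edges[i+1]); out-of-range values are dropped."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if edges[i] <= v < edges[i + 1]:
                counts[i] += 1
                break
    return counts

def scale_jets(events, factor):
    """Systematic variation: rescale every jet pT by `factor`."""
    return [{"jet_pt": [pt * factor for pt in ev["jet_pt"]]} for ev in events]

def make_templates(events, edges, variations):
    """Run selection and histogramming once per systematic variation."""
    templates = {}
    for name, factor in variations.items():
        selected = select(scale_jets(events, factor))
        templates[name] = histogram([ht(ev) for ev in selected], edges)
    return templates

# Hypothetical toy dataset: 1000 events with 2 to 8 jets each.
rng = random.Random(0)
events = [{"jet_pt": [rng.expovariate(1 / 40.0) for _ in range(rng.randint(2, 8))]}
          for _ in range(1000)]
edges = [0.0, 100.0, 200.0, 300.0, 400.0, float("inf")]  # last bin holds overflow
variations = {"nominal": 1.0, "jes_up": 1.03, "jes_down": 0.97}  # assumed 3% shift

templates = make_templates(events, edges, variations)
for name, counts in templates.items():
    print(name, counts)
```

The per-variation histograms ("templates") are the kind of input a statistical model would consume. A realistic analysis would use columnar tools and distributed execution rather than Python loops over events.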
In addition, the challenge incorporates enhanced functionality relative to the analysis systems used at the LHC today:
- End-to-end analysis optimization including systematics on a realistically sized HL-LHC (∼200 TB) end-user analysis dataset
- Analysis Preservation & Reinterpretation: The ability to preserve the optimized analysis (in git repositories, docker images, workflow components, etc.), reproduce results, and reinterpret the analysis with a new signal hypothesis.
The inclusion of differentiable programming, a relatively new concept in HEP, into the challenge carries some risks. We note, however, that it has the potential to move the field forward in several important ways:
- Intellectual Leadership: It is a modern paradigm growing and abstracting from the success of deep learning, and a more natural fit to HEP than replacing everything with machine learning.
- Increased Functionality: We will have more sensitive analyses. Differentiable analysis systems would accelerate and improve essentially all fitting / tuning / optimization tasks. Differentiable programming also facilitates propagation of uncertainty in a more powerful way and paves the way to hybrid systems that fuse traditional approaches and machine learning more seamlessly.
- Connection with Industry: This has been an effective conduit to connections with Google (JAX and TensorFlow teams) and the PyTorch community.
- Foster Innovation: Differentiable programming opens up a new range of possibilities for performing analysis in physics at the HL-LHC.
- Training & Workforce development: Young people entering the job market with Machine Learning and Differentiable Programming skills will have a unique and valuable skill set. Differentiable programming will force physicists to take an end-to-end approach to problem solving, a skill that is already sought after both within and outside HEP.
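As an illustration of what "differentiable analysis" can mean in practice, the sketch below tunes a selection cut by gradient ascent. It is a toy under stated assumptions, not the AGC approach: the hard cut is replaced by a sigmoid so the expected yields become smooth functions of the cut, and derivatives are propagated with a small hand-written forward-mode automatic differentiation class (`Dual`) rather than a library such as JAX. The figure of merit s / sqrt(s + b), the toy Gaussian samples, and all names and parameter values are hypothetical.

```python
import math
import random

class Dual:
    """A value together with its derivative (forward-mode autodiff)."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / o.val ** 2)

def sigmoid(x):
    """Numerically stable logistic function with chain-rule derivative."""
    if x.val >= 0:
        s = 1.0 / (1.0 + math.exp(-x.val))
    else:
        e = math.exp(x.val)
        s = e / (1.0 + e)
    return Dual(s, s * (1.0 - s) * x.der)

def dual_sqrt(x):
    r = math.sqrt(x.val)
    return Dual(r, x.der / (2.0 * r))

def significance(cut, signal, background, width=5.0):
    """Smooth s / sqrt(s + b): each event passes with weight sigmoid((v - cut) / width)."""
    s = sum((sigmoid((Dual.lift(v) - cut) / width) for v in signal), Dual(0.0))
    b = sum((sigmoid((Dual.lift(v) - cut) / width) for v in background), Dual(0.0))
    return s / dual_sqrt(s + b)

def optimize_cut(signal, background, cut=50.0, lr=2.0, steps=100):
    """Gradient ascent on the cut; returns the best (cut, significance) seen."""
    best_cut, best_z = cut, -math.inf
    for _ in range(steps):
        z = significance(Dual(cut, 1.0), signal, background)  # seed d(cut)/d(cut) = 1
        if z.val > best_z:
            best_cut, best_z = cut, z.val
        cut += lr * z.der
    return best_cut, best_z

# Hypothetical toy samples of a single discriminating observable.
rng = random.Random(1)
signal = [rng.gauss(100.0, 15.0) for _ in range(200)]
background = [rng.gauss(60.0, 20.0) for _ in range(1000)]

best_cut, best_z = optimize_cut(signal, background)
print(f"best cut = {best_cut:.1f}, smooth significance = {best_z:.2f}")
```

The same pattern extends, in principle, to many parameters at once, which is where automatic differentiation pays off over scanning each cut by hand. The sigmoid width controls the trade-off between fidelity to a hard cut and smoothness of the gradient.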
This challenge involves milestones and deliverables in DOMA, Analysis Systems, and SSL. Year 3 will include a Blueprint and other meetings focused on scoping of the target analysis, the needed capabilities of an analysis system, and roadmaps for how the components will interact. Year 4 will include initial benchmarking of analysis system components and integrations, and we aim to execute the Analysis Grand Challenge in Year 5.
AGC Presentations
- 24 Oct 2024 - "Tuning the CMS Coffea-casa facility for 200 Gbps Challenge", Oksana Shadura, Conference on Computing in High Energy and Nuclear Physics (CHEP 2024)
- 4 Sep 2024 - "AGC & IDAP / 200 Gbps", Oksana Shadura, IRIS-HEP Institute Retreat 2024
- 18 Jun 2024 - "Analysis Grand Challenge", Oksana Shadura, Analysis Facilities Workshop
- 11 Jun 2024 - "Coffea-casa and 200 Gbps challenge - experience with Kubernetes", Oksana Shadura, 2024 All-Hands Workshop of the U.S. CMS Software and Computing Operations Program
- 9 Jun 2024 - "The 200 Gbps Challenge at Nebraska", Oksana Shadura, US CMS Analysis Facility Meeting
- 15 Mar 2024 - "Introduction - IRIS-HEP Data Analysis Pipeline (IDAP)", Oksana Shadura, IRIS-HEP Data Analysis Pipeline (IDAP) meeting
- 5 Mar 2024 - "Analysis Grand Challenge (AGC)", Oksana Shadura, US CMS Analysis Facility Meeting
- 10 Jan 2024 - "AGC Deep Dive", Oksana Shadura, NSF / IRIS-HEP Meeting (January 2024)
- 12 Sep 2023 - "Current Plans for AGC", Oksana Shadura, IRIS-HEP Institute Retreat
- 11 Sep 2023 - "Focus Area - Analysis Grand Challenge", Oksana Shadura, IRIS-HEP Institute Retreat
- 27 Jul 2023 - "Analysis Grand Challenge & Coffea-Casa analysis facility as a test environment for packages and services", Oksana Shadura, PyHEP.dev 2023 - "Python in HEP" Developer's Workshop
- 24 Jul 2023 - "Analysis Grand Challenge Demo - Hands-on Demo Session", Oksana Shadura, Computational HEP Traineeship Summer School
- 11 Jul 2023 - "Analysis Grand Challenge", Oksana Shadura, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 15 Nov 2022 - "Analysis Grand Challenge updates", Oksana Shadura, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 25 Oct 2022 - "First performance measurements with the Analysis Grand Challenge", Oksana Shadura, ACAT 2022
- 12 Oct 2022 - "AGC - Perspective from focus areas and projects", Oksana Shadura, IRIS-HEP Institute Retreat
- 4 Oct 2022 - "Analysis Grand Challenge updates", Oksana Shadura, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 25 Apr 2022 - "IRIS-HEP Analysis Grand Challenge Tools Workshop", Oksana Shadura, IRIS-HEP AGC Tools 2022 Workshop
- 5 Apr 2022 - "Analysis Grand Challenge updates", Oksana Shadura, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 28 Jan 2022 - "Analysis Grand Challenge updates", Oksana Shadura, IRIS-HEP / Ops Program Analysis Grand Challenge Planning
- 16 Dec 2021 - "Analysis Grand Challenge updates", Oksana Shadura, IRIS-HEP Executive Board / Ops Program Grand Challenge Discussion
- 17 Nov 2021 - "Deep Dive - Analysis Grand Challenge", Oksana Shadura, NSF / IRIS-HEP Meeting (November 2021)
- 2 Nov 2021 - "Analysis Grand Challenge", Oksana Shadura, SwiftHep/ExcaliburHep workshop
AGC Publications
- Physics analysis for the HL-LHC: Concepts and pipelines in practice with the Analysis Grand Challenge, A. Held, E. Kauffman, O. Shadura and A. Wightman, EPJ Web Conf. 295 06016 (2024) (05 Jan 2024).
- Machine Learning for Columnar High Energy Physics Analysis, E. Kauffman, A. Held and O. Shadura, EPJ Web Conf. 295 08011 (2024) (03 Jan 2024).
- Coffea-Casa: Building composable analysis facilities for the HL-LHC, S. Albin, G. Attebury, K. Bloom, B. Bockelman, C. Lundstedt, O. Shadura and J. Thiltges, EPJ Web Conf. 295 07009 (2024) (30 Nov 2023).
- First performance measurements with the Analysis Grand Challenge, Oksana Shadura, Alexander Held, arXiv:2304.05214 [hep-ex] (Submitted to ACAT 2022) (12 Apr 2023).
- The IRIS-HEP Analysis Grand Challenge, A. Held and O. Shadura, Unknown (26 Nov 2022) [4 citations].
Join us
We collaborate with groups around the world on code, data, and more. See our project pages for more.