About the coffea-casa project

coffea-casa is a prototype analysis facility that provides services for “low latency columnar analysis”, enabling rapid, column-wise processing of data. These services, based on Dask and Jupyter notebooks, aim to dramatically reduce the time needed to complete an analysis and to provide an easily scalable, user-friendly computational environment that will simplify, facilitate, and accelerate the delivery of HEP results. The facility is built on top of a Kubernetes cluster and integrates dedicated resources with resources allocated via fairshare through the local HTCondor system. In addition to user-facing interfaces such as Dask, the facility also manages access control through single sign-on and handles authentication and authorization for data access.
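To make the idea of column-wise processing concrete, the following is a minimal sketch using only the Python standard library. It is not the coffea-casa or Dask API: the column names, the selection cuts, and the helper functions `select_chunk` and `run_columnar` are hypothetical, and a thread pool stands in for the Dask workers that the facility would actually schedule.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: one list per column, rather than one record per event.
events = {
    "muon_pt": [12.0, 35.5, 8.1, 50.2, 27.3, 41.0],
    "muon_eta": [0.5, -1.2, 2.8, 0.1, -2.1, 1.9],
}


def select_chunk(pt_chunk, eta_chunk, pt_min=25.0, eta_max=2.4):
    """Apply one selection to whole column slices and return the passing pt values."""
    return [
        pt
        for pt, eta in zip(pt_chunk, eta_chunk)
        if pt > pt_min and abs(eta) < eta_max
    ]


def run_columnar(columns, chunk_size=2, workers=2):
    """Split the columns into chunks, process the chunks in parallel, merge the results."""
    n = len(columns["muon_pt"])
    bounds = [(start, min(start + chunk_size, n)) for start in range(0, n, chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # The thread pool is an assumption standing in for a Dask cluster.
        parts = pool.map(
            lambda b: select_chunk(
                columns["muon_pt"][b[0]:b[1]],
                columns["muon_eta"][b[0]:b[1]],
            ),
            bounds,
        )
        # pool.map preserves chunk order, so the merged output keeps event order.
        return [pt for part in parts for pt in part]


selected = run_columnar(events)
```

The point of the sketch is the shape of the computation: each operation acts on slices of whole columns, so the work splits naturally into independent chunks that a scheduler can distribute across workers.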

Generic design schema of the coffea-casa analysis facility


Documentation: Documentation Status

Contact us: GitHub Discussion

Docker containers: Docker Pulls for coffea-casa Docker Pulls for coffea-casa (worker image)

More information can be found in the corresponding repository:

Recent accomplishments and plans

Recent accomplishments:

  • Deployed at the University of Nebraska-Lincoln Tier-3, the coffea-casa facility is ready to accommodate its first CMS users: try it!

Coffea-casa JupyterLab interface with a cluster powered by the Dask labextension

Future plans for 2021:

  • Release Helm charts and other by-products so that coffea-casa can be deployed at other facilities
  • Deploy coffea-casa functionality on at least one external facility
  • Involve more physics analysis groups in using the facility

Recent videos and tutorials

  • The coffea-casa introductory YouTube video at PyHEP 2020
  • The coffea-casa YouTube video tutorial at PyHEP 2020

Fellows

Team

Presentations

Publications