Project description
The OSG and WLCG collaborations have been operating a global Research & Education (R&E) network monitoring infrastructure for more than 10 years. Currently, data from over 230 perfSONAR instances are gathered centrally into an Elasticsearch cluster at the University of Chicago. This rich dataset should be able to provide visibility into our R&E networks and identify various types of network problems as they occur.
pSDash is designed to track, visualize, and filter network issues based on the perfSONAR and related network monitoring data gathered by OSG and WLCG services. The project is complementary to the Alarms&Alerts service (to which anyone can subscribe; see below) and currently covers the following problems:
- Throughput issues
- High packet loss
- Firewall issues
- Bad clock configurations
- Traceroute issues
- Divergence from the usual path (based on the AS numbers)
- And more
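As an illustration of the last named check, divergence from the usual path based on AS numbers, the following is a minimal sketch and not the actual pSDash implementation. It assumes that each traceroute has already been reduced to a list of AS numbers, that the "usual" path is simply the one seen most often in recent history, and that a divergence is any AS number absent from that path. The function names and the sample AS numbers (taken from the range reserved for documentation) are hypothetical.

```python
from collections import Counter


def usual_as_path(history):
    """Return the most frequently observed AS path in the history.

    Assumption: "usual" means "most common"; pSDash may define it differently.
    """
    counts = Counter(tuple(path) for path in history)
    return counts.most_common(1)[0][0]


def diverging_asns(usual, observed):
    """Return AS numbers on the observed path that are absent from the usual one."""
    usual_set = set(usual)
    return [asn for asn in observed if asn not in usual_set]


# Hypothetical history of AS paths between one source and one destination.
history = [
    [64500, 64501, 64502],
    [64500, 64501, 64502],
    [64500, 64510, 64502],
]

baseline = usual_as_path(history)
new_asns = diverging_asns(baseline, [64500, 64511, 64502])
print(baseline, new_asns)
```

A non-empty result from `diverging_asns` would be the signal to flag the path for a closer look. A real check would likely also need to handle missing hops and legitimate load-balanced alternatives.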
You can access the application at https://ps-dash.uc.ssl-hep.org/ and subscribe to receive emails about the most recent alarms at https://aaas.atlas-ml.org/ (make sure you add a tag if you are interested in specific sites).
- pSDash code
- Alarms&Alerts Service code
Contact us
To report an issue or send comments and suggestions, open a ticket with OSG support staff: email help@osg-htc.org and request support for networking/pSDash.
Team
- Marian Babik
- Shawn McKee
- Petya Vasileva
- Ilija Vukotic
Presentations
- 30 Jun 2023 - "Network problem detection and localisation using perfSonar measurements", Petya Vasileva, SoX Monthly Workshop
- 10 Mar 2023 - "Tools and methods for tracking network issues based on perfSONAR datasets", Petya Vasileva, CI Engineering Lunch & Learn Series