The SuperSONIC project implements server infrastructure for inference-as-a-service applications in large high energy physics (HEP) and multi-messenger astrophysics (MMA) experiments. The server infrastructure is designed for deployment on Kubernetes clusters equipped with GPUs.
The main components of SuperSONIC are:
- NVIDIA Triton inference servers
- Dynamic multi-purpose Envoy Proxy:
  - Load balancing
  - Rate limiting
  - GPU saturation prevention
  - Token-based authentication
- (optional) Load-based autoscaling via KEDA
- (optional) Prometheus instance (deploy custom or connect to existing)
- (optional) Pre-configured Grafana dashboard
- (optional) OpenTelemetry Collector and Grafana Tempo for advanced monitoring
Installation is done via a custom Helm plugin, which takes care of the internal connectivity between the chart components. Standard Helm installation is also supported, but requires considerably more manual configuration.

    helm plugin install https://github.com/fastmachinelearning/SuperSONIC/
    helm install-supersonic <release-name> -n <namespace> -f <your-values.yaml>

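As a rough sketch, the install steps can be followed by a quick check that the release exists and its pods are starting. The release name `supersonic-demo`, the namespace `sonic`, and the `run` helper are hypothetical and used only for illustration. `helm status` and `kubectl get pods` are standard Helm and kubectl commands rather than part of the SuperSONIC plugin. By default the script only prints the commands; set `DRY_RUN=0` to execute them against a cluster.

```shell
#!/bin/sh
# Hypothetical names; replace with your own release, namespace, and values file.
RELEASE="supersonic-demo"
NAMESPACE="sonic"
VALUES="values.yaml"

# Print commands by default; set DRY_RUN=0 to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Illustrative helper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Install the plugin, then the chart (commands taken from the text above).
run helm plugin install https://github.com/fastmachinelearning/SuperSONIC/
run helm install-supersonic "$RELEASE" -n "$NAMESPACE" -f "$VALUES"

# Standard follow-up checks: release status and pod list in the namespace.
run helm status "$RELEASE" -n "$NAMESPACE"
run kubectl get pods -n "$NAMESPACE"
```

Which pods appear depends on the optional components enabled in your values file.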
Installer plugin usage:

    Usage:
      helm install-supersonic [RELEASE_NAME] [flags]

    Flags:
      -h, --help        Show this help message
      -f, --values      Specify values file for custom configuration
      -n, --namespace   Specify Kubernetes namespace for deployment
      --version         Specify chart version (default: latest version)
                        Note: Ignored if --local flag is set
      --local           Install from local chart path instead of remote repository
      --path            Local chart path (default: ./helm/supersonic)
                        Only used when --local flag is set

    Additional flags will be passed directly to the 'helm install' command

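To illustrate how the `--local` and `--path` flags combine, here is a minimal sketch that prints the install command for either a remote or a local chart. `install_cmd` is a hypothetical helper, not something the plugin provides, and the release name, namespace, and values file are placeholders. Only flags listed in the usage text above are used.

```shell
#!/bin/sh
# Hypothetical helper: print the install command for a remote or local chart.
# Arguments: release name, namespace, values file, optional local chart path.
install_cmd() {
  release="$1"
  namespace="$2"
  values="$3"
  chart_path="${4:-}"   # empty means "install from the remote repository"

  cmd="helm install-supersonic $release -n $namespace -f $values"
  if [ -n "$chart_path" ]; then
    # --path is only used together with --local, per the usage text.
    cmd="$cmd --local --path $chart_path"
  fi
  printf '%s\n' "$cmd"
}

# Remote chart (latest version by default).
install_cmd supersonic-demo sonic values.yaml

# Local chart checked out at the default path from the usage text.
install_cmd supersonic-demo sonic values.yaml ./helm/supersonic
```

Since `--version` is ignored when `--local` is set, the sketch leaves it out of the local variant.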
To construct the `values.yaml` file for your application, follow the Configuration guide.
The full list of configuration parameters is available in the Configuration reference.
| | CMS | ATLAS | IceCube |
|---|---|---|---|
| Purdue Geddes | ✅ | - | - |
| Purdue Anvil | ✅ | - | - |
| NRP Nautilus | ✅ | ✅ | ✅ |
| UChicago | - | ✅ | - |