Adding an alert mechanism into lvm-operator #103
Is it possible to just create a static YAML file for this?
Please update the commit message to match the commitlint requirements.
After discussing this with @aruniiird, I understand that we can create or update the YAML files with the jsonnet tool by changing only the values in monitoring/config.libsonnet, instead of editing the YAML file directly, whenever a change is needed in the future. We can also use the tool to generate different YAML files for different use cases, e.g. Managed Services, snow, etc.
So the question for @nbalacha is: do we really want that flexibility? If not, we can create a static file as you suggested.
If we stick with the current format, we need to make sure that:
- The Make target also downloads the jsonnet tool if it is not already present on the machine.
- We use kustomize to combine metadata from one file with spec from another, since this tool generates only the spec and no metadata.
- We also upload the generated YAML file.
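To make the trade-off above concrete, here is a minimal sketch of how a mixin can derive alert rules from a single config object. It is not the PR's actual code: the `config` local stands in for monitoring/config.libsonnet, and the helper `vgAlert`, the group name, the alert names, and the metric `example_vg_used_percent` are hypothetical names chosen for illustration. Only the two threshold keys are taken from the review suggestions in this thread.

```jsonnet
// Hypothetical stand-in for monitoring/config.libsonnet: thresholds live in one place.
local config = {
  vgUsageThresholdNearFull: 75,
  vgUsageThresholdCritical: 85,
};

// Build one alert rule from an alert name and the key of its threshold.
local vgAlert(name, thresholdKey) = {
  alert: name,
  // 'example_vg_used_percent' is a placeholder, not the operator's real metric.
  expr: 'example_vg_used_percent > %d' % config[thresholdKey],
};

// The file evaluates to the rule groups only (the "spec" part);
// resource metadata would have to be added by another step.
{
  groups: [
    {
      name: 'example-vg-alerts',
      rules: [
        vgAlert('ExampleVGNearFull', 'vgUsageThresholdNearFull'),
        vgAlert('ExampleVGCritical', 'vgUsageThresholdCritical'),
      ],
    },
  ],
}
```

Evaluating a file like this with the jsonnet tool prints the rule groups as JSON. Changing a threshold then means editing one number in the config object rather than every rule that uses it.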
monitoring/alerts/vgalerts.libsonnet (outdated)
  },
  annotations: {
    description: 'VolumeGroup is nearing full. Data deletion or VolumeGroup expansion is required.',
    message: 'VolumeGroup {{ $labels.device_class }} utilization has crossed 75% percent on node {{ $labels.node }}. Free up some space or expand the VolumeGroup.',
Suggested change:
- message: 'VolumeGroup {{ $labels.device_class }} utilization has crossed 75% percent on node {{ $labels.node }}. Free up some space or expand the VolumeGroup.',
+ message: 'VolumeGroup {{ $labels.device_class }} utilization has crossed %(vgUsageThresholdNearFull)0.2f percent on node {{ $labels.node }}. Free up some space or expand the VolumeGroup.',
Done
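The suggestion above relies on Jsonnet's `%` string-formatting operator, which follows Python-style format codes. A minimal sketch, assuming the threshold is stored as the number 75 (the `config` local is a hypothetical stand-in for the values in monitoring/config.libsonnet):

```jsonnet
// Hypothetical config object; the real values live in monitoring/config.libsonnet.
local config = { vgUsageThresholdNearFull: 75 };

{
  // %(key)0.2f looks up `key` in the object on the right-hand side of `%`
  // and prints it as a float with two decimal places.
  message: 'VolumeGroup {{ $labels.device_class }} utilization has crossed %(vgUsageThresholdNearFull)0.2f percent on node {{ $labels.node }}.' % config,
}
```

With these values the rendered message should contain "has crossed 75.00 percent". The Prometheus `{{ ... }}` placeholders pass through unchanged because they contain no `%`, so the message stays in sync with the configured threshold.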
monitoring/alerts/vgalerts.libsonnet (outdated)
  },
  annotations: {
    description: 'VolumeGroup is critically full. Data deletion or VolumeGroup expansion is required.',
    message: 'VolumeGroup {{ $labels.device_class }} utilization has crossed 85% percent on node {{ $labels.node }}. Free up some space or expand the VolumeGroup immediately.',
Suggested change:
- message: 'VolumeGroup {{ $labels.device_class }} utilization has crossed 85% percent on node {{ $labels.node }}. Free up some space or expand the VolumeGroup immediately.',
+ message: 'VolumeGroup {{ $labels.device_class }} utilization has crossed %(vgUsageThresholdCritical)0.2f percent on node {{ $labels.node }}. Free up some space or expand the VolumeGroup immediately.',
Done
monitoring/alerts/vgalerts.libsonnet (outdated)
  annotations: {
    description: 'VolumeGroup is critically full. Data deletion or VolumeGroup expansion is required.',
    message: 'VolumeGroup {{ $labels.device_class }} utilization has crossed 85% percent on node {{ $labels.node }}. Free up some space or expand the VolumeGroup immediately.',
    severity_level: 'error',
    severity_level: 'error',
Done
monitoring/alerts/vgalerts.libsonnet (outdated)
  annotations: {
    description: 'VolumeGroup is nearing full. Data deletion or VolumeGroup expansion is required.',
    message: 'VolumeGroup {{ $labels.device_class }} utilization has crossed 75% percent on node {{ $labels.node }}. Free up some space or expand the VolumeGroup.',
    severity_level: 'warning',
    severity_level: 'warning',
Done
The complexity of generating alerts with a mixin seems greater than the complexity of maintaining a single YAML file. So for now I'd suggest using a single static YAML file.
Once that becomes fairly complex to manage, you can move to mixins or another solution.
/cherry-pick release-4.10
@agarwal-mudit: once the present PR merges, I will cherry-pick it on top of release-4.10 in a new PR and assign it to you.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Adding an alert mixins framework into LVM Operator.
Makefile is modified to generate the prometheus alert yaml file.
Added 'VolumeGroupUsageAtThreshold' alert which will be triggered if
volumegroup usage goes beyond the threshold percentage (default set to 75%)
Signed-off-by: Arun Kumar Mohan <[email protected]>
Signed-off-by: Nitin Goyal <[email protected]>
Signed-off-by: Nitin Goyal <[email protected]>
Signed-off-by: Arun Kumar Mohan <[email protected]>
We did discuss whether we should just use a static YAML file, and decided to put in the effort now to make things easier in the future.
[APPROVALNOTIFIER] This PR is APPROVED.
This pull request has been approved by: aruniiird, nbalacha, umangachapagain.
@agarwal-mudit: new pull request created: #121