-
Notifications
You must be signed in to change notification settings - Fork 1.5k
feat(vault): Fix monitoring when vault is behind load balancer, add more assertive has_leader metric #18123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
…ore assertive has_leader metric
| has_leader_tags = list(self._tags) | ||
| has_leader_tags.extend(dynamic_tags) | ||
| has_leader_tags.append(f'current_leader:{current_leader}') | ||
| submission_queue.append(lambda tags: self.gauge('has_leader', int(has_leader), tags=has_leader_tags)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| submission_queue.append(lambda tags: self.gauge('has_leader', int(has_leader), tags=has_leader_tags)) | |
| submission_queue.append(lambda tags: self.gauge('vault.has_leader', int(has_leader), tags=has_leader_tags)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@steveny91 it is working as expected the way it is in the PR, if you see line 89 on the same file you'll notice that the existing is_leader gauge doesn't need the vault. prefix either and it's being submitted the same way :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh sorry it wasn't this one. But the one in vault.py that needed the namespace.
|
@rsafonseca Thanks for the contribution. I left a small correction. I'll test the PR a bit later today as the tests are failing. |
| has_leader_tags = list(self._tags) | ||
| has_leader_tags.extend(dynamic_tags) | ||
| has_leader_tags.append('current_leader:{}'.format(current_leader)) | ||
| submission_queue.append(lambda tags: self.gauge('has_leader', int(has_leader), tags=has_leader_tags)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| submission_queue.append(lambda tags: self.gauge('has_leader', int(has_leader), tags=has_leader_tags)) | |
| submission_queue.append(lambda tags: self.gauge('vault.has_leader', int(has_leader), tags=has_leader_tags)) |
Following line 153
|
@rsafonseca Thanks again for the contribution. I think barring that small edit, this can be merged and shipped with the next agent. I created a PR to update the tests that can be merged in after this PR gets merged: I'm no expert on vault myself so I was hoping you can provide some additional clarity on the points you made:
Are non leader node(s) always on standby? If not, what is the behavior of nodes that aren't the leader but are active? I found this issue: hashicorp/vault#24401, that somewhat mentions this redirect, but it goes on about standby so hoping to get a better answer here. |
What does this PR do?
Motivation
Additional Notes
N/A
Review checklist (to be filled by reviewers)
qa/skip-qalabel if the PR doesn't need to be tested during QA.backport/<branch-name>label to the PR and it will automatically open a backport PR once this one is merged