main exit #912

Merged
merged 3 commits into from
Oct 23, 2024

@baum (Collaborator) commented Oct 22, 2024

main exit

  1. do not catch System exit
  2. return exit code 0 from TERM signal handler
  3. tweak wait for gateways ha test

partial revert of
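
For readers unfamiliar with the pattern behind items 1 and 2, here is a minimal Python sketch. It is not the code from this PR: `sigterm_handler`, `run`, `main`, and the `serve` callable are hypothetical names used only for illustration. It assumes that "System exit" refers to Python's `SystemExit`, which derives from `BaseException`, so an `except Exception` clause lets it propagate and the process exit status is preserved.

```python
import signal
import sys


def sigterm_handler(signum, frame):
    """Treat SIGTERM as a requested, clean shutdown: exit with status 0."""
    # sys.exit(0) raises SystemExit(0); nothing below should swallow it.
    sys.exit(0)


def run(serve):
    """Run a hypothetical serve() callable and map failures to an exit code.

    Only Exception is caught. SystemExit is a BaseException, so it passes
    through untouched and the interpreter exits with the code it carries.
    """
    try:
        serve()
    except Exception as exc:
        print(f"gateway failed: {exc}", file=sys.stderr)
        return 1
    return 0


def main(serve):
    """Install the TERM handler, then exit with whatever run() reports."""
    signal.signal(signal.SIGTERM, sigterm_handler)
    sys.exit(run(serve))
```

With this shape, a TERM signal ends the process with status 0, while an unexpected error ends it with a non-zero status that a supervisor can act on.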

@baum baum force-pushed the main_exit branch 3 times, most recently from 2cdae48 to 6551853 Compare October 22, 2024 11:44
1. Do not catch System exit
2. Return exit code 0 from TERM signal handler

partial revert of
- 6c120cf
- 042cf8c

Signed-off-by: Alexander Indenbaum <[email protected]>
@baum baum force-pushed the main_exit branch 2 times, most recently from 25f8504 to 5522e1d Compare October 22, 2024 12:42
Alexander Indenbaum added 2 commits October 22, 2024 12:51
Signed-off-by: Alexander Indenbaum <[email protected]>
Signed-off-by: Alexander Indenbaum <[email protected]>
@baum baum merged commit 76aed58 into ceph:devel Oct 23, 2024
43 checks passed
@VallariAg (Member)

Teuthology run: https://pulpito.ceph.com/vallariag-2024-10-22_17:54:08-nvmeof-main-distro-default-smithi/

Explanation of 1 failed job:
During HA testing, ceph orch daemon rm nvmeof.nvmeof.a triggered cluster warning CEPHADM_DAEMON_PLACE_FAIL. It was cleared after the daemon restarted. I reran it and the test passed: https://pulpito.ceph.com/vallariag-2024-10-24_07:48:47-nvmeof-main-distro-default-smithi/

@caroav (Collaborator) commented Oct 24, 2024

@VallariAg does it add a test to verify that if a gw exits with a bad status, it's being restarted? I guess no?

@VallariAg (Member) commented Oct 24, 2024

@caroav it includes HA testing (thrasher test) and that passed. Do you mean self-restart after thrashing? (if yes, then yeah, we are not checking for self-restart - let me know and I can add that)

@caroav (Collaborator) commented Oct 24, 2024

This is a thrashing case that we still need to add testing for in atom and teuthology. We need to kill one of {spdk, monitor client} and see that the gw exits but is then restarted automatically. Both cases should be checked (kill spdk, and in another iteration kill the monitor client).
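
A rough Python sketch of what such a check could look like, under stated assumptions: `wait_for_restart`, `check_restart_after_kill`, and the callables `kill_component` and `get_gw_pid` are hypothetical placeholders, not atom or teuthology APIs. The test harness would have to supply how a component is killed and how the gateway's PID is looked up.

```python
import time


def wait_for_restart(get_pid, old_pid, timeout=60.0, interval=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until get_pid() reports a PID different from old_pid.

    get_pid is a hypothetical callable that returns the gateway's current
    PID, or None while the gateway is down. Raises TimeoutError if no new
    PID appears within `timeout` seconds.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        pid = get_pid()
        if pid is not None and pid != old_pid:
            return pid
        sleep(interval)
    raise TimeoutError(f"gateway (old pid {old_pid}) was not restarted")


def check_restart_after_kill(kill_component, get_gw_pid, **wait_kwargs):
    """Kill one component, then wait for the gateway to come back.

    kill_component is a hypothetical callable that kills either spdk or
    the monitor client; call this once per component to cover both cases.
    """
    old_pid = get_gw_pid()
    kill_component()
    return wait_for_restart(get_gw_pid, old_pid, **wait_kwargs)
```

Running `check_restart_after_kill` once with a callable that kills spdk and once with one that kills the monitor client would cover the two iterations described above.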

@VallariAg (Member)

@caroav ah okay, understood, will talk to Barak about it too next week. thank you!

@barakda barakda mentioned this pull request Oct 28, 2024