Test different resharing scenarios and fix some bugs #692

Merged
merged 13 commits into from
Jul 29, 2024

ailisp (Member) commented Jul 18, 2024

While testing resharing I found some problems:

  1. New participants do not generate triples.
  2. As participants join and leave, some nodes can panic and crash.

The root cause of 1 is that the new participant's URL has a trailing /. A node then pings ip:port//state (double slash), which fails, so the new participant is never added to the active participant set. Although the test infra can spin up nodes without a trailing /, our node should handle this case more robustly. The fix is to replace every string URL concatenation with Url.join.
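
To make the failure mode concrete, here is a minimal std-only sketch. It is not the code from this PR (the PR uses Url.join); `naive_concat`, `join_url`, and the example address are hypothetical and only illustrate why a trailing / produces the double slash and how normalizing the join avoids it.

```rust
/// Hypothetical: the naive concatenation that breaks when `base` ends in `/`.
fn naive_concat(base: &str, path: &str) -> String {
    format!("{}/{}", base, path)
}

/// Hypothetical helper: joins `base` and `path` with exactly one `/`
/// between them, regardless of trailing or leading slashes on the inputs.
fn join_url(base: &str, path: &str) -> String {
    format!(
        "{}/{}",
        base.trim_end_matches('/'),
        path.trim_start_matches('/')
    )
}
```

With a base of `http://127.0.0.1:3000/` and a path of `state`, `naive_concat` yields `http://127.0.0.1:3000//state`, while `join_url` yields `http://127.0.0.1:3000/state`. A real URL type is still preferable to string trimming, since it also validates the base.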

The root cause of 2 is that, as participants join and leave, the stable participant set can temporarily fall below the threshold number of participants. The fix is to wait: the new active participants soon become stable and the protocol proceeds.
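
A rough sketch of the "wait instead of panic" idea, under assumptions: the `Step` enum, `next_step`, and the `u32` participant ids are hypothetical names for illustration, not the node's actual types.

```rust
use std::collections::BTreeSet;

/// Hypothetical outcome of one protocol tick.
#[derive(Debug, PartialEq)]
enum Step {
    /// Enough stable participants: the protocol can make progress.
    Proceed,
    /// Too few stable participants right now: skip this tick and retry later.
    Wait,
}

/// Decide whether to proceed, given the currently stable participants.
/// Being below the threshold is treated as a temporary condition, not an error.
fn next_step(stable: &BTreeSet<u32>, threshold: usize) -> Step {
    if stable.len() >= threshold {
        Step::Proceed
    } else {
        Step::Wait
    }
}
```

The caller would invoke `next_step` on every tick and do nothing on `Step::Wait`, so a short dip below the threshold during a join or leave does not crash the node.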

Also merged and improved the resharing tests to cover more scenarios: leave, join, rejoin.

ailisp (Member, Author) commented Jul 22, 2024

The problem seems more serious: the logs show that newly joined participants don't generate triples at all. Still investigating.

ailisp force-pushed the new-node-triplet branch from dc9440f to 8b32202 July 26, 2024 01:25
ailisp changed the title from "new joined participant not generate enough triplets" to "Test different resharing scenarios and fix some bugs" Jul 26, 2024
ailisp marked this pull request as ready for review July 26, 2024 05:58
ailisp (Member, Author) commented Jul 26, 2024

Do you guys know why cargo check fails? There has been no relevant change and no CI toolchain change since the last passing CI run. Both runs use stable 1.79.0, and cargo check succeeds locally with stable 1.79.0. I found this https://users.rust-lang.org/t/time-crate-compilation-error/111789/3 but it only fails for a nightly rustc.

volovyks (Collaborator) commented

@ailisp We have the same issue on Phuong's PR; the reason is unknown. It works on my PC and on the GCP VM.

ailisp requested a review from ChaoticTempest July 29, 2024 04:44
ChaoticTempest merged commit 46599a1 into develop Jul 29, 2024
3 checks passed
ChaoticTempest deleted the new-node-triplet branch July 29, 2024 05:52

Terraform Feature Environment Destroy (dev-692)

Terraform Initialization ⚙️success

Terraform Destroy success

Show Destroy Plan


No changes. No objects need to be destroyed.

Either you have not created any objects yet or the existing objects were
already deleted outside of Terraform.

Destroy complete! Resources: 0 destroyed.

Pusher: @ChaoticTempest, Action: pull_request, Working Directory: ``, Workflow: Terraform Feature Env (Destroy)

3 participants