
Fix issue 679 #684

Merged: volovyks merged 15 commits into develop from fix-679 on Jul 23, 2024

Conversation

@ailisp (Member) commented Jul 16, 2024

Fix #679.

There is a bug in resharing where only the last node can be voted to be kicked. Removing any other node causes resharing to fail and the nodes to get stuck. The root cause is that cait-sith requires each node to keep its original ParticipantId (u32) during resharing, but in our contract->node flow, if a non-last node is removed, the remaining participants are reassigned participant ids 0, 1, ..., k-1. The original ids should be preserved instead.

Playing with cait-sith's unit tests, I confirmed that ParticipantIds don't need to be consecutive: as long as the ParticipantIds are unique and each remaining ParticipantId maps to its original keyshare, the reshare will succeed.

Therefore, the fix is to keep track of each account id and map it to a participant id, along with the next available id. If a new participant joins, it gets a new id. If a participant leaves, its id becomes vacant and the remaining participants keep their ids unchanged. If a previous participant rejoins, it gets its original id back. A sketch of this scheme is shown below.
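
To make the scheme concrete, here is a minimal Rust sketch of the id-assignment logic described above (the type and method names are illustrative assumptions, not the contract's actual API):

```rust
use std::collections::BTreeMap;

type ParticipantId = u32;
type AccountId = String;

/// Hypothetical sketch of the id-assignment scheme described in this PR;
/// the real contract's `Participants` type may differ in names and layout.
#[derive(Default)]
struct Participants {
    /// Current participants: account id -> stable participant id.
    participants: BTreeMap<AccountId, ParticipantId>,
    /// Every id ever assigned, so a rejoining account gets its original id back.
    account_to_participant_id: BTreeMap<AccountId, ParticipantId>,
    /// Next id to hand out to a brand-new participant.
    next_id: ParticipantId,
}

impl Participants {
    fn insert(&mut self, account: AccountId) -> ParticipantId {
        // A returning participant keeps its original id; a new one gets `next_id`.
        let id = match self.account_to_participant_id.get(&account) {
            Some(&id) => id,
            None => {
                let id = self.next_id;
                self.next_id += 1;
                self.account_to_participant_id.insert(account.clone(), id);
                id
            }
        };
        self.participants.insert(account, id);
        id
    }

    fn remove(&mut self, account: &AccountId) {
        // Removing a participant leaves its id vacant; the others keep their ids.
        self.participants.remove(account);
    }
}

fn main() {
    let mut p = Participants::default();
    for node in ["node0", "node1", "node2"] {
        p.insert(node.to_string());
    }
    // Kicking a non-last node ("node1") leaves the remaining ids (0 and 2) untouched.
    p.remove(&"node1".to_string());
    // A new joiner gets a fresh id (3) instead of reusing the vacant 1 ...
    assert_eq!(p.insert("node3".to_string()), 3);
    // ... and if "node1" rejoins later, it gets its original id back.
    assert_eq!(p.insert("node1".to_string()), 1);
}
```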

In order to debug this issue, I added a lot of logging. I'm leaving the most important pieces in place: state transitions are logged at debug level, and some less useful debug logging was demoted to trace.

@ailisp ailisp marked this pull request as ready for review July 17, 2024 04:45
@ailisp ailisp changed the title from "wip fix issue 679" to "Fix issue 679" on Jul 17, 2024
@ailisp (Member, Author) commented Jul 17, 2024

Some of the logging added in this PR is generally helpful, but most of it is only useful for debugging this specific bug. It's also still hard to read logs from integration tests when you can't easily tell which line comes from which node. I'll remove most of the logs and open a separate PR with some logging enhancements.

@ailisp ailisp marked this pull request as draft July 17, 2024 08:03
@@ -407,7 +407,7 @@ impl PresignatureManager {
};
match action {
Action::Wait => {
-tracing::debug!("waiting");
+tracing::trace!("waiting");

ailisp (Member, Author):

These are too noisy in the debug log.

@ailisp ailisp marked this pull request as ready for review July 18, 2024 08:22
@@ -320,3 +320,31 @@ async fn test_multichain_reshare_with_lake_congestion() -> anyhow::Result<()> {
})
.await
}
+
+#[test(tokio::test)]
+async fn test_multichain_reshare_with_first_node_leave() -> anyhow::Result<()> {

Collaborator:

Should we extend the existing test with one more resharing? That way we don't make our test runs longer.

ailisp (Member, Author):

Sounds good

ailisp (Member, Author):

It's not possible to expand the remove-last-participant test because of #699. With 3 nodes, removing one and adding a new one fails due to #699, so we can only remove one participant per test. With 5 nodes and threshold 3, the remove-two-nodes test passed locally and should work on nightly, but it takes too much RAM to run on regular CI. I'll merge these tests into one after figuring out #699.

@@ -646,7 +646,7 @@ impl VersionedMpcContract {
Self::V0(MpcContract {
protocol_state: ProtocolContractState::Running(RunningContractState {
epoch,
-participants: Participants { participants },
+participants: Participants::from_init_participants(participants),

Collaborator:

Would the participant Ids stay the same when we move from one contract to another?

ailisp (Member, Author):

Yes, this assigns each participant an id starting from 0, in order, which is the same as the participant ids before this change.

Member:

So this probably won't work for migrations from one contract to another with an already-running MPC network that has had multiple reshares. Participants::from_init_participants will create a different participant id mapping than the ones the MPC nodes have, because it just enumerates over each item and uses its position.

We should add a new parameter so the account_id_to_participant value can be passed into init_running, so we end up with the same mapping.

ailisp (Member, Author):

Ah got it! Good point

ailisp (Member, Author):

I dropped Participants::from_init_participants; now init_running just takes a Participants struct, which a node admin can get by viewing VersionedMpcContract::state().
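
For context, here is a small hedged sketch of the mapping concern above (the helper and names are illustrative, not the contract's real API): building the map by enumeration assigns ids by position, which can disagree with the ids a running network holds after a reshare, whereas passing the existing Participants mapping preserves them.

```rust
use std::collections::BTreeMap;

type ParticipantId = u32;
type AccountId = String;

// Illustrative only: building the mapping by enumeration assigns ids 0, 1, 2, ...
// by position, losing any gaps introduced by earlier reshares.
fn from_init_participants(accounts: Vec<AccountId>) -> BTreeMap<AccountId, ParticipantId> {
    accounts
        .into_iter()
        .enumerate()
        .map(|(i, account)| (account, i as ParticipantId))
        .collect()
}

fn main() {
    // Suppose a running network holds ids {alice: 0, carol: 2} after the node
    // with id 1 was voted out in an earlier reshare.
    let running: BTreeMap<AccountId, ParticipantId> =
        BTreeMap::from([("alice".to_string(), 0), ("carol".to_string(), 2)]);

    // Re-enumerating on migration would give carol id 1, disagreeing with her node.
    let reenumerated = from_init_participants(vec!["alice".to_string(), "carol".to_string()]);
    assert_ne!(running, reenumerated);

    // Hence the change above: init_running takes the existing Participants mapping
    // (viewable via VersionedMpcContract::state()) as-is instead of rebuilding it.
}
```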

@volovyks volovyks requested review from ChaoticTempest and ppca July 19, 2024 10:54
@ailisp (Member, Author) commented Jul 22, 2024

@volovyks ptal. There is a new audit issue that will be addressed separately.

ppca previously approved these changes Jul 22, 2024

@ppca (Contributor) left a comment:

LGTM!

ChaoticTempest previously approved these changes Jul 22, 2024
@ailisp ailisp dismissed stale reviews from ChaoticTempest and ppca via 357c6a3 July 23, 2024 01:08
@volovyks volovyks merged commit ce157c4 into develop Jul 23, 2024
4 checks passed
@volovyks volovyks deleted the fix-679 branch July 23, 2024 12:36

Terraform Feature Environment Destroy (dev-684)

Terraform Initialization ⚙️ success

Terraform Destroy success

Destroy Plan:


No changes. No objects need to be destroyed.

Either you have not created any objects yet or the existing objects were
already deleted outside of Terraform.

Destroy complete! Resources: 0 destroyed.

Pusher: @volovyks, Action: pull_request, Working Directory: ``, Workflow: Terraform Feature Env (Destroy)

Successfully merging this pull request may close these issues:

Vote to kick the non last node will stuck in Resharing (#679)