Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

contract: implement sign fn with yield/resume #628

Merged
merged 5 commits into from
Jul 16, 2024

Conversation

ppca
Copy link
Contributor

@ppca ppca commented Jun 4, 2024

in this new design, we don't store the SignatureResponse in contract state. We still 1) won't allow repeated request if it was already requested and not completed; 2) only allow 8 requests concurrently

Note this new sign fn will return nothing, but you could still use it as if it returns Promise in any cross contract calls.

@ppca ppca linked an issue Jun 5, 2024 that may be closed by this pull request
@ppca ppca force-pushed the xiangyi/contract_yield_resume branch from eb8f529 to 31c3f76 Compare June 11, 2024 18:56
@ppca ppca force-pushed the xiangyi/contract_yield_resume branch from 31c3f76 to fd69972 Compare June 21, 2024 21:49
@ppca ppca changed the title yield resume: add VersionedContract::V1 contract: implement sign fn with yield/resume Jun 25, 2024
@ppca ppca marked this pull request as ready for review June 25, 2024 18:32
Copy link
Collaborator

@volovyks volovyks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@@ -150,7 +184,7 @@ impl VersionedMpcContract {
/// we ask for a small deposit for each signature request.
/// The fee changes based on how busy the network is.
#[payable]
pub fn sign(&mut self, request: SignRequest) -> Promise {
pub fn sign(&mut self, request: SignRequest) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it better to still return Promise? Without -> Promise, env::promise_return(yield_promise); will return yield_promise, but if yield_promise executes to fail, the transaction (of calling sign) is still success.

Copy link
Contributor Author

@ppca ppca Jul 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah,I noticed during my load tests with this code that the sign() tx result will show success until signature request timed out, which I think happens when it's past the number of blocks timeout yield/resume allows and still no result.
I tried load testing again, so the explorer will always show the status of tx as success, one example tx: one example tx: https://testnet.nearblocks.io/txns/BgkgGo3Xt4iWk9FdGeiJfSYtdQnVexzQS8DsnpaDUBqT?tab=execution.
But if we request status of transaction with this code

let outcome_view = ctx
            .jsonrpc_client
            .call(RpcTransactionStatusRequest {
                transaction_info: TransactionInfo::TransactionId {
                    tx_hash,
                    sender_account_id: ctx.nodes.ctx().mpc_contract.id().clone(),
                },
                wait_until: near_primitives::views::TxExecutionStatus::Final,
            })
            .await?;

        let Some(outcome) = outcome_view.final_execution_outcome else {
            anyhow::bail!("final execution outcome not available");
        };

        let outcome = outcome.into_outcome();

        let FinalExecutionStatus::SuccessValue(payload) = outcome.status else {
            anyhow::bail!("tx finished unsuccessfully: {:?}", outcome.status);
        };

as we did in integration tests, if the sign_on_finish fn did panic, this will show outcome.status as Failure(ActionError(ActionError { index: Some(0), kind: FunctionCallError(ExecutionError("Smart contract panicked: signature request timed out")) })).

So I guess if yield_promise executes to fail, which will result in sign_on_finish fn do env::panic_str(), then we'd be able to find out the transaction status is failure, although not through explorer.

@ailisp
Copy link
Member

ailisp commented Jul 4, 2024

Nice Implementation!

I noticed in doc that:

If the contract does not trigger a resume after 200 blocks - around 4 minutes - the yielded function will resume receiving a "timeout error" as input.

Not sure if this need to be considered, if so combine the previous recursive call approach with yield can extend the limit to 400, 600, ... blocks with a bit more gas

@volovyks
Copy link
Collaborator

volovyks commented Jul 4, 2024

@ailisp that can be a good idea if we will create an infinite or long queue of signature requests. At the moment we are using different strategy where we allow max 8 simultaneous requests and return signature fast (<200blocks).

We may change this strategy in the future.

@ppca
Copy link
Contributor Author

ppca commented Jul 9, 2024

Nice Implementation!

I noticed in doc that:

If the contract does not trigger a resume after 200 blocks - around 4 minutes - the yielded function will resume receiving a "timeout error" as input.

Not sure if this need to be considered, if so combine the previous recursive call approach with yield can extend the limit to 400, 600, ... blocks with a bit more gas

@ailisp What Serhii said about limiting request count at the same time is right. I've personally load tested the contract in this code without the request_counter < 8 limit, and we do timeout a lot, I didn't keep precise count, but with the existing load test default setup, at least half are timing out.
I personally think extending the timeout window over 4 min would probably not be what we want users to experience. Our nodes currently function well with ~10 signature requests per min, beyond that it cannot generate triples fast enough to cover signatures. Ideally we improve our backend so we can function with more requests at the same time, but right now we might have to live with the request counter :(

@ppca ppca force-pushed the xiangyi/contract_yield_resume branch 3 times, most recently from eb666e1 to 5c9bbcb Compare July 10, 2024 00:01
@ppca ppca requested review from ailisp and volovyks July 10, 2024 00:03
@ppca ppca force-pushed the xiangyi/contract_yield_resume branch from 5c9bbcb to d083e32 Compare July 10, 2024 00:08
@ppca ppca force-pushed the xiangyi/contract_yield_resume branch from d083e32 to f2613ff Compare July 10, 2024 15:52
@ppca
Copy link
Contributor Author

ppca commented Jul 10, 2024

tx finished unsuccessfully: Failure(ActionError(ActionError { index: Some(0), kind: FunctionCallError(ExecutionError("Smart contract panicked: Signature for this payload already requested")) }))
-- I'll look into the failed integration test

…onded to return Error types instead of anyhow
@ppca
Copy link
Contributor Author

ppca commented Jul 12, 2024

tx finished unsuccessfully: Failure(ActionError(ActionError { index: Some(0), kind: FunctionCallError(ExecutionError("Smart contract panicked: Signature for this payload already requested")) })) -- I'll look into the failed integration test

Turns out it's because the sign_on_finish fn removes pending request, but then does env::panic_str if the callback_result is Err, in which case all prev state changes in sign_on_finish will be reverted, thus pending request was not removed.

Also refactored signaure_responded a bit so it's easier to tell status of sign tx.

@ChaoticTempest
Copy link
Member

ChaoticTempest commented Jul 13, 2024

Initial testing of this, I found that if the test fails and the signature is not provided in time at all, we just get "Smart contract panicked: Signature for this payload already requested". Which does not seem correct for an error message if it the MPC network never responded.

If you need the test code, I'll make a separate PR.
here's the execution details:

ExecutionFinalResult {
    total_gas_burnt: NearGas {
        inner: 2605429106161,
    },
    transaction: ExecutionOutcome {
        transaction_hash: 2rr3HUCqWPh9F9gt3grxR96mdrYpSazSJRCSWa89UdDk,
        block_hash: 63EEVkkJukUoJGqX1hKqcLPWmuRSyQ4pkcSu12aW3Jkk,
        logs: [],
        receipt_ids: [
            AAsdX4sEPrXRRaCB54946N3nYz6W66ecq9WiLnTguLtu,
        ],
        gas_burnt: NearGas {
            inner: 308444080648,
        },
        tokens_burnt: NearToken {
            inner: 30844408064800000000,
        },
        executor_id: AccountId(
            "dev-20240713010504-74071972890982.test.near",
        ),
        status: SuccessReceiptId(AAsdX4sEPrXRRaCB54946N3nYz6W66ecq9WiLnTguLtu),
    },
    receipts: [
        ExecutionOutcome {
            transaction_hash: AAsdX4sEPrXRRaCB54946N3nYz6W66ecq9WiLnTguLtu,
            block_hash: 63EEVkkJukUoJGqX1hKqcLPWmuRSyQ4pkcSu12aW3Jkk,
            logs: [],
            receipt_ids: [
                6sZJKXNtkC4wZVABmYsmTQv7fBhZ4XN1Ntnadw8G5tjn,
                7tG7dXEnc9WCCBkkgNgv7HX6FBGWyWb2VJxYEnZrDRTa,
            ],
            gas_burnt: NearGas {
                inner: 1850619900513,
            },
            tokens_burnt: NearToken {
                inner: 185061990051300000000,
            },
            executor_id: AccountId(
                "dev-20240713010504-74071972890982.test.near",
            ),
            status: Failure(ActionError(ActionError { index: Some(0), kind: FunctionCallError(ExecutionError("Smart contract panicked: Signature for this payload already requested")) })),
        },
        ExecutionOutcome {
            transaction_hash: 6sZJKXNtkC4wZVABmYsmTQv7fBhZ4XN1Ntnadw8G5tjn,
            block_hash: C1ej1k8wDtGn4ohLJGMv8c8UPres3eyX4QMGX73nJTkB,
            logs: [],
            receipt_ids: [],
            gas_burnt: NearGas {
                inner: 223182562500,
            },
            tokens_burnt: NearToken {
                inner: 0,
            },
            executor_id: AccountId(
                "dev-20240713010504-74071972890982.test.near",
            ),
            status: SuccessValue(''),
        },
        ExecutionOutcome {
            transaction_hash: 7tG7dXEnc9WCCBkkgNgv7HX6FBGWyWb2VJxYEnZrDRTa,
            block_hash: C1ej1k8wDtGn4ohLJGMv8c8UPres3eyX4QMGX73nJTkB,
            logs: [],
            receipt_ids: [],
            gas_burnt: NearGas {
                inner: 223182562500,
            },
            tokens_burnt: NearToken {
                inner: 0,
            },
            executor_id: AccountId(
                "dev-20240713010504-74071972890982.test.near",
            ),
            status: SuccessValue(''),
        },
    ],
    status: Failure(ActionError(ActionError { index: Some(0), kind: FunctionCallError(ExecutionError("Smart contract panicked: Signature for this payload already requested")) })),
}

@ChaoticTempest
Copy link
Member

Seems like if we submit the same request for a signature after it has already completed the first, it will error out with Smart contract panicked: Signature for this payload already requested

@ChaoticTempest
Copy link
Member

Seems like also getting "Signature has timed out" after multiple successive signature requests.

Take a look at all the testing I added in branch phuong/feat/contract-testing-without-nodes

@ppca
Copy link
Contributor Author

ppca commented Jul 15, 2024

Seems like also getting "Signature has timed out" after multiple successive signature requests.

Take a look at all the testing I added in branch phuong/feat/contract-testing-without-nodes

Hey @ChaoticTempest I've checked that code out and tried it. I've tried local unit tests myself too, but I did it by returning String as a signature. When I tested locally, I found out after many attempts that I needed to add in some wait time between sending async sign request and sending respond, likely because of transact_async() and yield/resume combined.

The test_contract_sign_request test in your branch passes 100% after I add in tokio::time::sleep(Duration::from_secs(2)).await; between transact_async of sign and respond.

I tried to reproduce the failed cases you mentioned:

  1. "Signature has timed out" will happen on the very fist sign request without the sleep, no other sign requests will be sent.

  2. I'm not able to reproduce if we submit the same request for a signature after it has already completed the first, it will error out with Smart contract panicked: Signature for this payload already requested locally. I used to see the error in the integration test test_signature_offline_node_back_online, and that happened previously because in prev code `sign_on_finish fn removes pending request, but then does env::panic_str if the callback_result is Err, in which case all prev state changes in sign_on_finish will be reverted, thus pending request was not removed. But it does not happen any more in the integration test, but it can still happen if we submit the same request before yield/resume returns a result or fail with timeout.

  3. I also did not manage to reproduce: I found that if the test fails and the signature is not provided in time at all, we just get "Smart contract panicked: Signature for this payload already requested". This likely is similar to the situation mentioned in the above paragraph, but shouldn't happen with the current code and adding in the sleep.

@ailisp
Copy link
Member

ailisp commented Jul 16, 2024

Did not have a chance to locally test it, but the code looks good to me overall. Nice work!

ChaoticTempest
ChaoticTempest previously approved these changes Jul 16, 2024
Copy link
Member

@ChaoticTempest ChaoticTempest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird that we require the sleep between the sign and respond, but we can resolve it later in a future PR

"clear_state_on_finish",
&serde_json::to_vec(&(index,)).unwrap(),
CLEAR_STATE_ON_FINISH_CALL_GAS,
GasWeight(0),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is gas weight supposed to be zero here. The docs mention the default is GasWeight(1). Tested and changed it to 1, but doesn't seem to effect much

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirming with Saketh

@ppca ppca force-pushed the xiangyi/contract_yield_resume branch from d79de76 to 9493b27 Compare July 16, 2024 21:07
@ppca ppca requested a review from ChaoticTempest July 16, 2024 21:11
ChaoticTempest
ChaoticTempest previously approved these changes Jul 16, 2024
@ChaoticTempest ChaoticTempest merged commit 55d44e3 into develop Jul 16, 2024
4 checks passed
@ChaoticTempest ChaoticTempest deleted the xiangyi/contract_yield_resume branch July 16, 2024 23:01
Copy link

Terraform Feature Environment Destroy (dev-628)

Terraform Initialization ⚙️success

Terraform Destroy success

Show Destroy Plan


No changes. No objects need to be destroyed.

Either you have not created any objects yet or the existing objects were
already deleted outside of Terraform.

Destroy complete! Resources: 0 destroyed.

Pusher: @ChaoticTempest, Action: pull_request, Working Directory: ``, Workflow: Terraform Feature Env (Destroy)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

modify contract with yield/resume
4 participants