Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coro::task_container gc fix not completing coroutines #288

Merged
merged 2 commits into from
Jan 30, 2025

Conversation

jbaldwin
Copy link
Owner

@jbaldwin jbaldwin commented Jan 29, 2025

The coro::task_container::gc_internal function was deleting coroutines when marked as .done(), however there is some mechanism that rarely would cause the user_task coroutine to not actually execute. I'm still not sure exactly why this is the case but:

  1. Simply disabling gc_internal() would stop the problem
  2. Running gc_internal() and moving the coro::task to a 'dead' list
    still caused the issue.

With these in mind I spent time re-reading the specification on the final_suspend and determined that coro::task_container should be a thing atomic counter to track the submitted coroutines and have them self delete. The self deletion is now done via a
coro::detail::task_self_destroying coroutine type that takes advantage of the promise's final_suspend() not suspending. The spec states that if this doesn't suspend then the coroutine will call destroy() on itself.

Closes #287

@jbaldwin jbaldwin self-assigned this Jan 29, 2025
@jbaldwin jbaldwin force-pushed the issue-287/task_container_gc_fix branch 3 times, most recently from cbce67c to 2e8e40f Compare January 29, 2025 19:30
@jbaldwin jbaldwin mentioned this pull request Jan 29, 2025
@jbaldwin
Copy link
Owner Author

We can safely ignore the false positive on codacy, the variable very much is used.

@jbaldwin jbaldwin force-pushed the issue-287/task_container_gc_fix branch from 2e8e40f to 49829ea Compare January 29, 2025 19:54
The coro::task_container::gc_internal function was deleting coroutines
when marked as .done(), however there is some mechanism that rarely
would cause the user_task coroutine to not actually execute. I'm still
not sure exactly why this is the case but:

1) Simply disabling gc_internal() would stop the problem
2) Running gc_internal() and moving the coro::task to a 'dead' list
   still caused the issue.

With these in mind I spent time re-reading the specification on the
final_suspend and determined that coro::task_container should be a thing
atomic counter to track the submitted coroutines and have them self
delete. The self deletion is now done via a
coro::detail::task_self_destroying coroutine type that takes advantage
of the promise's final_suspend() not suspending. The spec states that if
this doesn't suspend then the coroutine will call destroy() on itself.

Closes #287
@jbaldwin jbaldwin force-pushed the issue-287/task_container_gc_fix branch from 49829ea to 0ba7020 Compare January 29, 2025 19:56
@jbaldwin jbaldwin added the bug Something isn't working label Jan 29, 2025
Copy link

@breakds breakds left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this approach the logic is actually a lot simpler. Thanks!

@jbaldwin
Copy link
Owner Author

I renamed task_self_destroying to task_self_deleting since its more inline with the delete keyword.

@jbaldwin jbaldwin merged commit 49333e7 into main Jan 30, 2025
37 checks passed
@jbaldwin jbaldwin deleted the issue-287/task_container_gc_fix branch January 30, 2025 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

schedule stress test
2 participants