Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Merge session connection deadlock branch to main #1919

Merged
merged 11 commits into from
Mar 15, 2022

Conversation

irenarindos
Copy link
Collaborator

@irenarindos irenarindos commented Mar 15, 2022

The deadlock reported in #1812 was caused by contention between processes for session connection closure. The same connection was closed in more than one place, and connections were closed across several sessions. For multiple connection closures, multiple sessions could be updated in the same transaction. Additionally, database triggers for data warehousing would update session connection and session warehouse tables in the same transaction.

The changes in this pull request cover moving session connections into a separate repository from sessions. Transactions for connections and sessions are handled separately to eliminate the resource contention causing a deadlock. Domain services have also been added to session package to implement functionality that requires accessing both the session and session connection repositories.

Fixes #1812

irenarindos and others added 8 commits March 15, 2022 12:11
The query for CloseDeadConnectionsForWorker includes a time comparison
to exclude results that are too close to the current time. This is done
to account for a delay in reporting by the worker. However, since the
delay was hardcoded in the query, it made it difficult to test.

This moves the delay duration into a parameter that is set on the
repository. The amount of the delay can be set at repository creation
time using the new WithWorkerStateDelay option. This defaults to the
previously hardcoded amount.

This will allow tests to control the delay to exercise this
functionality fully.
This adds tests for the Status method of the WorkerService grpc service
that is exposed by controller.
@irenarindos irenarindos force-pushed the llb-session-connection-deadlocks branch from f5ec798 to 90a5367 Compare March 15, 2022 16:12
tmessi and others added 3 commits March 15, 2022 13:51
This creates a domain service function `session.WorkerStatusReport`. The
service performs two functions:

1. Checks the status for all of the reported sessions, returning a
   report of the sessions that are in a "non-active" state, i.e.
   "cacneling" or "terminated".
2. Looks for any "orphaned" session connections, i.e. connections that
   are active but were not reported by the worker. These connections are
   marked as closed.

This also removes the "ShouldCloseConnectionsOnWorker" logic, as this
was determined to not be a possible state, and was unnecessary.
This removes or reduces the time that is waited in a number of tests by
allowing the grace period time be a configuration option on the
connection repository.
@tmessi tmessi force-pushed the llb-session-connection-deadlocks branch from 90a5367 to bbc12d4 Compare March 15, 2022 17:51
@irenarindos irenarindos merged commit 066e478 into main Mar 15, 2022
@irenarindos irenarindos deleted the llb-session-connection-deadlocks branch March 15, 2022 18:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DEADLOCK on warehouse tables
2 participants