[Ideas] Should we support ON CONFLICT when updating the distribution keys? #902
my-ship-it
asked this question in
Ideas / Feature Requests
Good!
Description
Hackers,
Currently, when an ON CONFLICT clause involves the distribution keys, an error occurs, as shown below:
However, the statement works if it does not involve the distribution keys.
This is because the function sanity_check_on_conflict_update performs a strict check: it requires that the columns updated on conflict do not include any distribution key.
At the very least, we could add some simple optimizations and finer-grained checks, for example not raising the error when the distribution key is not actually updated.
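The finer-grained check could look roughly like the sketch below. It is a toy model in Python, not the C code in sanity_check_on_conflict_update. The function names, the error text, and the column-name lists are all hypothetical and only illustrate the idea of raising an error solely when the SET list really touches a distribution key.

```python
from typing import Iterable


def updates_distribution_key(set_columns: Iterable[str],
                             distribution_keys: Iterable[str]) -> bool:
    """Return True if any column assigned in DO UPDATE SET is a distribution key."""
    return bool(set(set_columns) & set(distribution_keys))


def check_on_conflict_update(set_columns: Iterable[str],
                             distribution_keys: Iterable[str]) -> None:
    """Hypothetical relaxed check: raise only when a distribution key is assigned."""
    touched = sorted(set(set_columns) & set(distribution_keys))
    if touched:
        # Illustrative message only, not the text the real server emits.
        raise ValueError(
            f"ON CONFLICT DO UPDATE modifies distribution key column(s): {touched}"
        )


# Table distributed by "a"; the upsert only assigns "b", so no error is raised.
check_on_conflict_update(["b"], ["a"])
```

The same predicate could also tell the planner whether the plain ON CONFLICT path is enough or whether a split-style plan would be needed.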
Furthermore, if the distribution key is modified, we can borrow the approach of SplitUpdate: introduce a new executor node, OnConflictSplitUpdate, and convert the UPSERT operation into either INSERT or DELETE + INSERT according to the result of the index check. If the OnConflictSplitUpdate node checks the index and finds a conflict, it generates two operations, DELETE + INSERT, just like the SplitUpdate node; otherwise it generates only an INSERT. The resulting rows are then sent through a Motion to the nodeModifyTable node (writer Gang) in the upper layer, which completes the update.
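To make the proposed row flow concrete, here is a minimal Python sketch of the split step. It is a simulation under stated assumptions, not executor code: the dictionary stands in for the unique index, `target_segment` uses a plain modulo as a stand-in for the real distribution hash used by Motion, and every name in it is hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, int]
Op = Tuple[str, Row]  # ("DELETE" or "INSERT", row)


def on_conflict_split_update(index: Dict[int, Row], new_row: Row,
                             unique_col: str, set_values: Row) -> List[Op]:
    """Turn one upsert into INSERT, or DELETE + INSERT if the index has a conflict."""
    old = index.get(new_row[unique_col])
    if old is None:
        # No conflict: the row is simply inserted.
        return [("INSERT", dict(new_row))]
    # Conflict: delete the old row and insert it again with the SET list applied.
    updated = {**old, **set_values}
    return [("DELETE", dict(old)), ("INSERT", updated)]


def target_segment(row: Row, dist_col: str, num_segments: int) -> int:
    """Stand-in for Motion routing: choose a segment from the distribution key."""
    return row[dist_col] % num_segments


# Table distributed by "a", unique on "a"; the upsert sets a = 2 on conflict.
index = {1: {"a": 1, "b": 5}}
ops = on_conflict_split_update(index, {"a": 1, "b": 7}, "a", {"a": 2})
routed = [(op, row, target_segment(row, "a", 3)) for op, row in ops]
```

In this toy run the DELETE is routed to the segment that holds the old key and the INSERT to the segment that owns the new key, which is why the extra Motion is needed once the distribution key changes.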
The additional Motion introduces cost, so we should not generate an OnConflictSplitUpdate node every time, for example when the distribution key is not updated.
We need to implement this in both GPORCA and the legacy planner, but could start with the legacy planner.
Of course, as with SplitUpdate, not all queries can support this method, and we need to do some detailed filtering and processing.
Any ideas are welcome.
Use case/motivation
No response
Related issues
No response
Are you willing to submit a PR?