Enable idempotency for Snapshot create request #1883
@Yuggupta27 I have yet to look at the logs in detail; however, in summary it looks like a bug in the external-snapshotter sidecar, doesn't it (or did I miss anything here)? If yes, can you please open an issue in https://github.com/kubernetes-csi/external-snapshotter/issues/ ? Please let me know if you need help.
@humblec we got confirmation in the Kubernetes Slack channel: this is not a bug in the external-snapshotter. It is a bug in cephcsi, as cephcsi should not return a new snapshotID when it gets a second request with the same name.
@Madhu-1 and I had a discussion with the Kubernetes team in their Slack, where they suggested that if the CSI driver receives the same snapshot request multiple times, it should create only one snapshot ID and not multiple, in order to handle idempotency.
This is applicable to both CephFS and RBD snapshot operations.
Ack! Just as the same volume name generates the same volume ID/handle, the same snapshot name should generate the same snapshot ID. As stated above, this applies to both providers. Will watch out for the PR; curious to know what we missed, but will look at the PR for the same.
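For illustration, here is a minimal sketch of what such an idempotent `CreateSnapshot` handler could look like. The `journal` type and its `lookupByName`/`reserve` helpers are hypothetical stand-ins for a persistent name-to-ID mapping (not the actual ceph-csi journal API), and the backend snapshot call is omitted.

```go
package driver

import (
	"context"
	"fmt"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// journal records which snapshot ID was handed out for a given CO-chosen
// snapshot name. A real driver must persist this mapping; an in-memory map
// is only for illustration.
type journal struct {
	byName map[string]string
}

func (j *journal) lookupByName(name string) (string, bool) {
	id, ok := j.byName[name]
	return id, ok
}

func (j *journal) reserve(name string) string {
	id := fmt.Sprintf("snap-%s", name) // placeholder ID generation
	j.byName[name] = id
	return id
}

type snapshotServer struct {
	j *journal
}

// CreateSnapshot always returns the same snapshot ID for repeated requests
// with the same name, which is what makes the call idempotent.
func (s *snapshotServer) CreateSnapshot(ctx context.Context, req *csi.CreateSnapshotRequest) (*csi.CreateSnapshotResponse, error) {
	if req.GetName() == "" || req.GetSourceVolumeId() == "" {
		return nil, status.Error(codes.InvalidArgument, "snapshot name and source volume ID are required")
	}

	// A retry of an earlier request must get back the ID that was already
	// reserved for this name, never a freshly generated one.
	if id, ok := s.j.lookupByName(req.GetName()); ok {
		return newResponse(id, req.GetSourceVolumeId()), nil
	}

	// First time this name is seen: reserve an ID, then take the snapshot
	// on the backing storage (the actual RBD/CephFS call is omitted here).
	id := s.j.reserve(req.GetName())
	return newResponse(id, req.GetSourceVolumeId()), nil
}

func newResponse(id, sourceVolumeID string) *csi.CreateSnapshotResponse {
	return &csi.CreateSnapshotResponse{
		Snapshot: &csi.Snapshot{
			SnapshotId:     id,
			SourceVolumeId: sourceVolumeID,
			ReadyToUse:     true,
		},
	}
}
```

The key point is the lookup before the reserve: a retry with the same request name resolves to the previously reserved ID instead of minting a new one.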
@Yuggupta27 if you have logs, please attach the complete logs for further analysis. Looking for a volunteer to take this up on priority.
Sure Madhu, I have attached the csi-provisioner and rbd-plugin logs in the issue. Let me know if I can provide any other logs.
I might be wrong, but I think this problem should only occur with RBD and not with CephFS. RBD:
CephFS: it takes the snap ID as an argument and passes the same to go-ceph:
But if my assumption is correct, we should be seeing this issue more frequently.
Please ignore my last comments; that shouldn't be the reason for this issue.
Summarising findings & discussions.
From the external-snapshotter / k8s perspective:
If something goes wrong after initial
Current RBD CSI driver behavior:
Things that happen currently:
Possible solutions and consequences:
Soln 1:
Soln 2:
Soln 3:
My preference would be to have
True. IMO, the right solution would be for us (the CSI driver) to return the proper snapshot ID/handle only once we have it properly recorded in the storage system. Handing out a new handle after having already returned a different one to the controller is not the correct approach. In short, we should return the handle once we have it and have verified that we are good.
I think that Soln 2 is the most appropriate. There might be a twist/enhancement possible in case flattening is needed. Can the snapshot-id be returned, with
@nixpanic, the problem is when there's a failure after such a response. Currently we just clean up, create a new snapshot-id (which is never picked up by the external-snapshotter), and take another snapshot on a PVC that possibly has a running application using it. So if we do this
If there is no failure, we just set
As discussed, we have to continue the analysis and bring this to completion in upcoming releases. Moving this out of the 3.4.0 release.
This commit fixes snapshot id idempotency issue by always returning an error when flattening is in progress and not using `readyToUse:false` response. Updates: ceph#1883 Signed-off-by: Rakshith R <[email protected]>
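As a rough illustration of the behaviour described in that commit message, the sketch below fails the request while flattening is still in progress instead of returning the snapshot with `readyToUse:false`. The `snapshotState` type and the flatten check are hypothetical placeholders, not the real ceph-csi helpers.

```go
package driver

import (
	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// snapshotState stands in for whatever the driver has recorded about the
// snapshot that was reserved earlier for this request name.
type snapshotState struct {
	ID                string
	SourceVolumeID    string
	FlattenInProgress bool
}

// snapshotResponse either aborts the request (so the external-snapshotter
// retries with the same name and eventually gets the same ID) or returns a
// fully ready snapshot; it never hands out an ID with ReadyToUse:false.
func snapshotResponse(st *snapshotState) (*csi.CreateSnapshotResponse, error) {
	if st.FlattenInProgress {
		// Returning an error keeps the name -> ID mapping stable: the next
		// retry resolves to the same snapshot and re-checks the flatten state.
		return nil, status.Errorf(codes.Aborted,
			"snapshot %s is not ready, image flattening is still in progress", st.ID)
	}

	return &csi.CreateSnapshotResponse{
		Snapshot: &csi.Snapshot{
			SnapshotId:     st.ID,
			SourceVolumeId: st.SourceVolumeID,
			ReadyToUse:     true,
		},
	}, nil
}
```

The trade-off, as noted in the discussion above, is that the CO keeps retrying until flattening completes, rather than being told early that the snapshot exists but is not yet usable.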
Describe the bug
On multiple snapshot creation requests, if a request fails for some reason, a new snapshot ID is generated. If a later request succeeds, the external-snapshotter marks Ready-to-use as TRUE for the old snapshotHandle and not for the new one.
Environment details
Kubernetes cluster version: v1.19.2
Ceph cluster version: v14.2.10
Logs
csi-provisioner
csi-rbdplugin