
[e2e failure] backend images not matching kubernetes resource count,image count 1 kubernetes resource count 0 #5166

Open
Madhu-1 opened this issue Feb 20, 2025 · 4 comments

Comments

Madhu-1 (Collaborator) commented Feb 20, 2025

• [FAILED] [107.867 seconds]
RBD Test RBD CSI [It] Test RBD CSI
/go/src/github.com/ceph/ceph-csi/e2e/rbd.go:438

  [FAILED] backend images not matching kubernetes resource count,image count 1 kubernetes resource count 0
  backend image Info:
   [csi-vol-47dd54b3-94f8-4647-9e2a-196304d5302b]
   images information and status Pool: replicapool, Image: csi-vol-47dd54b3-94f8-4647-9e2a-196304d5302b, Info: {"name":"csi-vol-47dd54b3-94f8-4647-9e2a-196304d5302b","id":"16eb56e3021b","size":1073741824,"objects":256,"order":22,"object_size":4194304,"snapshot_count":0,"block_name_prefix":"rbd_data.16eb56e3021b","format":2,"features":["layering"],"op_features":[],"flags":[],"create_timestamp":"Thu Feb 20 07:12:17 2025","access_timestamp":"Thu Feb 20 07:12:17 2025","modify_timestamp":"Thu Feb 20 07:12:17 2025"}
  , Status: {"watchers":[]}

  In [It] at: /go/src/github.com/ceph/ceph-csi/e2e/rbd.go:207 @ 02/20/25 07:12:30.268
------------------------------

Summarizing 1 Failure:
  [FAIL] RBD Test RBD CSI [It] Test RBD CSI
  /go/src/github.com/ceph/ceph-csi/e2e/rbd.go:207
I0220 07:12:29.288511   84374 cephcmds.go:106] ID: 47 Req-ID: 0001-0024-c1b8e53f-91e1-4add-b0f3-627e1af1bd81-0000000000000004-47dd54b3-94f8-4647-9e2a-196304d5302b command succeeded: rbd [unmap replicapool/csi-vol-47dd54b3-94f8-4647-9e2a-196304d5302b --device-type krbd --options noudev]
  I0220 07:12:29.288537   84374 nodeserver.go:1072] ID: 47 Req-ID: 0001-0024-c1b8e53f-91e1-4add-b0f3-627e1af1bd81-0000000000000004-47dd54b3-94f8-4647-9e2a-196304d5302b successfully unmapped volume (0001-0024-c1b8e53f-91e1-4add-b0f3-627e1af1bd81-0000000000000004-47dd54b3-94f8-4647-9e2a-196304d5302b)
 I0220 07:12:29.827722       1 utils.go:266] ID: 37 Req-ID: 0001-0024-c1b8e53f-91e1-4add-b0f3-627e1af1bd81-0000000000000004-47dd54b3-94f8-4647-9e2a-196304d5302b GRPC call: /csi.v1.Controller/DeleteVolume
  I0220 07:12:29.828511       1 utils.go:267] ID: 37 Req-ID: 0001-0024-c1b8e53f-91e1-4add-b0f3-627e1af1bd81-0000000000000004-47dd54b3-94f8-4647-9e2a-196304d5302b GRPC request: {"secrets":"***stripped***","volume_id":"0001-0024-c1b8e53f-91e1-4add-b0f3-627e1af1bd81-0000000000000004-47dd54b3-94f8-4647-9e2a-196304d5302b"}
  I0220 07:12:29.830176       1 omap.go:89] ID: 37 Req-ID: 0001-0024-c1b8e53f-91e1-4add-b0f3-627e1af1bd81-0000000000000004-47dd54b3-94f8-4647-9e2a-196304d5302b got omap values: (pool="replicapool", namespace="", name="csi.volume.47dd54b3-94f8-4647-9e2a-196304d5302b"): map[csi.imageid:16eb56e3021b csi.imagename:csi-vol-47dd54b3-94f8-4647-9e2a-196304d5302b csi.volname:pvc-5f30b48e-44a5-4f88-be6d-a5f9a6b08299 csi.volume.owner:rbd-7162]
  E0220 07:12:29.891118       1 controllerserver.go:1068] ID: 37 Req-ID: 0001-0024-c1b8e53f-91e1-4add-b0f3-627e1af1bd81-0000000000000004-47dd54b3-94f8-4647-9e2a-196304d5302b rbd replicapool/csi-vol-47dd54b3-94f8-4647-9e2a-196304d5302b is still being used
  E0220 07:12:29.891190       1 utils.go:271] ID: 37 Req-ID: 0001-0024-c1b8e53f-91e1-4add-b0f3-627e1af1bd81-0000000000000004-47dd54b3-94f8-4647-9e2a-196304d5302b GRPC error: rpc error: code = Internal desc = rbd csi-vol-47dd54b3-94f8-4647-9e2a-196304d5302b is still being used

https://jenkins-ceph-csi.apps.ocp.cloud.ci.centos.org/blue/rest/organizations/jenkins/pipelines/mini-e2e_k8s-1.31/runs/389/nodes/94/steps/97/log/?start=0

Madhu-1 changed the title from "[e23 failure] backend images not matching kubernetes resource count,image count 1 kubernetes resource count 0" to "[e2e failure] backend images not matching kubernetes resource count,image count 1 kubernetes resource count 0" on Feb 20, 2025
nixpanic (Member) commented:

If this happens, the gRPC code Internal is probably not right. It would be cleaner to report Aborted so that the CO initiates a retry.

See https://github.com/container-storage-interface/spec/blob/master/spec.md#error-scheme for the common errors.
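For reference, a minimal sketch of what reporting Aborted from the DeleteVolume path could look like; the sentinel error and helper name here are assumptions for illustration, not the actual ceph-csi code:

```go
package rbdsketch

import (
	"errors"

	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// errImageInUse is an assumed sentinel error, for illustration only.
var errImageInUse = errors.New("rbd image is still being used")

// deleteVolumeError maps an internal error to the gRPC status returned
// from DeleteVolume.
func deleteVolumeError(imageName string, err error) error {
	if errors.Is(err, errImageInUse) {
		// Aborted signals a retryable condition to the CO, per the CSI
		// error scheme, instead of a hard Internal failure.
		return status.Errorf(codes.Aborted, "rbd image %s is still being used", imageName)
	}
	// Anything unexpected remains Internal.
	return status.Errorf(codes.Internal, "failed to delete rbd image %s: %v", imageName, err)
}
```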

nixpanic added a commit to nixpanic/ceph-csi that referenced this issue Feb 20, 2025
According to the error scheme documented in the CSI specification, the
Aborted error code should initiate retries, whereas the Internal error
code does not require this behaviour.

When an RBD image is still in use, it cannot be removed. The
DeleteVolume procedure should be retried and will succeed once the
RBD image is no longer in use.

Fixes: ceph#5166
Signed-off-by: Niels de Vos <[email protected]>
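To illustrate the retry semantics the commit message refers to, here is a hypothetical caller-side sketch (not the external-provisioner implementation; the function name and backoff values are assumptions) that retries DeleteVolume only while the driver reports Aborted:

```go
package cosketch

import (
	"context"
	"time"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// deleteWithRetry keeps retrying DeleteVolume with exponential backoff
// while the driver reports codes.Aborted, and gives up on other errors.
func deleteWithRetry(ctx context.Context, cc csi.ControllerClient, req *csi.DeleteVolumeRequest) error {
	backoff := time.Second
	for {
		_, err := cc.DeleteVolume(ctx, req)
		if err == nil {
			return nil
		}
		if status.Code(err) != codes.Aborted {
			// Treated as non-retryable in this sketch, per the CSI error scheme.
			return err
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(backoff):
			backoff *= 2
		}
	}
}
```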
Madhu-1 (Collaborator, Author) commented Feb 20, 2025

@nixpanic it retries for all error cases; the retry does not depend on any error code or message, as the user expects the volume to eventually get deleted.

mergify bot closed this as completed in 43b150f on Feb 24, 2025
Madhu-1 reopened this on Feb 24, 2025
Madhu-1 (Collaborator, Author) commented Feb 24, 2025

Reopening as the issue is not fixed
