
util: exclude empty label values for crushlocation map #4710

Merged

Conversation

iPraveenParihar
Contributor

@iPraveenParihar iPraveenParihar commented Jul 15, 2024

Describe what this PR does

This commit resolves a bug where node labels with empty values
are processed for the crush_location mount option, leading to
invalid mount options and subsequent mount failures.

Issue:

If a node is labelled with an empty value, the mount fails: the read-affinity mount option is built as read_from_replica=localize,crush_location=zone:|host:c1, which is invalid.

I0711 10:54:22.595638 3652833 utils.go:198] ID: 61 Req-ID: 0001-0009-rook-ceph-0000000000000002-a1347faa-82f3-4dbe-81e1-e067e3f769c1 GRPC call: /csi.v1.Node/NodeStageVolume
I0711 10:54:22.595919 3652833 utils.go:199] ID: 61 Req-ID: 0001-0009-rook-ceph-0000000000000002-a1347faa-82f3-4dbe-81e1-e067e3f769c1 GRPC request: {"secrets":"***stripped***","staging_target_path":"/var/lib/kubelet/plugins/kubernetes.io/csi/rook-ceph.rbd.csi.ceph.com/8573304e49ec1ecc6f5f01da6306387ccb7bbf9e866fc4ec959ab7f9190a5a19/globalmount","volume_capability":{"AccessType":{"Mount":{"fs_type":"ext4"}},"access_mode":{"mode":7}},"volume_context":{"clusterID":"rook-ceph","encryptionKMSID":"azure-test","imageFeatures":"layering","imageFormat":"2","imageName":"csi-vol-a1347faa-82f3-4dbe-81e1-e067e3f769c1","journalPool":"replicapool","pool":"replicapool","storage.kubernetes.io/csiProvisionerIdentity":"1720694375759-9475-rook-ceph.rbd.csi.ceph.com"},"volume_id":"0001-0009-rook-ceph-0000000000000002-a1347faa-82f3-4dbe-81e1-e067e3f769c1"}
I0711 10:54:22.605770 3652833 omap.go:89] ID: 61 Req-ID: 0001-0009-rook-ceph-0000000000000002-a1347faa-82f3-4dbe-81e1-e067e3f769c1 got omap values: (pool="replicapool", namespace="", name="csi.volume.a1347faa-82f3-4dbe-81e1-e067e3f769c1"): map[csi.imageid:4bfdd951ce312 csi.imagename:csi-vol-a1347faa-82f3-4dbe-81e1-e067e3f769c1 csi.volname:pvc-8a3b5702-d583-429e-8e9d-e4773dc35519 csi.volume.owner:test]
I0711 10:54:22.693064 3652833 rbd_util.go:352] ID: 61 Req-ID: 0001-0009-rook-ceph-0000000000000002-a1347faa-82f3-4dbe-81e1-e067e3f769c1 checking for ImageFeatures: [layering]
I0711 10:54:22.693370 3652833 crushlocation.go:41] CRUSH location labels passed for processing: [topology.io/zone topology.io/host]
I0711 10:54:22.693399 3652833 crushlocation.go:69] list of CRUSH location processed: map[host:c1 zone:]
I0711 10:54:22.747556 3652833 cephcmds.go:105] ID: 61 Req-ID: 0001-0009-rook-ceph-0000000000000002-a1347faa-82f3-4dbe-81e1-e067e3f769c1 command succeeded: rbd [device list --format=json --device-type krbd]
I0711 10:54:22.795839 3652833 rbd_attach.go:437] ID: 61 Req-ID: 0001-0009-rook-ceph-0000000000000002-a1347faa-82f3-4dbe-81e1-e067e3f769c1 rbd: map mon 10.110.151.235:6789
I0711 10:54:22.973494 3652833 cephcmds.go:98] ID: 61 Req-ID: 0001-0009-rook-ceph-0000000000000002-a1347faa-82f3-4dbe-81e1-e067e3f769c1 an error (exit status 22) occurred while running rbd args: [--id csi-rbd-node -m 10.110.151.235:6789 --keyfile=***stripped*** map replicapool/csi-vol-a1347faa-82f3-4dbe-81e1-e067e3f769c1 --device-type krbd --options noudev --options read_from_replica=localize,crush_location=zone:|host:c1]
W0711 10:54:22.973535 3652833 rbd_attach.go:486] ID: 61 Req-ID: 0001-0009-rook-ceph-0000000000000002-a1347faa-82f3-4dbe-81e1-e067e3f769c1 rbd: map error an error (exit status 22) occurred while running rbd args: [--id csi-rbd-node -m 10.110.151.235:6789 --keyfile=***stripped*** map replicapool/csi-vol-a1347faa-82f3-4dbe-81e1-e067e3f769c1 --device-type krbd --options noudev --options read_from_replica=localize,crush_location=zone:|host:c1], rbd output: rbd: sysfs write failed
rbd: map failed: (22) Invalid argument
E0711 10:54:22.973810 3652833 utils.go:203] ID: 61 Req-ID: 0001-0009-rook-ceph-0000000000000002-a1347faa-82f3-4dbe-81e1-e067e3f769c1 GRPC error: rpc error: code = Internal desc = rbd: map failed with error an error (exit status 22) occurred while running rbd args: [--id csi-rbd-node -m 10.110.151.235:6789 --keyfile=***stripped*** map replicapool/csi-vol-a1347faa-82f3-4dbe-81e1-e067e3f769c1 --device-type krbd --options noudev --options read_from_replica=localize,crush_location=zone:|host:c1], rbd error output: rbd: sysfs write failed
rbd: map failed: (22) Invalid argument

Solution:

Do not consider node labels with an empty value when building the crush_location mount option.
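The fix can be sketched as follows. This is a minimal illustration of the approach, not the exact ceph-csi code: the `buildCrushLocationMap` helper and the `topology.io/*` label names here are hypothetical stand-ins for the real implementation in `internal/util/crushlocation.go`.

```go
package main

import (
	"fmt"
	"strings"
)

// buildCrushLocationMap maps CRUSH location types (e.g. "zone", "host")
// to values taken from node labels, skipping labels whose value is empty
// or missing so that fragments like "zone:" never reach the mount options.
func buildCrushLocationMap(crushLocationLabels []string, nodeLabels map[string]string) map[string]string {
	crushLocation := map[string]string{}
	for _, label := range crushLocationLabels {
		val, ok := nodeLabels[label]
		if !ok || val == "" {
			// An empty or missing label value would produce an invalid
			// crush_location option, so it is excluded here.
			continue
		}
		// Use the part of the label key after "/" as the CRUSH location type.
		key := label
		if idx := strings.IndexByte(label, '/'); idx != -1 {
			key = label[idx+1:]
		}
		crushLocation[key] = val
	}
	return crushLocation
}

func main() {
	nodeLabels := map[string]string{
		"topology.io/zone": "", // empty value, must be skipped
		"topology.io/host": "c1",
	}
	loc := buildCrushLocationMap([]string{"topology.io/zone", "topology.io/host"}, nodeLabels)

	// Join into the crush_location mount-option value.
	parts := make([]string, 0, len(loc))
	for k, v := range loc {
		parts = append(parts, k+":"+v)
	}
	fmt.Println("crush_location=" + strings.Join(parts, "|"))
	// With the empty zone label skipped, this prints "crush_location=host:c1"
	// instead of the invalid "crush_location=zone:|host:c1" from the logs above.
}
```

With this filtering in place, a node labelled `topology.io/zone=""` simply contributes nothing to the option string, and `rbd map` no longer fails with `(22) Invalid argument`.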

Checklist:

  • Commit Message Formatting: Commit titles and messages follow guidelines in the developer guide.
  • Reviewed the developer guide on Submitting a Pull Request
  • Pending release notes updated with breaking and/or notable changes for the next major release.
  • Documentation has been updated, if necessary.
  • Unit tests have been added, if necessary.
  • Integration tests have been added, if necessary.

Show available bot commands

These commands are normally not required, but in case of issues, leave any of
the following bot commands in an otherwise empty comment in this PR:

  • /retest ci/centos/<job-name>: retest the <job-name> after unrelated
    failure (please report the failure too!)

@iPraveenParihar iPraveenParihar self-assigned this Jul 15, 2024
@iPraveenParihar iPraveenParihar added the bug Something isn't working label Jul 15, 2024
@iPraveenParihar iPraveenParihar force-pushed the fix/empty-label-value-for-crushlocation branch from 5fc82d2 to ffc0f09 Compare July 15, 2024 06:45
@iPraveenParihar iPraveenParihar marked this pull request as ready for review July 15, 2024 07:38
@iPraveenParihar iPraveenParihar requested a review from Madhu-1 July 15, 2024 07:38
@Madhu-1 Madhu-1 added the backport-to-release-v3.11 Label to backport from devel to release-v3.11 branch label Jul 15, 2024
@Madhu-1 Madhu-1 requested a review from a team July 15, 2024 11:56
@nixpanic
Member

@Mergifyio queue

Contributor

mergify bot commented Jul 16, 2024

queue

🛑 The pull request has been removed from the queue default

The queue conditions cannot be satisfied due to failing checks.

You can take a look at Queue: Embarked in merge queue check runs for more details.

In case of a failure due to a flaky test, you should first retrigger the CI.
Then, re-embark the pull request into the merge queue by posting the comment
@mergifyio refresh on the pull request.

This commit resolves a bug where node labels with empty values
are processed for the crush_location mount option,
leading to invalid mount options and subsequent mount failures.

Signed-off-by: Praveen M <[email protected]>
@iPraveenParihar iPraveenParihar force-pushed the fix/empty-label-value-for-crushlocation branch from ffc0f09 to 493875f Compare July 16, 2024 07:16
@mergify mergify bot added the ok-to-test Label to trigger E2E tests label Jul 16, 2024
@ceph-csi-bot (Collaborator) triggered the E2E test jobs:

/test ci/centos/k8s-e2e-external-storage/1.30
/test ci/centos/upgrade-tests-cephfs
/test ci/centos/mini-e2e-helm/k8s-1.30
/test ci/centos/k8s-e2e-external-storage/1.29
/test ci/centos/upgrade-tests-rbd
/test ci/centos/k8s-e2e-external-storage/1.27
/test ci/centos/mini-e2e/k8s-1.30
/test ci/centos/k8s-e2e-external-storage/1.28
/test ci/centos/mini-e2e-helm/k8s-1.29
/test ci/centos/mini-e2e-helm/k8s-1.27
/test ci/centos/mini-e2e/k8s-1.29
/test ci/centos/mini-e2e-helm/k8s-1.28
/test ci/centos/mini-e2e/k8s-1.27
/test ci/centos/mini-e2e/k8s-1.28

@ceph-csi-bot ceph-csi-bot removed the ok-to-test Label to trigger E2E tests label Jul 16, 2024
@nixpanic
Member

/retest ci/centos/mini-e2e/k8s-1.27

@nixpanic
Member

@Mergifyio requeue

ci/centos/mini-e2e/k8s-1.27 failed to clone the repository from github

Contributor

mergify bot commented Jul 16, 2024

requeue

✅ The queue state of this pull request has been cleaned. It can be re-embarked automatically

@mergify mergify bot merged commit f11fa81 into ceph:devel Jul 16, 2024
40 checks passed