Adding initiator validation for optimised and inaccessible paths #4468

manasagowri · 2025-02-22T08:19:37Z

Adding initiator side validation for NVMe HA to verify optimised and inaccessible paths.

Logic written:

Fetch namespaces being served by the gateway to be failed in failover
Fetch corresponding devices from initiator and verify that the given gateway is optimised for them before failover
Fetch the gateway serving the above namespaces post failover
Fetch corresponding devices from initiator and verify that the new gateway is optimised for them after failover
Fetch corresponding devices from initiator and verify that the original gateway is optimised for them after failback
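
The steps above amount to one check repeated at three stages. A minimal sketch of that idea, assuming a hypothetical mapping from each initiator device to the IP of the gateway whose path is currently optimized. The function name, device names and IPs are illustrative and are not taken from the PR:

```python
def verify_optimized_gateway(stage, devices, optimized_gw_by_device, expected_gw):
    """Raise if any device is not served through expected_gw's optimized path.

    optimized_gw_by_device is a hypothetical mapping such as
    {"/dev/nvme1n1": "10.0.0.1"}, built from initiator-side output.
    """
    mismatched = {
        dev: optimized_gw_by_device.get(dev)
        for dev in devices
        if optimized_gw_by_device.get(dev) != expected_gw
    }
    if mismatched:
        raise AssertionError(
            f"[{stage}] expected {expected_gw} to be optimized, got {mismatched}"
        )
    return True


# Illustrative data only: two devices backed by namespaces owned by gw1.
devices = ["/dev/nvme1n1", "/dev/nvme1n2"]
gw1, gw2 = "10.0.0.1", "10.0.0.2"

# Before failover, the gateway about to be failed must be optimized.
verify_optimized_gateway("pre-failover", devices, dict.fromkeys(devices, gw1), gw1)
# After failover, the gateway that took over must be optimized.
verify_optimized_gateway("post-failover", devices, dict.fromkeys(devices, gw2), gw2)
# After failback, the original gateway must be optimized again.
verify_optimized_gateway("post-failback", devices, dict.fromkeys(devices, gw1), gw1)
```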

Logs - http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-LE1K03/

Signed-off-by: manasagowri <[email protected]>

sunilkumarn417

Hi @manasagowri ,
Thanks for adding the new validation at the initiator level. I have a few comments on the code changes.
Let's discuss these comments further.

sunilkumarn417 · 2025-02-24T17:02:46Z

tests/nvmeof/workflows/ha.py

@@ -692,6 +692,10 @@ def failover(self, gateway, fail_tool):
                    LOG.info(
                        f"{list(active[0])} is new and only Active GW for failed {hostname}"
                    )
+                    active_gw = list(active[0])


First, let's add the client namespace validation in ha.run and leave the main definition untouched.

Return the active gateway node that takes over for the failed gateway, in a dict.
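
One possible shape for that return value, as a sketch only. It assumes `active` has the form seen in the diff (where `list(active[0])` yields the new active gateway entries); the helper name and the sample hostnames are hypothetical:

```python
def build_failover_map(failed_hostname, active):
    """Map the failed gateway to the gateway(s) that took over for it.

    `active` mirrors the variable in the diff; the dict layout returned
    here is an assumption, not the PR's final design.
    """
    return {failed_hostname: list(active[0])}


# Illustrative values only.
failover_map = build_failover_map("gw-node-1", [("gw-node-2",)])
print(failover_map)
```

ha.run could then read the new active gateway from this dict and hand it to the initiator-side validation, without the check living inside failover itself.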

sunilkumarn417 · 2025-02-24T17:04:04Z

tests/nvmeof/workflows/ha.py

@@ -896,6 +905,26 @@ def validate_incremetal_io(write_samples):

        LOG.info("IO Validation is Successfull on all RBD images..")

+    def validate_initiator(self, gateway, namespaces, ana_id):


Let's call this from ha.run().

sunilkumarn417 · 2025-02-24T17:11:47Z

tests/nvmeof/workflows/ha.py

+                        fail_gw_ana_ids.append(gw.ana_group_id)
+                        ns = self.fetch_namespaces(gw, [gw.ana_group_id])
+                        namespaces.extend(ns)
+                        self.validate_initiator(gw, ns, gw.ana_group_id)


In validate_initiator:

1. At the initial stage, pass the active gateway node object (the one to be failed over next).

2. Fetch the namespaces for this GW from both the gateway and the client side.

3. Validate the optimized path (i.e., the active gateway that was passed).

4. At the failover stage, collect the active gateway from the failover methods.

5. Use this active GW against the namespaces found in step 2 to validate that the optimized path on the client is set to the new active GW.

6. At the failback stage, collect the failed-over gateway.

7. Repeat step 2.

manasagowri · 2025-02-26

This is exactly the validation I have already added in the validate_initiator and ha.run methods.

sunilkumarn417 · 2025-02-24T17:12:29Z

tests/nvmeof/workflows/initiator.py

+            empty list if the gateway is inaccessible for all devices
+        """
+        out, _ = self.node.exec_command(
+            cmd="nvme list --output-format json", sudo=True


This is already part of the NVMe initiator lib; if it is not, add it as a lib method.

manasagowri · 2025-02-25

I have now seen the lib method and will make the changes accordingly.

manasagowri · 2025-02-26

This might not be required once we go with the UUID approach, as discussed.

sunilkumarn417 · 2025-02-24T17:15:38Z

tests/nvmeof/workflows/initiator.py

+        out, _ = self.node.exec_command(
+            cmd="nvme list --output-format json", sudo=True
+        )
+        nvme_devices = json.loads(out).get("Devices", [])


On the client side:

Identify the right devices for the namespaces passed from the GW side (probably using lsblk -o name,wwn).

Get the topology for each device (i.e., the path status for each GW path).

Validate against the GW that has the optimized path.

manasagowri · 2025-02-25

nvme list-subsys gives the namespace ID and subsystem name. Isn't that enough to identify the required namespaces?
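
For comparison, the lsblk-based lookup suggested in the review could be sketched roughly as below. It parses the output of `lsblk -o name,wwn --json` and assumes that the WWN reported for an NVMe namespace ends with the namespace UUID known on the gateway side; that matching rule, the function name and the sample data are assumptions for illustration:

```python
import json

# Illustrative lsblk JSON; real output depends on the host.
SAMPLE_LSBLK = """
{"blockdevices": [
  {"name": "sda", "wwn": null},
  {"name": "nvme1n1", "wwn": "uuid.11111111-2222-3333-4444-555555555555"},
  {"name": "nvme1n2", "wwn": "uuid.aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"}
]}
"""


def devices_for_namespaces(lsblk_json, namespace_uuids):
    """Return {namespace_uuid: device_path} for UUIDs found in lsblk output."""
    wanted = set(namespace_uuids)
    found = {}
    for dev in json.loads(lsblk_json).get("blockdevices", []):
        wwn = dev.get("wwn") or ""  # devices without a WWN report null
        for uuid in wanted:
            # Assumption: the WWN carries the namespace UUID as its suffix.
            if wwn.endswith(uuid):
                found[uuid] = f"/dev/{dev['name']}"
    return found


matched = devices_for_namespaces(
    SAMPLE_LSBLK, ["aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee"]
)
print(matched)
```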

sunilkumarn417 · 2025-02-24T17:17:23Z

tests/nvmeof/workflows/initiator.py

+            subsystems = json.loads(out)[0].get("Subsystems")
+            for subsys in subsystems:
+                for path in subsys.get("Paths"):
+                    if gw_ip in path.get("Address") and path.get("State") == "live" and path.get("ANAState") == "optimized":


Create a separate method that lists the optimized and inaccessible paths for a device, returning something like
{"device_wwn": {"optimized": [], "inaccessible": []}}, and use it to validate the paths.

manasagowri · 2025-02-25

Will make this change as well.
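
A rough sketch of such a helper, assuming the `nvme list-subsys` JSON layout implied by the diff above (a list whose entries hold `Subsystems`, each with `Paths` carrying `Address`, `State` and `ANAState`). The function name, the sample addresses and the NQN are hypothetical:

```python
import json

# Illustrative output only, shaped after the keys used in the diff.
SAMPLE_LIST_SUBSYS = """
[{"Subsystems": [{"NQN": "nqn.2016-06.io.spdk:cnode1", "Paths": [
  {"Address": "traddr=10.0.0.1,trsvcid=4420", "State": "live", "ANAState": "optimized"},
  {"Address": "traddr=10.0.0.2,trsvcid=4420", "State": "live", "ANAState": "inaccessible"}
]}]}]
"""


def paths_by_ana_state(list_subsys_json):
    """Group the addresses of live paths by their ANA state."""
    result = {"optimized": [], "inaccessible": []}
    for entry in json.loads(list_subsys_json):
        for subsys in entry.get("Subsystems", []):
            for path in subsys.get("Paths", []):
                if path.get("State") != "live":
                    continue  # skip paths that are not currently connected
                state = path.get("ANAState")
                if state in result:
                    result[state].append(path.get("Address"))
    return result


# Keyed by a placeholder WWN, matching the structure requested in the review.
device_paths = {"device_wwn": paths_by_ana_state(SAMPLE_LIST_SUBSYS)}
print(device_paths)
```

Validation then reduces to checking that the expected gateway IP appears in an `optimized` address and that the other gateways appear only under `inaccessible`.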

Adding initiator validation for optimised and inaccessible paths

c90589a

Signed-off-by: manasagowri <[email protected]>

manasagowri requested review from Manohar-Murthy and HaruChebrolu as code owners February 22, 2025 08:19

manasagowri requested a review from sunilkumarn417 February 22, 2025 08:19

manasagowri marked this pull request as draft February 22, 2025 08:19

sunilkumarn417 reviewed Feb 24, 2025


