Adding initiator validation for optimised and inaccessible paths #4468
base: main
Signed-off-by: manasagowri <[email protected]>
Hi @manasagowri,
Thanks for adding the new validation at the initiator level. I have a few comments on the code changes.
Let's discuss these comments further.
@@ -692,6 +692,10 @@ def failover(self, gateway, fail_tool):
LOG.info(
f"{list(active[0])} is new and only Active GW for failed {hostname}"
)
active_gw = list(active[0])
- First, let's add the client namespace validation in ha.run and leave the main definition untouched.
- Return the active gateway node that is in charge of the failed-over GW, in a dict.
@@ -896,6 +905,26 @@ def validate_incremetal_io(write_samples):

LOG.info("IO Validation is Successfull on all RBD images..")

def validate_initiator(self, gateway, namespaces, ana_id):
Let's call this from ha.run().
fail_gw_ana_ids.append(gw.ana_group_id)
ns = self.fetch_namespaces(gw, [gw.ana_group_id])
namespaces.extend(ns)
self.validate_initiator(gw, ns, gw.ana_group_id)
In validate_initiator:
1. At the initial stage, pass the active gateway node object (the one that is to be failed over next).
2. Fetch the namespaces for this GW from both the gateway side and the client side.
3. Validate the optimized path (i.e., the active gateway that was passed).
4. At the failover stage, collect the active gateway from the failover methods.
5. Use this active GW against the namespaces found in step 2 to validate that the optimized path on the client is set to the new active GW.
6. At the failback stage, collect the failed-over gateway.
7. Repeat step 2.
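To make the three stages concrete, here is a minimal sketch of the check each stage could run. It is not the code in this PR: `check_optimized_gateway`, the `path_states` mapping (gateway IP to ANA state as seen on the client), and the IP addresses are all hypothetical names and values chosen for illustration.

```python
def check_optimized_gateway(path_states, expected_gw_ip):
    """Return a list of problems; an empty list means the check passed.

    path_states is a hypothetical mapping of gateway IP -> ANA state
    ("optimized" or "inaccessible") as observed on the client.
    """
    problems = []
    optimized = sorted(ip for ip, state in path_states.items() if state == "optimized")
    if optimized != [expected_gw_ip]:
        problems.append(
            f"expected only {expected_gw_ip} to be optimized, found {optimized}"
        )
    return problems


# Hypothetical client-side observations at each stage.
stages = [
    ("initial", {"10.0.0.1": "optimized", "10.0.0.2": "inaccessible"}, "10.0.0.1"),
    ("failover", {"10.0.0.2": "optimized"}, "10.0.0.2"),
    ("failback", {"10.0.0.1": "optimized", "10.0.0.2": "inaccessible"}, "10.0.0.1"),
]

for stage, observed, expected_gw in stages:
    result = check_optimized_gateway(observed, expected_gw)
    print(stage, "OK" if not result else result)
```

The same helper is reused at every stage; only the expected gateway changes (the original active GW, then the GW that took over, then the original GW again after failback).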
This is exactly the validation I have already added in the validate_initiator and ha.run methods
empty list if the gateway is inaccessible for all devices
"""
out, _ = self.node.exec_command(
cmd="nvme list --output-format json", sudo=True
This is already part of the NVMe initiator lib; if it is not, add it there as a lib method.
I saw the lib method now. Will make changes accordingly.
This might not be required once we go with the UUID approach, as discussed.
out, _ = self.node.exec_command(
cmd="nvme list --output-format json", sudo=True
)
nvme_devices = json.loads(out).get("Devices", [])
On the client side:
- Identify the right devices for the namespace passed from the GW side (probably using lsblk -o name,wwn).
- Get the topology for that device (i.e., the path status for each GW path).
- Validate against the GW that has the optimized path.
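For the first step, a rough sketch of matching a namespace to client devices by WWN. It assumes lsblk is run with JSON output (`lsblk -o name,wwn --json`) and that the namespace UUID reported by the gateway appears inside the device's WWN string; that second assumption needs to be confirmed on a real client. `devices_for_uuid` and the sample values are hypothetical.

```python
import json


def devices_for_uuid(lsblk_output, ns_uuid):
    """Return names of block devices whose WWN contains ns_uuid.

    lsblk_output is the JSON text printed by `lsblk -o name,wwn --json`.
    Assumption: the namespace UUID shows up inside the WWN string.
    """
    devices = json.loads(lsblk_output).get("blockdevices", [])
    return [
        dev["name"]
        for dev in devices
        # Devices without a WWN are reported as null, so fall back to "".
        if ns_uuid.lower() in (dev.get("wwn") or "").lower()
    ]


# Hypothetical lsblk output, for illustration only.
sample_output = json.dumps(
    {
        "blockdevices": [
            {"name": "sda", "wwn": None},
            {"name": "nvme0n1", "wwn": "uuid.1b9a8c2e-0000-4000-8000-000000000001"},
            {"name": "nvme0n2", "wwn": "uuid.1b9a8c2e-0000-4000-8000-000000000002"},
        ]
    }
)

print(devices_for_uuid(sample_output, "1b9a8c2e-0000-4000-8000-000000000002"))
```

In the test itself the JSON text would come from running the command on the client node rather than from a literal string.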
nvme list-subsys gives the namespace ID and subsystem name; isn't that enough to identify the required namespaces?
subsystems = json.loads(out)[0].get("Subsystems")
for subsys in subsystems:
for path in subsys.get("Paths"):
if gw_ip in path.get("Address") and path.get("State") == "live" and path.get("ANAState") == "optimized":
Create a separate method that lists the optimized and inaccessible paths for a device, e.g. {"device_wwn": {"optimized": [], "inaccessible": []}}, and use it for path validation.
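Something along these lines, as a rough sketch only. It reuses the JSON keys already read in this diff ("Subsystems", "Paths", "Address", "ANAState") and assumes the input is the parsed output of nvme list-subsys for a single device; `paths_by_ana_state`, the WWN key, and the sample addresses are hypothetical.

```python
def paths_by_ana_state(device_wwn, list_subsys_json):
    """Group path addresses for one device by ANA state.

    list_subsys_json is the parsed JSON of `nvme list-subsys` for that
    device; the key names follow the ones used in this diff.
    """
    states = {"optimized": [], "inaccessible": []}
    for host in list_subsys_json:
        for subsys in host.get("Subsystems", []):
            for path in subsys.get("Paths", []):
                ana_state = path.get("ANAState")
                # Ignore any ANA state other than the two we validate.
                if ana_state in states:
                    states[ana_state].append(path.get("Address"))
    return {device_wwn: states}


# Hypothetical parsed output, for illustration only.
sample = [
    {
        "Subsystems": [
            {
                "Paths": [
                    {"Address": "traddr=10.0.0.1,trsvcid=4420", "State": "live", "ANAState": "optimized"},
                    {"Address": "traddr=10.0.0.2,trsvcid=4420", "State": "live", "ANAState": "inaccessible"},
                ]
            }
        ]
    }
]

print(paths_by_ana_state("uuid.example-wwn", sample))
```

The validation then reduces to checking that the expected gateway IP appears in one of the "optimized" addresses and in none of the "inaccessible" ones.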
Will make this change as well.
Adding initiator side validation for NVMe HA to verify optimised and inaccessible paths.
Logic written:
Logs - http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-LE1K03/