Add static and runtime dag info, API to fetch ancestor and successor tasks #2124

Merged
talsperre merged 42 commits into master from dev/add-runtime-dag-info on Feb 20, 2025

Conversation

talsperre
Collaborator

@talsperre talsperre commented Oct 31, 2024

Add runtime DAG info so that we can query the ancestor and successor tasks for a given task easily.

Usage

from metaflow import Task, namespace
namespace(None)
task = Task('RuntimeDAGFlow/18/step_c/32076012', attempt=0)

To get the ancestors and successors of a task, use the following API:

ancestors = task.ancestors
successors = task.successors

The output would be a list of metaflow Task objects.
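
To make the relationship between the two properties concrete, here is a minimal sketch that resolves immediate ancestors and successors from a plain adjacency mapping. It assumes that "ancestors" means the directly preceding tasks, and the `EDGES` dictionary, the helper functions, and the `step_b` and `step_d` pathspecs are all hypothetical. It is not how Metaflow resolves these relationships internally.

```python
from typing import Dict, List

# Hypothetical runtime DAG: each task pathspec maps to the pathspecs of its
# immediate successors. Only the step_c pathspec comes from the usage above;
# the step_b and step_d pathspecs are invented for illustration.
EDGES: Dict[str, List[str]] = {
    "RuntimeDAGFlow/18/step_b/1": ["RuntimeDAGFlow/18/step_c/32076012"],
    "RuntimeDAGFlow/18/step_c/32076012": ["RuntimeDAGFlow/18/step_d/2"],
    "RuntimeDAGFlow/18/step_d/2": [],
}


def successors(pathspec: str) -> List[str]:
    """Return the tasks that run directly after `pathspec`."""
    return list(EDGES.get(pathspec, []))


def ancestors(pathspec: str) -> List[str]:
    """Return the tasks that run directly before `pathspec`."""
    return [parent for parent, children in EDGES.items() if pathspec in children]


if __name__ == "__main__":
    task = "RuntimeDAGFlow/18/step_c/32076012"
    print(ancestors(task))
    print(successors(task))
```

In the real client the same question is answered by querying task metadata rather than an in-memory mapping, which is why the feature depends on the metadata service.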

@talsperre talsperre force-pushed the dev/add-runtime-dag-info branch from 48c771d to ec43f14 Compare November 1, 2024 18:34
@talsperre talsperre force-pushed the dev/add-runtime-dag-info branch from ffbf68a to c6fb9ac Compare January 2, 2025 23:25
@talsperre talsperre changed the title Add static and runtime dag info, API to fetch ancestor tasks Add static and runtime dag info, API to fetch ancestor and successor tasks Jan 7, 2025
@talsperre talsperre force-pushed the dev/add-runtime-dag-info branch 2 times, most recently from d66d32b to 7644058 Compare January 12, 2025 03:12
Contributor

@romain-intel romain-intel left a comment

A few comments. I think it's pretty close though. I haven't looked at the metadata service changes. We may also want to raise a better error message if the service is not new enough?

@talsperre talsperre force-pushed the dev/add-runtime-dag-info branch from 17a4489 to 7cdfb41 Compare January 15, 2025 00:53
@talsperre talsperre force-pushed the dev/add-runtime-dag-info branch from a8df33d to 7833e40 Compare January 22, 2025 08:30
Collaborator

@savingoyal savingoyal left a comment

quick UX feedback - let me know if in the new proposed UX we miss out on any use cases. i am reviewing the rest of the PR meanwhile.

Contributor

@romain-intel romain-intel left a comment

not a review -- just some comments.

@talsperre talsperre force-pushed the dev/add-runtime-dag-info branch from 75c1301 to bc9e456 Compare February 12, 2025 11:49
            yield Task(pathspec=task_pathspec, _namespace_check=False)

    @property
    def parent_tasks(self) -> List["Task"]:
Collaborator

would be good to handle the case where the user is inspecting a task that ran using an old version of metaflow or is using an old version of the service...

Comment on lines 1147 to 1158
            task_pathspecs = self._metaflow.metadata.filter_tasks_by_metadata(
                flow_id, run_id, step.id, metadata_key, metadata_pattern
            )
        except Exception as e:
            if e.http_code == 404:
                # filter_tasks_by_metadata endpoint does not exist in the version of metadata service
                # deployed currently. Raise a more informative error message.
                raise MetaflowInternalError(
                    "The version of metadata service deployed currently does not support filtering tasks by metadata. "
                    "Upgrade to a newer version of Metadata service to use this feature."
                ) from e

Collaborator

this code block can be within the service implementation and not in the client. the client shouldn't have to bother about the actual implementation semantics of metadata.

Collaborator Author

If the endpoint does not exist in the metadata service then how do you even return that error message?

Collaborator

this try-except block can be in service.py rather than in the client.

Collaborator Author

Okay, that sounds reasonable.
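
The outcome of this thread, moving the try-except into the service layer, could look roughly like the sketch below. The exception classes are local stand-ins so the example runs on its own, and `ServiceMetadataProvider`, its `_request` helper, and the endpoint path are hypothetical placeholders rather than Metaflow's actual service code.

```python
from typing import List, Optional


class MetaflowException(Exception):
    """Local stand-in for Metaflow's base exception (assumption)."""


class ServiceException(MetaflowException):
    """Local stand-in for the service error; carries the HTTP status code."""

    def __init__(self, msg: str, http_code: Optional[int] = None):
        super().__init__(msg)
        self.http_code = http_code


class MetaflowInternalError(MetaflowException):
    """Local stand-in for the error surfaced to the user."""


class ServiceMetadataProvider:
    """Hypothetical service-backed metadata provider."""

    def __init__(self, endpoint_available: bool):
        self._endpoint_available = endpoint_available

    def _request(self, path: str) -> List[str]:
        # Placeholder for the real HTTP call: simulate a 404 from an old service.
        if not self._endpoint_available:
            raise ServiceException("Not found: %s" % path, http_code=404)
        return ["RuntimeDAGFlow/18/step_c/32076012"]

    def filter_tasks_by_metadata(
        self, flow_id: str, run_id: str, step_name: str, key: str, pattern: str
    ) -> List[str]:
        # Hypothetical path; the real endpoint may be named differently.
        path = "/flows/%s/runs/%s/steps/%s/filtered_tasks" % (flow_id, run_id, step_name)
        try:
            return self._request(path)
        except ServiceException as e:
            if e.http_code == 404:
                # The endpoint is missing on the deployed service, so translate
                # the raw 404 into an actionable message here, not in the client.
                raise MetaflowInternalError(
                    "The version of metadata service deployed currently does not "
                    "support filtering tasks by metadata. Upgrade to a newer "
                    "version of Metadata service to use this feature."
                ) from e
            raise  # Any other service error propagates unchanged.
```

With this layout the client simply calls `filter_tasks_by_metadata` and never inspects HTTP status codes. Catching `ServiceException` rather than a bare `Exception` also avoids an `AttributeError` on errors that have no `http_code`.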

@@ -160,6 +160,15 @@ def __init__(self, msg, unhandled):
        self.artifact_names = unhandled


class ServiceException(MetaflowException):
Collaborator

why move this here? - this is tied to the service implementation and ideally shouldn't be pulled into the core

                # deployed currently. Raise a more informative error message.
                raise MetaflowInternalError(
                    "The version of metadata service deployed currently does not support filtering tasks by metadata. "
                    "Upgrade to a newer version of Metadata service to use this feature."
Collaborator

can you add the metaflow service version that they need to upgrade to? for example, if the latest release is 2.3.4, the message should say to upgrade to >=2.4.0

Collaborator

cc @saikonen to bump the version of metaflow-service when the endpoint is released.
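
One way to include the required version in the message is sketched below. The `MIN_SERVICE_VERSION` value is only the reviewer's hypothetical number from the comment above, not a real release, and `parse_version` and `check_service_version` are illustrative helpers that do not exist in Metaflow. The sketch also assumes purely numeric, dot-separated version strings.

```python
from typing import Tuple

# Placeholder taken from the reviewer's hypothetical example; the actual
# minimum version is whatever release first ships the endpoint.
MIN_SERVICE_VERSION = "2.4.0"


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '2.3.4' into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def check_service_version(deployed: str, required: str = MIN_SERVICE_VERSION) -> None:
    """Raise an error naming the required version if the service is too old."""
    if parse_version(deployed) < parse_version(required):
        raise RuntimeError(
            "The deployed metadata service (version %s) does not support "
            "filtering tasks by metadata. Upgrade to version >=%s to use "
            "this feature." % (deployed, required)
        )


if __name__ == "__main__":
    check_service_version("2.4.0")  # New enough: no error is raised.
    try:
        check_service_version("2.3.4")
    except RuntimeError as err:
        print(err)
```

Comparing tuples of integers rather than raw strings keeps versions such as 2.10.0 ordered after 2.4.0.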

@talsperre talsperre merged commit 333eeeb into master Feb 20, 2025
29 checks passed
@talsperre talsperre deleted the dev/add-runtime-dag-info branch February 20, 2025 21:40
4 participants