Specify the result of queryaction operation #2068

Open
relu91 opened this issue Jan 8, 2025 · 16 comments
Labels
Defer to TD 2.0; Has Use Case Potential (The use case can be extracted and explained); manageable affordances (discussions on representing long running affordances that need to be managed)

Comments

@relu91
Member

relu91 commented Jan 8, 2025

The current specification for the queryaction operation lacks clarity regarding the expected result. We have the corresponding data schema definition for other operations, while here, we don't. Probably, we should discuss introducing an additional field next to input and output called status where one can define the shape of data coming as result of queryaction or in the future observeaction.

@github-actions bot added the needs-triage label (Automatically added to new issues. TF should triage them with proper labels) Jan 8, 2025
@egekorkan added the manageable affordances label (discussions on representing long running affordances that need to be managed) and removed the needs-triage label Jan 8, 2025
@zolkis

zolkis commented Jan 8, 2025

Querying an action is morphologically similar to a microservice API, where the questions are:

  • what are the query parameters?
  • what are the return values corresponding to query input (list of parameters)?
    This can be a Thing interface in its most generic form, as the query parameters could be anything with a schema.

I suggest serious simplifications here, which we can safely make IMHO.
The basic meaning of an action affordance is to start an async (or sync) operation that has an outcome and can possibly be monitored and controlled.
So we need means to start (invoke), cancel (with or without guarantees), stop(pause)/resume (if supported), get progress (percentage) and other information from the action performer(s).

That means WoT specifications MUST contain algorithms describing what kind of service to start together with an action, and what its lifecycle and interface are. Things might choose not to implement these and report an error or just time out. On the client side, the WoT spec should contain algorithmic steps on how to deal with these situations.

I propose to start with a simple solution that allows later extensions and composition.

  • Invoking an action should return a control object, which communicates with the action performer.
  • The client may initiate canceling the action, which may fail or succeed (can be modeled with a Promise).
  • The client may query a status of the action, which semantically may be ("running", "completed", "error", ...), but there might be some protocol specific extra information that could be transparently mapped by implementations. So this part could be modeled as two properties on the control object, "last known status" and "additional status information". Getters seem to be enough here, since implementations should cache information available via events/notifications from the other end. However, if we want to be able to also poll the status, we could model both of these use cases by the means of Promises, which can return right away (in the event loop) when information is available, basically behaving as a getter - but also allowing active (async or sync) querying of the status.
    So this part could be modeled by a status() function returning a Promise that resolves with a tuple <status, details>, or with an object that initially contains properties for "status" and "statusDetails", and then possibly others later.
  • We could start with this and skip pause(stop)/resume controls, eventually adding them later.
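
The control object outlined in the list above could look roughly like the following sketch. Python's asyncio stands in for Promises here. The names ActionControl, ActionStatus and _on_notification, as well as the "cancelled" status string, are hypothetical and not taken from any WoT specification or from the Scripting API.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ActionStatus:
    # "status" and "statusDetails" follow the property names suggested above.
    status: str                          # e.g. "running", "completed", "error"
    statusDetails: Optional[Any] = None  # protocol-specific extras, passed through


class ActionControl:
    """Hypothetical control object returned by invoking an action."""

    def __init__(self) -> None:
        # Last known status, cached from notifications sent by the performer.
        self._last = ActionStatus("running")

    def _on_notification(self, status: str, details: Any = None) -> None:
        # A binding would call this when the other end reports a change.
        self._last = ActionStatus(status, details)

    async def status(self) -> ActionStatus:
        # Resolves right away with the cached value; a real binding could
        # await a protocol-level query here instead.
        return self._last

    async def cancel(self) -> bool:
        # Cancelling may fail or succeed; in this sketch it only succeeds
        # while the action is still running.
        if self._last.status != "running":
            return False
        self._last = ActionStatus("cancelled")
        return True


async def demo() -> list:
    control = ActionControl()
    seen = [(await control.status()).status]
    control._on_notification("completed", {"progress": 100})
    seen.append((await control.status()).status)
    seen.append(await control.cancel())  # too late, the action already completed
    return seen


result = asyncio.run(demo())
```

The same status() call covers both the cached "getter" case and active polling, which is the point made in the third bullet.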

@benfrancis
Member

benfrancis commented Jan 8, 2025

@relu91 wrote:

The current specification for the queryaction operation lacks clarity regarding the expected result. We have the corresponding data schema definition for other operations, while here, we don't.

I agree it is unclear how queryaction is meant to work.

Probably, we should discuss introducing an additional field next to input and output called status where one can define the shape of data coming as result of queryaction or in the future observeaction.

This is an interesting idea that I'm not sure has been suggested before. One complication is the difference in behaviour between synchronous and asynchronous actions. With synchronous actions there may just be an input in a request and an output in a response. With asynchronous actions there may be no output in an initial response, but the response to a queryaction operation may contain the output as part of the status. How do you deal with the relationship between the output schema and the status schema?

@zolkis wrote:

we need means to start (invoke), cancel (with or without guarantees), stop(pause)/resume (if supported), get progress (percentage) and other information from the action performer(s).

Note that we currently have invokeaction, queryaction, cancelaction and queryallactions operations. There is no operation to pause an action and the expected response to a queryaction operation is unclear.

I propose to start with a simple solution that allows later extensions and composition.

  • Invoking an action should return a control object, which communicates with the action performer.
  • The client may initiate canceling the action, which may fail or succeed (can be modeled with a Promise).
  • The client may query a status of the action, which semantically may be ("running", "completed", "error", ...), but there might be some protocol specific extra information that could be transparently mapped by implementations. So this part could be modeled as two properties on the control object, "last known status" and "additional status information". Getters seem to be enough here, since implementations should cache information available via events/notifications from the other end. However, if we want to be able to also poll the status, we could model both of these use cases by the means of Promises, which can return right away (in the event loop) when information is available, basically behaving as a getter - but also allowing active (async or sync) querying of the status. So this part could be modeled by a status() function returning a Promise that resolves with a tuple <status, details>, or with an object that initially contains properties for "status" and "statusDetails", and then possibly others later.
  • We could start with this and skip pause(stop)/resume controls, eventually adding them later.

You are describing this as if it's a JavaScript API, but a Thing Description is just metadata which needs to be able to describe a range of different ways actions may be modelled in different protocols.

Just a reminder of the existing work done on this in the HTTP Basic Profile which provides an example of how synchronous and asynchronous actions may be modelled in HTTP, including an example of an ActionStatus payload format. Also the previous discussion in #1408 and #1665 about the ambiguity around queryaction and cancelaction operations. (The lack of a way to describe dynamic resources is also a related problem).

@zolkis

zolkis commented Jan 8, 2025

Note that we currently have invokeaction, queryaction, cancelaction and queryallactions operations.

Right, we also need to specify the result of queryallactions op.

a Thing Description is just metadata which needs to be able to describe a range of different ways actions may be modelled in different protocols.

Yes, we can start doing that :). I deliberately used scripting terms, partly continuing today's Scripting discussion in the wrong repo/issue :), but also as an input/nudge from a developer point of view. Modeling should allow security hardening, for which validation (in the most commonly used platforms) is essential.

@benfrancis
Member

How do you deal with the relationship between the output schema and the status schema?

I suppose you could use a JSON Pointer to include the output schema in the status schema...?

My next question would be how does a Consumer know what the status schema means?

@relu91
Member Author

relu91 commented Jan 10, 2025

Thank you for all the input. As discussed in the call, I'm afraid that the simple solution presented in my first post does not really cover the complexity of our use case space. It is better to look for a more robust and consistent proposal, and the discussion should be framed alongside the existing discussions around "manageable actions" and/or "how to express relationships between affordances". There we should handle all the complex scenarios described above by @benfrancis, with maybe the exception of

My next question would be how does a Consumer know what the status schema means?

Which I think would be pretty dependent on (exposer or consumer) application logic (or action types).

Here we should answer how we are going to deal with the grey area left by the spec for TD 1.1 implementers. I think that, given nothing is guaranteed, we should always assume the worst (in the absence of out-of-band information, like the protocol used).

Yes, we can start doing that :). I deliberately used scripting terms, partly continuing today's Scripting discussion in the wrong repo/issue :), but also as an input/nudge from a developer point of view. Modeling should allow security hardening, for which validation (in the most commonly used platforms) is essential.

Let's continue the discussion there.

@egekorkan
Contributor

TD Call Today:

  • Any fix for TD 1.1 would be at least an errata. We can fix it for TD 2.0.
  • Leave it underspecified for TD 1.1 and let Consumer applications (not necessarily just Scripting API based ones) find out the behavior and data structure with "trial and error" or out-of-band documentation.
  • In Scripting API: the behavior will be left open and correctly specified later on.
  • In TD 1.1: No changes will be done to the spec
  • In TD 2.0: The issue will be fixed
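
Since TD 1.1 leaves the shape of a queryaction response open, a Consumer can only interpret it defensively. A minimal sketch of what that might mean in practice, assuming a JSON payload: the function name interpret_query_result and the "status" member it looks for are hypothetical, and the three status strings are just the ones mentioned earlier in this thread.

```python
import json
from typing import Any, Dict

# Assumption: status strings taken from the earlier comment, not from any spec.
KNOWN_STATES = {"running", "completed", "error"}


def interpret_query_result(raw: bytes) -> Dict[str, Any]:
    """Best-effort reading of an underspecified queryaction response."""
    try:
        payload = json.loads(raw)
    except ValueError:
        # Not JSON at all (or not decodable): keep the bytes, claim nothing.
        return {"state": "unknown", "raw": raw}
    if isinstance(payload, dict):
        state = payload.get("status")
        if isinstance(state, str) and state in KNOWN_STATES:
            return {"state": state, "raw": payload}
    # Valid JSON, but not a shape this Consumer recognises.
    return {"state": "unknown", "raw": payload}
```

Anything the Consumer cannot recognise is reported as "unknown" together with the untouched payload, so application code or out-of-band documentation can still make sense of it.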

@benfrancis
Member

benfrancis commented Feb 11, 2025

I keep thinking about @relu91's idea of a status data schema and whether it could actually work, as part of a wider solution.

Building on the example I gave in w3c/wot-binding-templates#408, I've tried to describe the action queue defined in the HTTP Basic Profile using a status data schema.

{
  "actions": {
    "fade": {
      "title": "Fade",
      "description": "Fade the lamp to a given level",
      "synchronous": false,
      "input": {
        "type": "object",
        "properties": {
          "level": {
            "type": "integer",
            "minimum": 0,
            "maximum": 100,
            "unit": "percent"
          },
          "duration": {
            "type": "integer",
            "minimum": 0,
            "unit": "milliseconds"
          }
        }
      },
      "output": {
        "type": "boolean"
      },
      "status": {
        "type": "object",
        "properties": {
          "status": {
            "type": "string",
            "enum": [
              "pending",
              "running",
              "completed",
              "failed"
            ]
          },
          "output": {
            "$ref": "#../../../output"
          },
          "error": {
            "type": "object"
          },
          "href": {
            "type": "string",
            "format": "uri",
            "const": "./actions/fade/{id}"
          },
          "timeRequested": {
            "type": "string",
            "format": "date-time"
          },
          "timeEnded": {
            "type": "string",
            "format": "date-time"
          }
        },
        "required": [
          "status"
        ]
      },
      "forms": [
        {
          "op": "invokeaction",
          "href": "actions/fade",
          "htv:methodName": "POST",
          "response": {
            "htv:statusCodeNumber": 201,
            "htv:headers": [
              {
                "htv:fieldName": "Location",
                "htv:fieldValue": "actions/fade/{id}"
              }
            ]
          }
        },
        {
          "op": "queryaction",
          "href": "actions/fade/{id}",
          "htv:methodName": "GET"
        },
        {
          "op": "cancelaction",
          "href": "actions/fade/{id}",
          "htv:methodName": "DELETE"
        }
      ],
      "uriVariables": {
        "id": {
          "@type": "htv:ActionID",
          "type": "string",
          "description": "identifier of action request"
        }
      }
    }
  },
  "forms": [
    {
      "op": "queryallactions",
      "href": "actions",
      "htv:methodName": "GET"
    }
  ]
}
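
To make the form lookup concrete, here is a rough sketch of how a Consumer might pick the form for an operation from the TD above and fill in the id URI variable. The TD is abridged to the forms of the fade action, the base URL and the action request id are made up, and the {id} expansion is a plain string substitution rather than a full URI Template implementation.

```python
from urllib.parse import quote, urljoin

# Abridged copy of the "fade" action forms from the TD above.
FADE_FORMS = [
    {"op": "invokeaction", "href": "actions/fade", "htv:methodName": "POST"},
    {"op": "queryaction", "href": "actions/fade/{id}", "htv:methodName": "GET"},
    {"op": "cancelaction", "href": "actions/fade/{id}", "htv:methodName": "DELETE"},
]


def build_request(forms, op, base, uri_variables=None):
    """Return (method, absolute URL) for the first form whose op matches."""
    # Simplification: "op" is assumed to be a single string, as in the TD above.
    form = next(f for f in forms if f["op"] == op)
    href = form["href"]
    for name, value in (uri_variables or {}).items():
        # Percent-encode the value so it stays within one path segment.
        href = href.replace("{" + name + "}", quote(str(value), safe=""))
    return form["htv:methodName"], urljoin(base, href)


# Hypothetical base URL and action request id.
request = build_request(FADE_FORMS, "queryaction",
                        "https://lamp.example/", {"id": "123e4567"})
```

The id itself would come from the Location header of the invokeaction response, which is the part the TD can currently only hint at.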

This is quite verbose, especially if it needs to be repeated for every action, but it could potentially work. A TD author could use schemaDefinitions to define the action status payload once and then reference it throughout a TD.

Note that I've also used the $ref keyword from JSON Schema (not sure if I used it correctly) to nest the output schema inside the status schema using a JSON Pointer, though that may not work in conjunction with the schemaDefinitions idea above. I'm also not sure if my use of the const keyword is valid.
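
On the $ref question: in JSON Schema the fragment of a $ref is normally a JSON Pointer (RFC 6901) evaluated from the document root, so a relative form such as "#../../../output" would not resolve. An absolute pointer like "#/actions/fade/output" is one way the same intent could be written. The sketch below resolves such a pointer against a trimmed copy of the TD; it handles same-document references only, skips percent-decoding of the fragment, and says nothing about whether a TD processor would accept $ref at all.

```python
from typing import Any


def resolve_pointer(document: Any, ref: str) -> Any:
    """Resolve a same-document "#/a/b" reference (RFC 6901 JSON Pointer)."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled here")
    pointer = ref[1:]
    if pointer and not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {ref!r}")
    node = document
    for token in pointer.split("/")[1:]:
        # Unescape per RFC 6901: "~1" means "/", "~0" means "~" (in that order).
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# TD from the comment above, trimmed to the members involved.
td = {
    "actions": {
        "fade": {
            "output": {"type": "boolean"},
            "status": {
                "properties": {"output": {"$ref": "#/actions/fade/output"}}
            },
        }
    }
}

ref = td["actions"]["fade"]["status"]["properties"]["output"]["$ref"]
resolved = resolve_pointer(td, ref)
```

An absolute pointer ties the status schema to one specific action name, which is part of why combining it with a shared schemaDefinitions entry is awkward.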

Other questions that would need to be answered:

@lu-zero
Contributor

lu-zero commented Feb 11, 2025

We could avoid more ad-hoc operations and/or even thin the number further and unify the behavior by making it more composable, e.g.:

  • properties would have only read and write

  • actions would have only invoke and cancel

  • events would have only subscribe and unsubscribe

  • Observability for properties and actions would be an event bound to the other affordance by a status relationship

  • Synchronous polling would be described by a property bound to the action by the same status relationship.

Reusable components would allow fairly compact TDs.

@benfrancis
Member

benfrancis commented Feb 11, 2025

@lu-zero wrote:

Synchronous polling would be described by a property bound to the action by the same status relationship.

Making action status a property doesn't solve the problem of needing dynamic resources to represent the status of each concurrent action request. It just means the Consumer now needs to reason about the relationship between three different affordances, complicating implementation further.

We could avoid more ad-hoc operations and/or even thin the number further and unify the behavior by making it more composable

Whilst I can see why this might seem neat, I strongly dislike this idea, for reasons I'm happy to expand upon but probably off topic for this issue.

@lu-zero
Contributor

lu-zero commented Feb 11, 2025

"dynamic resources" aren't just for querying the status, so I wouldn't add something ad hoc for that specific situation, just as I'd rather not have something ad hoc like uriVariables when you might want to model something like "take that subset of the output data schema and provide it to the HTTP header when requesting this property".

It isn't much different from modeling a database access as properties e.g. the infamous weather forecast service.

@benfrancis
Member

@lu-zero wrote:

"dynamic resources" aren't just for querying the status

That seems to be a general assumption, which is why we have been waiting for years for a general purpose solution. And yet in practice I've personally never come across another use case when describing a connected device as a Web Thing, have you?

It isn't much different from modeling a database access as properties

Devices are not databases and we are not designing a database API.

I've spent a lot of time trying to find alternative ways of modelling multiple concurrent long-running actions in an HTTP REST API that don't involve creating separate resources to represent status, and I haven't been able to find a good one.

Regardless, if everything has to be describable in a TD then we need to be able to describe real REST APIs, which do model long running operations this way.

I think @relu91 is right that we need a data schema to describe action status, as distinct from the input and output of the action.

@lu-zero
Contributor

lu-zero commented Feb 11, 2025

A printer is a device, you wouldn't want to have a separate resource for each job.

I agree we need at least 1 data schema, possibly 2.

@TallTed
Member

TallTed commented Feb 11, 2025

A printer is a device, you wouldn't want to have a separate resource for each job.

Back when dinosaurs walked the earth, print jobs were submitted to shared printers, and each got its own identifier, which was made known to the submitter. This enabled submitters to watch the queue. They could see that their job had finished, so they would know to go to the delivery area; or they might see that their job had failed, for whatever reason (postscript errors happen when you're writing your own printer driver, hand-coding postscript, etc., as part of your Computer Science coursework; there are also paper jams that can require a resubmission of a page or a job; etc.)....

There are more reasons you might want to have a separate resource for each job.
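
The queue described here, where every submitted job gets its own identifier that the submitter can later use to check on it, can be sketched in a few lines. The class and method names are hypothetical, and the status strings are borrowed from the status enum in the TD example earlier in the thread.

```python
import itertools
from typing import Dict


class PrintQueue:
    """Toy in-memory queue: one status entry per submitted job."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._jobs: Dict[str, str] = {}

    def submit(self, document: str) -> str:
        # Each job gets its own identifier, which is handed back to the
        # submitter. The document content itself is ignored in this sketch.
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = "pending"
        return job_id

    def mark(self, job_id: str, status: str) -> None:
        # Called by the (simulated) printer as the job progresses.
        self._jobs[job_id] = status

    def query(self, job_id: str) -> str:
        # Status of a single job, looked up by its identifier.
        return self._jobs[job_id]

    def query_all(self) -> Dict[str, str]:
        # Snapshot of the whole queue.
        return dict(self._jobs)


queue = PrintQueue()
first = queue.submit("thesis.ps")
second = queue.submit("homework.ps")
queue.mark(first, "failed")       # e.g. a PostScript error or a paper jam
queue.mark(second, "completed")
```

Whether job_id ends up in a per-job URL or is passed as an argument to a single route is a binding decision that this sketch deliberately leaves open.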

@lu-zero
Contributor

lu-zero commented Feb 11, 2025

In that case you have an identifier and you use it on a single route (per command) that is not dynamically produced.

@benfrancis
Member

A printer is a device, you wouldn't want to have a separate resource for each job.

This is actually exactly the kind of use case I have in mind, e.g. sending print jobs to a thermal printer.

Modelling a queue or long running operations as multiple resources seems to be a very established pattern in REST APIs.

The default HTTP method for invokeaction is POST, which is used for creating new resources.

In that case you have an identifier and you use it on a single route (per command) that is not dynamically produced.

I'm not sure how that makes any difference.

@lu-zero
Contributor

lu-zero commented Feb 12, 2025

The problem we have is that, apart from security schemes and uriVariables, we do not have a way to model a property that takes the identifier as an argument.

If we address that we should be fine.

6 participants