Clarify federation extension fields #564

Open
wants to merge 4 commits into
base: draft
m-mohr
Member

@m-mohr m-mohr commented May 9, 2025

@soxofaan
Member

soxofaan commented May 9, 2025

From the standpoint of the federation component, I find the concepts "temporary"/"permanent" hard to work with. In the aggregator I can only observe that a backend is not available now, but I have no idea if it will be back online within 1 minute, 1 day, 1 month or never.

Wouldn't it be better to express "missing" in terms of (un)expected or (un)intended?

@m-mohr
Member Author

m-mohr commented May 9, 2025

Fair, it is pretty similar, but we can also rephrase it.

@christophfriedrich
Contributor

Thanks for plunging forwards Matthias and revising this extension spec -- working with it over the past weeks I had several moments where I misunderstood things or they were not clear to me until you pointed out how they were meant originally, so I appreciate an overhaul!

I think it would help to restructure the document by switching the order of explaining federation:backends and federation:missing, and by adding another section before that:

  1. Clients will assume all responses are a combination of all backends
  2. If this is generally not the case for something (because not all backends support it, or the aggregator doesn't support it for all backends, or other structural reasons), this is advised through federation:backends, a whitelist of backends that should be there
  3. If this aimed-for situation can't be satisfied in a specific instance, this is advised through federation:missing, a blacklist of current exceptions

To me this order makes more sense logically, and it would prevent implementers/learners from going "ahhh, if it's missing [for any reason] I have to put it in missing" before having learned about the other method.

Regarding "only observing that a backend* is not available now":
I would expect an aggregator to check each backend's capabilities, so e.g. if it claims to support GET /files but then that endpoint reports errors, or incorrect syntax, or nothing at all, it's federation:missing (aimed-for situation not satisfiable). But if that endpoint wasn't in the backend's capabilities in the first place, this should trigger a federation:backends entry without that backend. Likewise if the endpoint was in the backend's capabilities but the aggregator chooses to exclude it for its own reasons.
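The distinction described above can be sketched in a few lines of Python. This is only an illustration of the reasoning in this comment, not code from openeo-aggregator or from the extension: the `EndpointStatus` type, its three flags, and the `classify` function are hypothetical names, and the sketch assumes the aggregator already knows each backend's advertised capabilities (for example from configuration or a cache), which may not hold when a backend is completely offline.

```python
from dataclasses import dataclass


@dataclass
class EndpointStatus:
    """Hypothetical view the aggregator holds of one backend for one endpoint."""
    advertised: bool  # endpoint is listed in the backend's capabilities
    excluded: bool    # aggregator deliberately leaves this backend out
    healthy: bool     # endpoint currently returns a valid response


def classify(statuses):
    """Split backends into the intended set and the current exceptions."""
    # Intended situation: backend advertises the endpoint and is not excluded.
    expected = [name for name, s in statuses.items()
                if s.advertised and not s.excluded]
    # Exceptions: intended backends whose endpoint fails right now.
    missing = [name for name in expected if not statuses[name].healthy]
    return {"federation:backends": expected, "federation:missing": missing}


# Usage with three made-up backends for GET /files.
files_endpoint = {
    "b1": EndpointStatus(advertised=True, excluded=False, healthy=True),
    "b2": EndpointStatus(advertised=True, excluded=False, healthy=False),
    "b3": EndpointStatus(advertised=False, excluded=False, healthy=True),
}
print(classify(files_endpoint))
# {'federation:backends': ['b1', 'b2'], 'federation:missing': ['b2']}
```

Here `b3` never appears in `federation:missing` because the endpoint was not expected from it in the first place, while `b2` stays in `federation:backends` and is flagged as a current exception.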

So yes, maybe expected/intended is a more fitting concept than time-based ones, but yeah they are quite similar, and I would mention "temporarily" anyway as often this will be the case. And as I did in the 1-2-3 list, I would also mention the words "in general"/"exception" to bring the point across.

* Note that this does not have to be an entire backend; it can also be just individual endpoints (e.g. due to an unavailable microservice, incorrect response, ...)

Co-authored-by: Christoph Friedrich <[email protected]>
@m-mohr
Member Author

m-mohr commented May 12, 2025

Yeah, I agree. I actually was about to change the order of the sections, but then didn't do it to keep the git changelog clean. It would've been hard to check the changes if the section order had changed as well. So I'll do that now, after we have initially reviewed the general content...

@m-mohr m-mohr requested a review from christophfriedrich May 12, 2025 19:50
Member

@soxofaan soxofaan left a comment

So to make sure I understand what is being aimed for here:

Say we have this federation and we're looking at the GET /jobs endpoint:

  • backend A is online and supports everything
  • backend B is temporarily offline (for some definition of "temporarily")
  • backend C is offline for weeks (so not temporarily) but might come online soon (or not)
  • backend D is online but does not support batch jobs
  • backend E is online (partially), but its batch jobs subservice is down "temporarily"

What should go under GET /jobs of the federated response?
If I understand the current PR:

  • federation:backends: A, B and E
  • federation:missing: B, and E

is that correct?

## Resources supported only by a subset of back-ends

- Every discoverable resource that is defined as an object and allows to contain additional properties, can list the subset of back-ends that support or host the exposed resource/functionality.
+ Every discoverable resource that is defined as an object and allows to contain additional properties, can list the subset of back-ends that permanently support or host the exposed resource/functionality.
Member

... back-ends that permanently support ...

From the standpoint of the aggregator or another federation component, I find it hard to implement this label "permanent".

I understand that you want to ignore temporary unavailability or other unintended glitches, but it still seems to imply that there is some kind of contract (outside the scope of the basic openEO API) between the aggregator and each backend that expresses some kind of SLA about the resources.

I'm fine with this contract being out-of-band of this specification, but that means that the value of the "permanent" label is very fuzzy and might cause confusion.

Member Author

@m-mohr m-mohr May 23, 2025

Permanently means that it's listed in the endpoints list or is returned by another list that's relevant in the context of the endpoint (e.g. the collection list).

Given the clarification in #564 (comment), maybe we can write something that doesn't depend on "temporary"/"permanently"?

I don't really think about much magic here ;-)

@m-mohr
Member Author

m-mohr commented May 23, 2025

@soxofaan The backends in a federation are listed in federation (in GET /), so they are all meant to be available, otherwise you wouldn't list them there. All backends from that list that are not online, are temporarily not available. If it would be permanent, you'd remove them from federation. A differentiation for how long something is offline is not done, the baseline is always the federation list/config. That's the idea.

With regards to your example, I'm assuming that B and C are supporting jobs, this is not clear from your bullet points. So the lists would be filled as follows:

  • federation:backends: A, B, C and E
  • federation:missing: B, C, and E
  • (federation in GET /: A, B, C, D, E - the federation operators may want to consider removing C though ;-) )

This means the aggregator currently only responds with data from backend A.
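The A to E scenario can be worked through in a short Python sketch. It is a minimal illustration under the assumptions stated in this comment (B and C support batch jobs); the variable names are made up for the example, and the response shape only shows an empty `jobs` and `links` list next to the two federation fields.

```python
# Backends listed under `federation` in GET / (the baseline for everything else).
FEDERATION = ["A", "B", "C", "D", "E"]

# Assumption from the comment: B and C support batch jobs, D does not.
SUPPORTS_JOBS = {"A", "B", "C", "E"}

# Backends whose GET /jobs currently answers correctly.
# B and C are offline, and E's batch job subservice is down.
JOBS_ENDPOINT_OK = {"A"}

# Backends that are meant to contribute to GET /jobs.
backends = [b for b in FEDERATION if b in SUPPORTS_JOBS]
# Backends from that list that cannot contribute right now.
missing = [b for b in backends if b not in JOBS_ENDPOINT_OK]
# Backends whose data actually ends up in this response.
contributing = [b for b in backends if b not in missing]

response = {
    "jobs": [],   # would hold the jobs from the contributing backends
    "links": [],
    "federation:backends": backends,
    "federation:missing": missing,
}

print(response["federation:backends"])  # ['A', 'B', 'C', 'E']
print(response["federation:missing"])   # ['B', 'C', 'E']
print(contributing)                     # ['A']
```

How long B or C have been offline plays no role in the computation; only the federation list and the current state of the endpoint do.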

Might be a good idea to add these examples to the extension and potentially also define permanent/temporary as listed above.
Does this make sense?

@soxofaan
Member

All backends from that list that are not online, are temporarily not available. If it would be permanent, you'd remove them from federation. A differentiation for how long something is offline is not done, the baseline is always the federation list/config. That's the idea.

If there is no differentiation, isn't it better in the specification/description to avoid terminology that suggests such differentiation, like "permanent" or "temporary"?

I guess I'm fully aligned with your underlying intent, but these words "permanent"/"temporary" throw me off in a different direction.

@soxofaan
Member

soxofaan commented May 26, 2025

The backends in a federation are listed in federation (in GET /), so they are all meant to be available, otherwise you wouldn't list them there. All backends from that list that are not online, are temporarily not available. If it would be permanent, you'd remove them from federation.

My problem is that a federation component (like openeo-aggregator) might be too "dumb" to make the call whether unavailability is temporary or permanent. Or put differently: the decision to remove a backend from the federation's GET / should probably be made by a human with intent, not automatically by software based on some vague notion of temporary/permanent.

So you can't rule out the situation that a "permanently" unavailable backend is listed by a federation, because no human action was done yet for some reason.

@m-mohr
Member Author

m-mohr commented May 26, 2025

these words "permanent"/"temporary" throw me off in a different direction.

How would you phrase it?

the decision to remove a backend from the federation's GET / should probably be done by a human with intent

Yes, that's what I meant.

@m-mohr m-mohr requested a review from soxofaan June 15, 2025 15:00
3 participants