sync metamodel format #751

AlexanderLanin · 2025-03-21T08:58:59Z

Our current metamodel.yml follows a different approach than the one by ubcode. We should align.

Participants:

While we were just talking about contributing, @ubmarco has already prepared a PR at useblocks/sphinx-needs#1441. Now we really urgently need to compare the approaches.

First draft (please edit this message!)

| | useblocks/sphinx-needs#1441 | S-CORE |
|---|---|---|
| Extension | Directly embedded into sphinx-needs | Extension on top of sphinx-needs |
| Approach | Rather generic | Focused purely on S-CORE process requirements. More limited? |
| Focus | Rather generic | Readability and writeability of config |
| Stage | Theoretical? | In use |
| Number of links | min and max are user defined | optional and mandatory links. No need for min and max. |
| Severities | Info, Warning, Violation etc | All violations are fatal errors |
| Type validations | str, int, bool etc | planned, but so far only regex |
| Pretty errors | included | planned |
| Tests | a few, difficult to write | yes |
| Schema format | json | yaml (for human readability/writeability) |
| Types / complex checks | triggers for local checks, local checks | local checks, graph checks |
| Example config | https://github.com/useblocks/sphinx-needs/blob/mh-schema-validation/tests/doc_test/doc_schema/schemas.json | https://github.com/eclipse-score/docs-as-code/blob/main/src/extensions/score_metamodel/metamodel.yaml |
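
The "Number of links" row can be illustrated with a small sketch. It assumes (this is not taken from either implementation) that a mandatory link means "at least one target" and an optional link means "zero or more", in which case both notations reduce to the same range check. `LinkRule`, `from_s_core` and `check_links` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LinkRule:
    """Hypothetical cardinality rule for one link field."""
    min_count: int = 0
    max_count: Optional[int] = None  # None means "no upper bound"


def from_s_core(mandatory: bool) -> LinkRule:
    # Assumption: mandatory == at least one link, optional == any number.
    return LinkRule(min_count=1 if mandatory else 0)


def check_links(rule: LinkRule, targets: list[str]) -> bool:
    """Return True if the number of link targets satisfies the rule."""
    if len(targets) < rule.min_count:
        return False
    if rule.max_count is not None and len(targets) > rule.max_count:
        return False
    return True
```

Under this assumption the optional/mandatory notation is a subset of min/max: it cannot express an upper bound such as "at most 2 links".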

Open points:

We'll probably need walkthroughs to understand our respective approaches?

AlexanderLanin · 2025-05-15T06:54:54Z

Let's have a look at the 19 requirements laid out in useblocks/sphinx-needs#1451

❓ = not quite sure what that requirement means. Needs clarification with @ubmarco

| req | short | useblocks/sphinx-needs#1441 | S-CORE |
|---|---|---|---|
| 1 | Schema definition format shall be declarative and agnostic to specific programming languages | json ✅ | yaml ✅ |
| 2 | Tool and library support shall be available for Python and Rust. | ❓ | ❓ |
| 3 | Tool and library support shall be available for Linux (x64, arm64), MacOS/Darwin (arm64) and Windows (x64). | ❓ | ❓ |
| 4 | Mapping of need items to schema types shall be part of the declarative description. | ❓ | ❓ |
| 5 | The solution shall support the definition of default values for extra options. | ❓ | ❓ |
| 6 | The solution shall work for both core options/links and extra options/links. | ❓ | ❓ |
| 7 | The solution shall support the required semantics of option and link fields. | | |
| 8 | The solution shall support the conditional required semantics of option and link fields, i.e. the existence of a field depends on other field values. | ✅ | ❔ |
| 9 | The solution shall offer the following need option data types | ✅ | only via regex |
| 10 | The solution shall support at least the following string formats | ✅ | only via regex |
| 11 | The solution shall support string regex patterns. | ✅ | ✅ |
| 12 | The solution shall support disallowing additional properties (closing the model). | ✅ | ✅ |
| 13 | The solution shall support need graph validation, i.e. outgoing need links shall be target to constraints. | ✅ | ✅ |
| 14 | The solution shall be fast to execute for local needs | ❓ | not measured. should be as fast as python allows |
| 15 | The solution shall be fast to execute for need graphs | ❓ | not optimized |
| 16 | The solution should follow an established standard. | ❓ | ❌ (intentional) |
| 17 | The solution should feature an extension or composition mechanism to re-use base definitions | ❓ | ❌ (intentional) |
| 18 | The solution may support rendering a visualisation of the schema types and links between them. | ❓ | ❌ (Reading as "should": it's not implemented) |
| 19 | The solution may calculate or aggregate data during validation. | ❓ | ❌ (Reading as "should": not needed) |

ubmarco · 2025-05-15T08:09:13Z

Hi @AlexanderLanin, thanks for the effort you and the team put in to make Sphinx-Needs even more useful.
Schema validation & ontology has been on my mind for months, and after the last Sphinx-Needs user group meeting I put quite some thought into finding a suitable solution and moving the ecosystem further.

Thanks also for this write-up and comparison. Ultimately I think that Sphinx-Needs requires an internal way to describe the metamodel. I say that because we are building tools around Sphinx-Needs such as ubCode and its companion CLI app ubc that require a reliable schema interface to support users in real time as they type. Some of the above requirements (e.g. Python+Rust, fast execution, split between local and graph validation, typing, default values) stem from this.

I consider typing crucial for the solution as it allows typed imports and exports. You would not put a bool or integer into a string in a (graph) database. It also allows connecting Sphinx-Needs with stricter Engineering-as-Code solutions,
or even changing the internals of Sphinx-Needs to be typed for user-provided fields.

Other requirements (graph validation over multiple nodes, user provided messages and severity, composition mechanism) stem from requirements of other Sphinx-Needs users that build their solutions around RDF and SHACL. These descriptions are much more powerful but also less well known in the community, so my goal is to build something that is at least compatible and transpilable.

My PR exists to showcase what a solution could look like that ticks all "shall" boxes.
I wrote around 50 test cases to find bugs in my own code and also do some performance testing.
I consider the test structure not a deciding factor as this can be improved without bothering the end user.
I want to build something that considers use cases of the bigger ecosystem, while still being familiar for developers.

I think we should organize a meeting (next week?) to align on this. In the meantime I will look a bit into your metamodel format and write some docs for my solution.

(Btw, I cannot edit your messages)

AlexanderLanin · 2025-05-15T09:53:43Z

speed: agreed, I did not mark 14-15 as done, since it's not quite clear what "fast" is. And we did not measure. And it's probably the same anyway for python.

typing: agreed. So far we simply don't have a use case for it. But we want to add it anyway. Most notably an enum support, instead of writing regex.
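
To make the enum idea concrete, here is a minimal sketch comparing a regex check with an explicit set of allowed values. The `safety` values (QM, ASIL_B, ASIL_D) are the ones used in the S-CORE metamodel.yaml; the constant and function names are hypothetical and not part of either implementation.

```python
import re

# Regex as it is written in the config today.
SAFETY_REGEX = re.compile(r"^(QM|ASIL_B|ASIL_D)$")

# Hypothetical enum-style definition of the same field.
SAFETY_ENUM = frozenset({"QM", "ASIL_B", "ASIL_D"})


def valid_by_regex(value: str) -> bool:
    """Check the value against the regex pattern."""
    return SAFETY_REGEX.match(value) is not None


def valid_by_enum(value: str) -> bool:
    """Check the value by plain set membership."""
    return value in SAFETY_ENUM
```

One subtle difference: in Python, `$` also matches just before a trailing newline, so `"QM\n"` passes the regex check but not the enum check.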

testing: we are rather fond of, and happy with, our rst-based tests. Combined with the (hopefully readable) yaml config, this allows non-developers to specify exactly how they want the metamodel to behave.

the others: let's see whether both solutions satisfy all requirements in detail

meeting: invitation is out to @danwos, please make sure he forwards the meeting. Public announcement at #236

So far my feeling is that the solutions are similar in purpose, although completely different in design. Ours is more specific and focuses on our use cases and on readable config. So from a very high level it might be possible for us to use our yaml frontend with your backend, which does sound quite reasonable in general for any user-facing software architecture. From your point of view, ours might have something that you don't have so far (at least a simpler architecture 😉), and a real-life use case.

ubmarco · 2025-05-21T16:22:21Z

I looked through the referenced metamodel example. It looks understandable from a user perspective and I think it's built for the use cases you have.

Let me lay out some difficulties I see and that I want to discuss tomorrow:

1. A lot of duplication (e.g. safety: "^(QM|ASIL_B|ASIL_D)$" appears 18 times). The source of this is a missing composition mechanism.
2. Regexes are not particularly fast (think of the performance on big models).
3. Regexes are used as a replacement for a proper typing system:
   - Booleans should be upper/lower case versions of yes/no/true/false/1/0, but in the end I just want to say "shall be a bool" instead of "^(YES|NO|yes|no|true|false|TRUE|FALSE...)$".
   - Integers would have to be given as a regex (how would you build a minimum constraint?).
   - Floats/decimals are quite hard to express with regex (not speaking about range constraints yet).
   - Some string formats are particularly hard to write as regex (emails/URLs/ISO 8601 datetimes).
   - Sphinx-Needs has tags, which is actually an array.
4. Optional and mandatory options have to be given as "^.*$" or "^.+$" to mark them as required.
5. graph_check
   - Are the graph_check conditions Python expressions (e.g. "safety != QM")? If yes, that is a problem for other languages consuming the same schema.
     ```
     arch_safety_linkage:
       needs:
         include: "comp_req, feat_req"
         condition:
           and:
             - "safety != QM"
             - "status == valid"
     ```
   - How are conditions evaluated? Can I have "and", "or" and "not" in a single check? If yes, how is that evaluated?
   - Is it possible to build check conditions that hop over multiple graph nodes? My observation: for link fields, needs_types only works on string IDs, while graph_checks select certain needs and then look at their direct neighbors.
   - How would I express that I need at most 2 links of a certain structure for an extra link field?
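
The typing concerns in item 3 can be sketched as follows. This only illustrates the idea and is not the API of either implementation: `parse_bool` and `check_int` are hypothetical helpers, and the accepted boolean spellings are the ones listed above.

```python
from typing import Optional

TRUE_VALUES = {"yes", "true", "1"}
FALSE_VALUES = {"no", "false", "0"}


def parse_bool(raw: str) -> bool:
    """Parse upper/lower case variants of yes/no/true/false/1/0."""
    lowered = raw.strip().lower()
    if lowered in TRUE_VALUES:
        return True
    if lowered in FALSE_VALUES:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


def check_int(raw: str, minimum: Optional[int] = None,
              maximum: Optional[int] = None) -> int:
    """Parse an integer and apply range constraints."""
    value = int(raw)  # raises ValueError for non-integer input
    if minimum is not None and value < minimum:
        raise ValueError(f"{value} is below the minimum {minimum}")
    if maximum is not None and value > maximum:
        raise ValueError(f"{value} is above the maximum {maximum}")
    return value
```

With a declared type, the schema only needs to say "bool" or "integer, minimum 1"; the range check is a comparison on the parsed value rather than a pattern on its text.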

AlexanderLanin · 2025-05-22T02:48:36Z

1. We actively decided against composition, as it increases complexity. That is not typical for programming, but we decided that lowering the barrier of "how do I use that" is more important than "I have a single place to configure complex abstractions that I could not write on my own". The main motivation is that we want the metamodel.yml to be writeable by non-programmers. They can manage a search-and-replace operation... hopefully.

2+3. We want to introduce a typing system. Most notably enums.

4. We have mandatory_options and optional_options (horrible names). Same with mandatory_links and optional_links.
5. The expressions are a very limited set of allowed expressions with manual parsing. The current notation is hard to read, and we want to rework it to be more user friendly.
   - "and" and "or" in the same condition are currently not supported (no demand).
   - Checks over multiple graph nodes are currently not supported (no demand).
   - Min/max restrictions on links are not supported (no demand).
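
A restricted condition format of this kind could be evaluated without relying on Python expression semantics, roughly as sketched below. The sketch assumes conditions are strings of the form `field == value` or `field != value`, and that an `and` block is a list of such strings, as in the `arch_safety_linkage` example quoted earlier. The parsing rules and function names are assumptions, not the actual S-CORE implementation.

```python
def eval_simple(need: dict[str, str], expr: str) -> bool:
    """Evaluate 'field == value' or 'field != value' against one need."""
    for op in ("!=", "=="):
        if op in expr:
            field, value = (part.strip() for part in expr.split(op, 1))
            actual = need.get(field, "")  # missing fields compare as empty
            return (actual != value) if op == "!=" else (actual == value)
    raise ValueError(f"unsupported expression: {expr!r}")


def eval_condition(need: dict[str, str], condition) -> bool:
    """Evaluate a single expression or a flat {'and': [...]} block."""
    if isinstance(condition, str):
        return eval_simple(need, condition)
    if isinstance(condition, dict) and set(condition) == {"and"}:
        return all(eval_simple(need, expr) for expr in condition["and"])
    raise ValueError(f"unsupported condition: {condition!r}")
```

Because the grammar is this small, the same rules could be reimplemented in another language, which is the concern raised about Python expressions.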

AlexanderLanin added the community:infrastructure General Score infrastructure topics label Mar 21, 2025

AlexanderLanin added this to Infrastructure Mar 21, 2025

github-project-automation bot moved this to Todo in Infrastructure Mar 21, 2025

AlexanderLanin moved this from Todo to In Progress in Infrastructure May 15, 2025

AlexanderLanin assigned AlexanderLanin and MaximilianSoerenPollak May 15, 2025

AlexanderLanin mentioned this issue May 15, 2025

docs: add tests for graph checks #553

Open

github-project-automation bot added this to S-CORE Roadmap May 21, 2025
