Add example mixing use of rml-views and gather maps #30

Open
frmichel opened this issue Jan 18, 2023 · 6 comments
Labels: documentation, proposal

Comments

@frmichel
Collaborator

frmichel commented Jan 18, 2023

This example should be added only when the fields specification is released.
In any case, this issue does not prevent releasing a first version of the specification.

See existing example: #10 (comment)

dachafra added and then removed the blocked label on Jan 18, 2023
frmichel added the pending and documentation labels on Jan 18, 2023
@frmichel
Collaborator Author

frmichel commented Sep 11, 2023

@dachafra @chrdebru @andimou: To finalize this example I need the spec of the Fields (rml:field...), but I cannot find it. Shouldn't it be in rml-io? Did I miss something?

@dachafra
Member

Fields are still under discussion. In the joins task force, we are going to present the solution again and check whether it covers all the requirements we need. In any case, issues labeled pending do not need to be fixed now, since other issues need to be resolved first (as is the case for this one)!

@frmichel
Collaborator Author

This question was discussed during the meeting in Santiago de Compostela.
@dachafra @chrdebru @andimou @bjdmeest @pmaria: could you please confirm that we understood the same thing? Here is the solution that was agreed on:

Input document "source.json":

{ 
  "id": 1,
  "a" : [ [1,2,3], [4,5,6] ]
}

The expected output is to generate a list of lists, where the head node of the outer list has a URI:

<http://my.list/1>
  rdf:first (1 2 3);
  rdf:rest [ 
    rdf:first (4 5 6);
    rdf:rest rdf:nil 
  ].

The solution involves a logical source for the JSON file and a logical view that declares fields:

<LS> 
    a rml:LogicalSource ;
    rml:source "source.json" ;
    rml:referenceFormulation ql:JSONPath .

<LV>
    a rml:LogicalView ;
    rml:logicalSource <LS> ;
    rml:field [ 
      rml:fieldName "id" ;
      rml:reference "$.id" ;
    ] ;
    rml:field [ 
      rml:fieldName "a_string" ;
      rml:reference "$.a.*" ;
      rml:field [ 
        rml:fieldName "a_list" ;
        # Reference "$.*" applies to the results produced by evaluating "$.a.*"
        rml:reference "$.*" ;
        rml:groupBy "a"
      ]
    ]
.

This reshapes the input document into a table with 2 lines:

| id | a_string | a_list |
|----|----------|--------|
| 1  | [1,2,3]  | 1,2,3  |
| 1  | [4,5,6]  | 4,5,6  |

Here, a_string is a string representation of each JSON array ("[1,2,3]", "[4,5,6]"), so it is not really usable unless it is parsed, for instance by a function.
Conversely, a_list is built by parsing the JSON arrays returned by "$.a.*", so it is an enumeration of the terms 1, 2, 3 and then 4, 5, 6.
The datatype of that list field has not been specified yet and requires further discussion.
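To make the reshaping concrete, here is a minimal Python sketch of how I read the view's behaviour. It is not an RML engine: the variable names (flat_rows, grouped_rows) are hypothetical, and the grouping semantics are my assumption of what rml:groupBy is meant to do, namely collapse the elements of each inner array into one multi-valued cell keyed by the parent array.

```python
import json

doc = json.loads('{"id": 1, "a": [[1, 2, 3], [4, 5, 6]]}')

# Without grouping: one row per (inner array, element) pair, six rows in total.
flat_rows = [
    (doc["id"], json.dumps(inner, separators=(",", ":")), element)
    for inner in doc["a"]      # mimics "$.a.*": one inner array per iteration
    for element in inner       # mimics "$.*" applied to that inner array
]

# With grouping (assumed semantics): elements that share the same parent
# array collapse into a single multi-valued "a_list" cell.
grouped = {}
for id_, a_string, element in flat_rows:
    grouped.setdefault((id_, a_string), []).append(element)

grouped_rows = [
    {"id": id_, "a_string": a_string, "a_list": values}
    for (id_, a_string), values in grouped.items()
]

for row in grouped_rows:
    print(row)
```

The two dictionaries in grouped_rows correspond to the two rows of the table above; without the grouping step the view would instead have six rows, one per number.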

Now applying the following term map should generate the expected result:

rr:subjectMap [ 
  rr:template "http://my.list/{id}";
  rml:gather ( [ 
    rml:gather ( [
      # "a_list" is multi-valued: it evaluates to the 3 terms in each array (1, 2, 3 in the first iteration, then 4, 5, 6).
      rml:reference "a_list" ;
      rml:gatherAs rdf:List;
    ] )
  ] );
  rml:gatherAs rdf:List;
] ;
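
As a sanity check on the expected triples, the following Python sketch builds the nested rdf:first / rdf:rest structure by hand from the two view rows. It only illustrates the target graph shape, not how a gather map is evaluated: the helpers bnode, rdf_list and read_list are hypothetical, blank node labels are arbitrary, and it assumes that rows sharing the same id are gathered into a single outer list whose head is the templated IRI.

```python
from itertools import count

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
_ids = count()

def bnode():
    """Return a fresh blank node label (hypothetical naming scheme)."""
    return f"_:b{next(_ids)}"

def rdf_list(items, triples, head=None):
    """Append the triples of an RDF collection holding `items`; return its head."""
    if not items:
        return RDF + "nil"
    # The first cell is the given head (e.g. an IRI) or a fresh blank node.
    cells = [head or bnode()] + [bnode() for _ in items[1:]]
    for i, (cell, item) in enumerate(zip(cells, items)):
        rest = cells[i + 1] if i + 1 < len(cells) else RDF + "nil"
        triples.append((cell, RDF + "first", item))
        triples.append((cell, RDF + "rest", rest))
    return cells[0]

def read_list(head, triples):
    """Walk an RDF collection from `head` and return its members in order."""
    first = {s: o for s, p, o in triples if p == RDF + "first"}
    rest = {s: o for s, p, o in triples if p == RDF + "rest"}
    members = []
    while head != RDF + "nil":
        members.append(first[head])
        head = rest[head]
    return members

# The two rows of the logical view (only the fields used by the term map).
rows = [
    {"id": 1, "a_list": [1, 2, 3]},
    {"id": 1, "a_list": [4, 5, 6]},
]

triples = []
inner_heads = [rdf_list(row["a_list"], triples) for row in rows]
outer_head = rdf_list(inner_heads, triples, head=f"http://my.list/{rows[0]['id']}")

for triple in triples:
    print(triple)
```

Reading the structure back with read_list(outer_head, triples) yields the heads of the two inner lists, which in turn hold 1, 2, 3 and 4, 5, 6, matching the expected output given earlier.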

@pmaria

pmaria commented Dec 17, 2023

Hi @frmichel,

Indeed! The only difference is that the object of the rml:groupBy statement should be a_string, but I think that was your intention.

So the logical view would be defined as such:

<LV>
    a rml:LogicalView ;
    rml:logicalSource <LS> ;
    rml:field [ 
      rml:fieldName "id" ;
      rml:reference "$.id" ;
    ] ;
    rml:field [ 
      rml:fieldName "a_string" ;
      rml:reference "$.a.*" ;
      rml:field [ 
        rml:fieldName "a_list" ;
        # Reference "$.*" applies to the results produced by evaluating "$.a.*"
        rml:reference "$.*" ;
        rml:groupBy "a_string"
      ]
    ]
.

@frmichel
Collaborator Author

Yes, you're right, sorry for the typo. I renamed a and a2 to a_string and a_list for clarity, but I forgot to change a in groupBy ;).

dachafra added the proposal label and removed the pending label on Dec 20, 2023
@dachafra
Member

OK for me. I've also changed the label from pending to proposal.

dachafra changed the title from "Add example mixing use of fields and gather maps" to "Add example mixing use of rml-views and gather maps" on Nov 6, 2024
3 participants