Principle #9 users - automated validation #1008


Open
beckyjackson opened this issue Aug 9, 2019 · 10 comments
Labels
attn: Editorial WG Issues pertinent to editorial activities, such as ontology reviews and principles attn: Technical WG Issues pertinent to technical activities, such as maintenance of website, PURLs, and tools automated validation of principles Issues for the editorial WG pertinent to the automating the validation of the Principles. principles Issues related to Foundry principles

Comments

@beckyjackson
Contributor

FP 9 - Documented Plurality of Users

Automated checks:

  1. Is there a valid issue tracker?
  2. Are there stated usages?

Mechanism:

We can pull the tracker value from the ontology YAML. We should ensure that this tracker resolves (does not return an HTTP status of 400 or above). It would be nice to check whether there is activity on the tracker, but I'm not sure that is possible at this time. I'm open to suggestions. If the ontology does not have a tracker, this check fails.

We can also look at the usages tag from the ontology YAML. If there are no documented usages, the ontology will get a warning. The usages should contain a user property with a valid URL. Perhaps if the URL does not resolve, we just return an info message.
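The two checks described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the YAML keys `tracker` and `usages` are taken from the description, `get_status` is a hypothetical callable (URL in, HTTP status code out) injected so the logic can be exercised without network access, any status of 400 or above counts as "does not resolve", and the "warning" level for a missing `user` URL is a guess, since the level for that case is not stated.

```python
from urllib.parse import urlparse


def is_url(value):
    """True if `value` looks like an absolute http(s) URL."""
    parsed = urlparse(str(value))
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def check_tracker(config, get_status):
    """Return a (level, message) pair for the tracker check.

    `config` is the parsed ontology YAML as a dict; `get_status` is a
    hypothetical callable that maps a URL to an HTTP status code.
    """
    tracker = config.get("tracker")
    if not tracker:
        return ("error", "no tracker declared")
    status = get_status(tracker)
    if status >= 400:
        return ("error", f"tracker returned HTTP {status}")
    return ("pass", "tracker resolves")


def check_usages(config, get_status):
    """Return a list of (level, message) pairs for the usages check."""
    usages = config.get("usages") or []
    if not usages:
        return [("warning", "no documented usages")]
    results = []
    for i, usage in enumerate(usages):
        user = usage.get("user") if isinstance(usage, dict) else None
        if not is_url(user):
            # Assumed level: the description does not say how to report this.
            results.append(("warning", f"usage {i}: missing or invalid user URL"))
        elif get_status(user) >= 400:
            results.append(("info", f"usage {i}: user URL does not resolve"))
    return results or [("pass", "usages documented")]
```

A real `get_status` would need to issue an HTTP request and handle timeouts and redirects; that part is deliberately left out here.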

We may need to standardize the usages tag. Currently, there are multiple ways that people have inserted usages. For example, ENVO contains two different examples of usages:

usages:
 - type: data-annotation
   description: "describing species habitats"
   examples:
     url: http://eol.org/pages/211700/data
   resources:
     url: http://eol.org
     label: EOL

usages:
  - user: http://oceans.taraexpeditions.org/en/
    description: Samples collected during Tara Oceans expedition are annotated with ENVO
    example:
      - url: https://www.ebi.ac.uk/metagenomics/projects/ERP001736/samples/ERS487899
        description: "Sample collected during the Tara Oceans expedition (2009-2013) at station TARA_004 (latitudeN=36.5533, longitudeE=-6.5669)"

I propose the following format for usages:

usages:
  - user: required URL
    type: optional text
    description: required text
    example:
      - url: required URL
        description: required text
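
To see how the two existing ENVO entries compare with the proposed format, a small review function can report which required fields are missing and which keys fall outside the format. This is a sketch only: the function name `review_usage` is hypothetical, the entries are written as Python dicts rather than loaded from YAML, and the `example` list is treated as optional because the proposal does not say whether the list itself is required.

```python
ALLOWED_KEYS = {"user", "type", "description", "example"}


def review_usage(usage):
    """Return (missing, unknown) for one usage entry.

    `missing` lists required fields that are absent or empty;
    `unknown` lists keys that the proposed format does not define.
    """
    missing = [key for key in ("user", "description") if not usage.get(key)]
    for i, example in enumerate(usage.get("example") or []):
        for key in ("url", "description"):
            if not (isinstance(example, dict) and example.get(key)):
                missing.append(f"example[{i}].{key}")
    unknown = sorted(set(usage) - ALLOWED_KEYS)
    return missing, unknown


# The two ENVO entries quoted above, transcribed as dicts.
envo_first = {
    "type": "data-annotation",
    "description": "describing species habitats",
    "examples": {"url": "http://eol.org/pages/211700/data"},
    "resources": {"url": "http://eol.org", "label": "EOL"},
}
envo_second = {
    "user": "http://oceans.taraexpeditions.org/en/",
    "description": "Samples collected during Tara Oceans expedition are annotated with ENVO",
    "example": [
        {
            "url": "https://www.ebi.ac.uk/metagenomics/projects/ERP001736/samples/ERS487899",
            "description": "Sample collected during the Tara Oceans expedition (2009-2013) at station TARA_004",
        }
    ],
}

print(review_usage(envo_first))   # (['user'], ['examples', 'resources'])
print(review_usage(envo_second))  # ([], [])
```

Under these assumptions the first entry would need a `user` URL and its `examples` and `resources` keys mapped onto the new fields, while the second already conforms.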
@beckyjackson beckyjackson added attn: Editorial WG Issues pertinent to editorial activities, such as ontology reviews and principles attn: Technical WG Issues pertinent to technical activities, such as maintenance of website, PURLs, and tools labels Aug 9, 2019
@beckyjackson beckyjackson self-assigned this Aug 9, 2019
@nataled
Contributor

nataled commented Sep 24, 2019

From the EWG discussion on this:

Partial automation possible, especially with respect to use of its terms in other ontologies and citations.

Chris M commented: The curation of usages must be manual and closely vetted by OBO Foundry.

We have usages partially curated here:

#451

Once in place the checks themselves can be automated.

It is also easy to check things like GitHub activity. While it's conceivable that some ontologies with multiple users don't use GitHub, it is at least a meaningful signal.

@jamesaoverton
Member

The principle says: "Use of the target ontology’s term IRIs in other ontologies. This can be evidenced by linking to the other ontology that uses an ontology term IRI from this ontology." We could search for term use in other ontologies.

@cmungall
Contributor

cmungall commented Nov 5, 2019

I agree with @beckyjackson's proposed standardization of the usages tag.

I think querying for ontology usage also makes sense. It would be fun to do a more in-depth analysis to identify "citation rings" and other artefacts.

@cmungall cmungall added the principles Issues related to Foundry principles label Nov 22, 2019
@cmungall cmungall changed the title Principle #9 automated validation Principle #9 users - automated validation Nov 22, 2019
@sbello
Contributor

sbello commented Feb 25, 2020

Could this check also look at the 'browser' section on the OBO Foundry page (https://github.com/OBOFoundry/OBOFoundry.github.io/blob/master/ontology/mp.md)? The MP entry lists the MGI, RGD, and Monarch browsers, and I was wondering whether that should/could contribute to the plurality of users check.

@cmungall
Contributor

We can also query eutils to look at the number of citations of the ontology's publication(s).

We could also add this as links from the OBO site, e.g. we track the Uberon PMID as 22293552, so we can add a link to:

https://www.ncbi.nlm.nih.gov/pubmed?linkname=pubmed_pubmed_citedin&from_uid=22293552

Of course, many ontologies are under-cited, but it's a proxy
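
For the eutils route, a request URL for the same "cited in" link set could be built along these lines. This is a minimal sketch: the helper name `cited_in_url` is hypothetical, it only constructs the URL for the E-utilities ELink endpoint, and the shape of the response (and NCBI's rate limits and API-key rules) should be checked against the E-utilities documentation before anything is built on it.

```python
from urllib.parse import urlencode

# NCBI E-utilities ELink endpoint.
EUTILS_ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def cited_in_url(pmid):
    """Build an ELink URL asking which PubMed records cite `pmid`."""
    params = {
        "dbfrom": "pubmed",
        "linkname": "pubmed_pubmed_citedin",  # same link name as the PubMed URL above
        "id": str(pmid),
        "retmode": "json",
    }
    return f"{EUTILS_ELINK}?{urlencode(params)}"


print(cited_in_url(22293552))
```

Counting the returned links would give the citation number used as a proxy.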

We can also do a google search for mentions of the ontology (but this can't be done via API AFAIK)

@nataled
Contributor

nataled commented Apr 17, 2020

Also note that ontologies can be over-cited too. These are cases where the ontology was mentioned (usually as part of a "such as..." list) but not used or studied in any way. This is similar to what happens in OntoBee when it shows term usage in other ontologies, the vast majority of which are due to some wholesale import of the ontology (but the term in question was never used).

@cmungall
Contributor

Very good point @nataled! Dare I say it, a lot of this over-citation may come from papers about ontologies...

@wdduncan wdduncan added the automated validation of principles Issues for the editorial WG pertinent to the automating the validation of the Principles. label Apr 28, 2020
@cmungall
Contributor

There are no objections to the schema @beckyjackson proposes.

I would add: make examples mandatory, but multivalued, i.e. cardinality >= 1.

@jamesaoverton
Member

@apmody and I are working on this in #1371. The proposed schema above is a little too simple. People are making good use of seeAlso to point to Biosharing/FAIRSharing, and of reference to link to publications about the usage. So we're going to try this schema:

usages:
  - user: required URL
    type: optional text (how the ontology is used, e.g. annotation)
    description: required text
    seeAlso: optional URL (e.g. FAIRSharing entry)
    examples:
      - url: required URL
        description: required text
    publications:
      - id: required URL (DOI, PubMed, etc.)
        title: required text

  • seeAlso is a secondary link to the user, such as a FAIRSharing entry (these are more common than I expected)
  • examples should point to a specific page showing how the ontology is used by that user/resource
  • publications are papers about how the user uses the ontology, not specific examples of use
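
A table-driven check for this schema might look like the sketch below. It is an illustration rather than the implementation in #1371: the names (`validate_usage`, `USAGE_FIELDS`, and so on) are hypothetical, "URL" is taken to mean an absolute http(s) URL, and the `require_examples` flag applies the "cardinality >= 1" suggestion from earlier in the thread, which the schema itself does not state.

```python
from urllib.parse import urlparse


def is_url(value):
    """True for strings that look like absolute http(s) URLs."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def is_text(value):
    """True for non-empty strings."""
    return isinstance(value, str) and bool(value.strip())


# field name -> (required?, checker)
USAGE_FIELDS = {
    "user": (True, is_url),
    "type": (False, is_text),
    "description": (True, is_text),
    "seeAlso": (False, is_url),
}
EXAMPLE_FIELDS = {"url": (True, is_url), "description": (True, is_text)}
PUBLICATION_FIELDS = {"id": (True, is_url), "title": (True, is_text)}


def check_fields(record, spec, path):
    """Check one mapping against a field spec; return a list of error strings."""
    if not isinstance(record, dict):
        return [f"{path}: expected a mapping"]
    errors = []
    for name, (required, checker) in spec.items():
        if name not in record:
            if required:
                errors.append(f"{path}.{name}: missing")
        elif not checker(record[name]):
            errors.append(f"{path}.{name}: invalid value")
    return errors


def validate_usage(usage, index=0, require_examples=True):
    """Validate one entry of `usages`; an empty list means no problems found."""
    path = f"usages[{index}]"
    errors = check_fields(usage, USAGE_FIELDS, path)
    if not isinstance(usage, dict):
        return errors
    for key, spec in (("examples", EXAMPLE_FIELDS), ("publications", PUBLICATION_FIELDS)):
        items = usage.get(key, [])
        if not isinstance(items, list):
            errors.append(f"{path}.{key}: expected a list")
            continue
        if key == "examples" and require_examples and not items:
            errors.append(f"{path}.examples: at least one entry expected")
        for i, item in enumerate(items):
            errors.extend(check_fields(item, spec, f"{path}.{key}[{i}]"))
    return errors
```

Keeping the field rules in plain dictionaries means a later change to the schema, such as making `type` required, would only touch the tables and not the checking logic.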

@matentzn
Contributor

I like it!
