A new metadata export format called Croissant is now available as an external metadata exporter. When enabled, it replaces the Schema.org JSON-LD format in the `<head>` of dataset landing pages. For details, see admin/discoverability.html#schema-org-json-ld-croissant-metadata and #10341.
If you enable the Croissant metadata export format (see :ref:`external-exporters`) the ``<head>`` will show Croissant metadata instead. It looks similar, but you should see ``"cr": "http://mlcommons.org/croissant/"`` in the output.
44
+
45
+
For backward compatibility, if you enable Croissant, the older Schema.org JSON-LD format (``schema.org`` in the API) will still be available from both the web interface (see :ref:`metadata-export-formats`) and the API (see :ref:`export-dataset-metadata-api`).
46
+
47
+
The Dataverse team has been working with Google on both formats. Google has `indicated <https://github.com/mlcommons/croissant/issues/530#issuecomment-1964227662>`_ that for `Google Dataset Search <https://datasetsearch.research.google.com>`_ (the main reason we started adding this extra metadata in the ``<head>`` of dataset pages), Croissant is the successor to the older format.
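The check for the ``"cr"`` context entry described above can be sketched in Python. This is a minimal sketch, not part of Dataverse: the function and class names are hypothetical, and it assumes the metadata is embedded in a ``<script type="application/ld+json">`` element and that the ``cr`` prefix sits in a JSON object under ``@context``.

```python
import json
from html.parser import HTMLParser

# Namespace URL quoted in the text above.
CROISSANT_NS = "http://mlcommons.org/croissant/"


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so append.
        if self._in_jsonld:
            self.blocks[-1] += data


def is_croissant(html):
    """Return True if any embedded JSON-LD block declares the Croissant prefix."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    for block in parser.blocks:
        try:
            doc = json.loads(block)
        except json.JSONDecodeError:
            continue
        context = doc.get("@context") if isinstance(doc, dict) else None
        if isinstance(context, dict) and context.get("cr") == CROISSANT_NS:
            return True
    return False
```

Passing the HTML of a dataset landing page to ``is_croissant`` should return ``True`` only when the Croissant exporter is enabled, under the assumptions stated above.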
doc/sphinx-guides/source/api/native-api.rst
+5-2
Original file line number
Diff line number
Diff line change
@@ -1150,16 +1150,19 @@ The fully expanded example above (without environment variables) looks like this
1150
1150
1151
1151
.. note:: Supported exporters (export formats) are ``ddi``, ``oai_ddi``, ``dcterms``, ``oai_dc``, ``schema.org`` , ``OAI_ORE`` , ``Datacite``, ``oai_datacite`` and ``dataverse_json``. Descriptive names can be found under :ref:`metadata-export-formats` in the User Guide.
1152
1152
1153
+
.. note:: Additional exporters can be enabled, as described under :ref:`external-exporters` in the Installation Guide. To discover the machine-readable name of each exporter (e.g. ``ddi``), check :ref:`inventory-of-external-exporters` or ``getFormatName`` in the exporter's source code.
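As an illustration of how a machine-readable exporter name is used, the following Python sketch builds an export URL. It assumes the ``/api/datasets/export`` endpoint with ``exporter`` and ``persistentId`` query parameters, as in the curl example earlier in this section. The helper name, server URL and DOI are placeholders, not values from this guide.

```python
from urllib.parse import urlencode


def export_url(server_url, persistent_id, exporter):
    """Build the metadata export URL for one dataset and one exporter.

    `exporter` is the machine-readable name, e.g. "ddi" or "croissant".
    """
    query = urlencode({"exporter": exporter, "persistentId": persistent_id})
    return f"{server_url.rstrip('/')}/api/datasets/export?{query}"


# Placeholder values for illustration only.
url = export_url("https://demo.dataverse.org", "doi:10.5072/FK2/J8SJZB", "croissant")
```

The resulting URL can be fetched with any HTTP client. If the named exporter is not installed, expect the request to fail rather than fall back to another format.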
1153
1154
1154
1155
Schema.org JSON-LD
1155
1156
^^^^^^^^^^^^^^^^^^
1156
1157
1157
-
Please note that the ``schema.org`` format has changed in backwards-incompatible ways after Dataverse Software version 4.9.4:
1158
+
Please note that the ``schema.org`` format has changed in backwards-incompatible ways after Dataverse 4.9.4:
1158
1159
1159
1160
- "description" was a single string and now it is an array of strings.
1160
1161
- "citation" was an array of strings and now it is an array of objects.
1161
1162
1162
-
Both forms are valid according to Google's Structured Data Testing Tool at https://search.google.com/structured-data/testing-tool . (This tool will report "The property affiliation is not recognized by Google for an object of type Thing" and this known issue is being tracked at https://github.com/IQSS/dataverse/issues/5029 .) Schema.org JSON-LD is an evolving standard that permits a great deal of flexibility. For example, https://schema.org/docs/gs.html#schemaorg_expected indicates that even when objects are expected, it's ok to just use text. As with all metadata export formats, we will try to keep the Schema.org JSON-LD format your Dataverse installation emits backward-compatible to made integrations more stable, despite the flexibility that's afforded by the standard.
1163
+
Both forms are valid according to Google's Structured Data Testing Tool at https://search.google.com/structured-data/testing-tool . Schema.org JSON-LD is an evolving standard that permits a great deal of flexibility. For example, https://schema.org/docs/gs.html#schemaorg_expected indicates that even when objects are expected, it's ok to just use text. As with all metadata export formats, we will try to keep the Schema.org JSON-LD format your Dataverse installation emits backward-compatible to make integrations more stable, despite the flexibility that's afforded by the standard.
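A client that needs to read ``schema.org`` exports from installations both before and after the change listed above could tolerate either shape. The following Python sketch is one way to do that. The helper names are hypothetical, and the sketch makes no assumption about which keys the newer citation objects contain.

```python
def description_list(doc):
    """Return "description" as a list of strings, whichever form was emitted."""
    value = doc.get("description", [])
    # Older exports use a single string; newer ones use an array of strings.
    return [value] if isinstance(value, str) else list(value)


def citation_style(doc):
    """Report how "citation" is encoded: "strings", "objects" or "none"."""
    citations = doc.get("citation") or []
    if not citations:
        return "none"
    # Newer exports use an array of objects; older ones an array of strings.
    return "objects" if all(isinstance(c, dict) for c in citations) else "strings"
```

Checking the shape at read time, instead of checking the Dataverse version, keeps the client working if an installation is upgraded.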
1164
+
1165
+
The standard has further evolved into a format called Croissant. For details, see :ref:`schema.org-head` in the Admin Guide.
doc/sphinx-guides/source/installation/advanced.rst
+37-18
Original file line number
Diff line number
Diff line change
@@ -119,27 +119,46 @@ To activate in your Dataverse installation::
119
119
120
120
.. _external-exporters:
121
121
122
-
Installing External Metadata Exporters
123
-
++++++++++++++++++++++++++++++++++++++
122
+
External Metadata Exporters
123
+
+++++++++++++++++++++++++++
124
124
125
-
As of Dataverse Software 5.14 Dataverse supports the use of external Exporters as a way to add additional metadata
126
-
export formats to Dataverse or replace the built-in formats. This should be considered an **experimental** capability
127
-
in that the mechanism is expected to evolve and using it may require additional effort when upgrading to new Dataverse
128
-
versions.
125
+
Dataverse 5.14+ supports the configuration of external metadata exporters (just "external exporters" or "exporters" for short) as a way to add metadata export formats or replace built-in formats. For a list of built-in formats, see :ref:`metadata-export-formats` in the User Guide.
129
126
130
-
This capability is enabled by specifying a directory in which Dataverse should look for third-party Exporters. See
131
-
:ref:`dataverse.spi.exporters.directory`.
127
+
This should be considered an **experimental** capability in that the mechanism is expected to evolve and using it may require additional effort when upgrading to new Dataverse versions.
132
128
133
-
See :doc:`/developers/metadataexport` for details about how to develop new Exporters.
129
+
Enabling External Exporters
130
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^
134
131
135
-
An minimal example Exporter is available at https://github.com/gdcc/dataverse-exporters. The community is encourage to
136
-
add additional exporters (and/or links to exporters elsewhere) in this repository. Once you have downloaded the
137
-
dataverse-spi-export-examples-1.0.0.jar (or other exporter jar), installed it in the directory specified above, and
138
-
restarted your Payara server, the new exporter should be available.
132
+
Use the :ref:`dataverse.spi.exporters.directory` configuration option to specify a directory from which external exporters (JAR files) should be loaded.
139
133
140
-
The example dataverse-spi-export-examples-1.0.0.jar replaces the ``JSON`` export with a ``MyJSON in <locale>`` version
141
-
that just wraps the existing JSON export object in a new JSON object with the key ``inputJson`` containing the original
142
-
JSON.(Note that the ``MyJSON in <locale>`` label will appear in the dataset Metadata Export download menu immediately,
143
-
but the content for already published datasets will only be updated after you delete the cached exports and/or use a
144
-
reExport API call (see :ref:`batch-exports-through-the-api`).)
134
+
.. _inventory-of-external-exporters:
145
135
136
+
Inventory of External Exporters
137
+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
138
+
139
+
Known external exporters are listed below. Development takes place at https://github.com/gdcc/dataverse-exporters, and you are encouraged to check there for new exporters or to contribute one!
140
+
141
+
In the list below, the name of each exporter is followed by the machine-readable name in parentheses for use in APIs (see :ref:`export-dataset-metadata-api` in the API Guide).
142
+
143
+
Croissant (``croissant``)
144
+
~~~~~~~~~~~~~~~~~~~~~~~~~
145
+
146
+
`Croissant <https://github.com/mlcommons/croissant>`_ is oriented toward machine learning and exposes variable-level metadata. When enabled, it replaces the Schema.org JSON-LD shown in the ``<head>`` of a dataset page, as described under :ref:`schema.org-head` in the Admin Guide.
147
+
148
+
You can download the Croissant exporter JAR from FIXME.
149
+
150
+
The source can be found in the `"croissant" <https://github.com/gdcc/dataverse-exporters/tree/main/croissant>`_ directory of the exporters repo.
151
+
152
+
MyJSON (``dataverse_json``)
153
+
~~~~~~~~~~~~~~~~~~~~~~~~~~~
154
+
155
+
MyJSON is a minimal example exporter that demonstrates how to override a built-in metadata format. Specifically, it replaces the ``dataverse_json`` format (Dataverse's native JSON format), shown as "JSON" in the GUI, with a "MyJSON in <locale>" version that just wraps the existing JSON export object in a new JSON object with the key ``inputJson`` containing the original JSON.
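The wrapping described above can be illustrated with a short Python sketch. The real exporter is written in Java and may differ in detail. Only the ``inputJson`` key comes from the description above, and both function names are hypothetical.

```python
def wrap_like_myjson(original):
    """Mimic the described behavior: nest the original export under "inputJson"."""
    return {"inputJson": original}


def unwrap_native_json(doc):
    """Return the native JSON whether or not it has been wrapped by MyJSON."""
    if isinstance(doc, dict) and "inputJson" in doc:
        return doc["inputJson"]
    return doc
```

A client that reads ``dataverse_json`` exports could call ``unwrap_native_json`` so that it keeps working if MyJSON has been installed for testing.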
156
+
157
+
You can download the MyJSON exporter JAR from https://github.com/gdcc/dataverse-exporters where you should look under "prebuilt-examples" for a file called something like dataverse-spi-export-examples-x.x.x.jar.
158
+
159
+
The source can be found in the `"dataverse-spi-export-examples" <https://github.com/gdcc/dataverse-exporters/tree/main/dataverse-spi-export-examples>`_ directory of the exporters repo.
160
+
161
+
Developing New Exporters
162
+
^^^^^^^^^^^^^^^^^^^^^^^^
163
+
164
+
See :doc:`/developers/metadataexport` for details about how to develop new exporters.
doc/sphinx-guides/source/installation/config.rst
+10-3
Original file line number
Diff line number
Diff line change
@@ -3109,12 +3109,19 @@ Can also be set via any `supported MicroProfile Config API source`_, e.g. the en
3109
3109
dataverse.spi.exporters.directory
3110
3110
+++++++++++++++++++++++++++++++++
3111
3111
3112
-
This JVM option is used to configure the file system path where external Exporter JARs can be placed. See :ref:`external-exporters` for more information.
3112
+
For some background, see :ref:`external-exporters` and :ref:`inventory-of-external-exporters`.
3113
+
3114
+
This JVM option is used to configure the file system path where external exporter JARs should be loaded from.
If this value is set, Dataverse will examine all JARs in the specified directory and will use them to add, or replace existing, metadata export formats.
3117
-
If this value is not set (the default), Dataverse will not use external Exporters.
3118
+
If this value is set, Dataverse will examine all JARs in the specified directory and will use them to add new metadata export formats or (if the machine-readable name used in :ref:`export-dataset-metadata-api` is the same) replace built-in metadata export formats.
3119
+
3120
+
If this value is not set (the default), Dataverse will not load any external exporters.
3121
+
3122
+
If you place a new JAR in this directory, you must restart Payara for Dataverse to load it.
3123
+
3124
+
If the JAR is for an exporter that replaces a built-in format, you must delete the cached exports and/or use a reExport API call (see :ref:`batch-exports-through-the-api`) for the new format to be visible for existing datasets.
3118
3125
3119
3126
Can also be set via *MicroProfile Config API* sources, e.g. the environment variable ``DATAVERSE_SPI_EXPORTERS_DIRECTORY``.
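Before restarting Payara, it can help to confirm which JARs are in the configured directory. The Python sketch below is a hypothetical helper, not Dataverse's own loader. It mirrors only the behavior described above (every JAR in the directory is examined, and none are loaded when the option is unset), and it assumes the option was set through the ``DATAVERSE_SPI_EXPORTERS_DIRECTORY`` environment variable unless a path is passed explicitly.

```python
import os
from pathlib import Path


def exporter_jars(directory=None):
    """List the JAR file names found in the external exporters directory.

    Falls back to the DATAVERSE_SPI_EXPORTERS_DIRECTORY environment variable.
    Returns an empty list when the option is unset or the path is not a directory.
    """
    directory = directory or os.environ.get("DATAVERSE_SPI_EXPORTERS_DIRECTORY")
    if not directory:
        return []
    path = Path(directory)
    if not path.is_dir():
        return []
    return sorted(
        p.name for p in path.iterdir() if p.is_file() and p.suffix.lower() == ".jar"
    )
```

If the option was set as a JVM option instead of an environment variable, pass the directory explicitly, for example ``exporter_jars("/path/to/exporters")`` with your own path.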
doc/sphinx-guides/source/user/dataset-management.rst
+3-1
Original file line number
Diff line number
Diff line change
@@ -25,7 +25,7 @@ For more details about what Citation and Domain Specific Metadata is supported p
25
25
Supported Metadata Export Formats
26
26
---------------------------------
27
27
28
-
Once a dataset has been published, its metadata can be exported in a variety of other metadata standards and formats, which help make datasets more discoverable and usable in other systems, such as other data repositories. On each dataset page's metadata tab, the following exports are available:
28
+
Once a dataset has been published, its metadata can be exported in a variety of other metadata standards and formats, which help make datasets more :doc:`discoverable</admin/discoverability>` and usable in other systems, such as other data repositories. On each dataset page's metadata tab, the following exports are available: