documentation/1.0/datacubes.md (24 additions & 7 deletions)
@@ -27,6 +27,14 @@ A vector datacube on the other hand could look like this:
27
27
A raster datacube has at least two spatial dimensions (usually named `x` and `y`) and a vector datacube has at least one geometry dimension (usually named `geometry`).
28
28
The purpose of these distinctions is simply to make it easier to describe "special" cases of datacubes, but you can also define other types such as a temporal datacube that has at least one temporal dimension (usually named `t`).
29
29
30
+
The following additional information is usually available for datacubes:
31
+
32
+
- the dimensions (see [below](#dimensions))
33
+
- a sampling method (see [below](#values-in-a-datacube))
34
+
- a unit for the values
35
+
36
+
All of this information is usually provided through the datacube metadata.
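
As a rough illustration, the information listed above could be collected in a structure like the following Python sketch. The key names (`dimensions`, `sampling`, `unit`) and all values are hypothetical and chosen for readability; they are an assumption and not the openEO metadata schema.

```python
# Hypothetical datacube-level metadata; key names and values are illustrative only.
datacube_metadata = {
    "dimensions": ["x", "y", "t", "bands"],
    "sampling": "area",  # either "area" or "point"
    "unit": "m",         # made-up unit for the values in the cube
}


def describe(metadata: dict) -> str:
    """Summarise the datacube-level metadata in one line."""
    dims = ", ".join(metadata["dimensions"])
    return f"dimensions: {dims}; sampling: {metadata['sampling']}; unit: {metadata['unit']}"


print(describe(datacube_metadata))
```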
37
+
30
38
## Dimensions
31
39
32
40
A dimension refers to a certain axis of a datacube. This includes all variables (e.g. bands), which are represented as dimensions. Our example raster datacube has the spatial dimensions `x` and `y`, and the temporal dimension `t`. Furthermore, it has a `bands` dimension, extending into the realm of _what kind of information_ is contained in the cube.
@@ -39,9 +47,11 @@ The following properties are usually available for dimensions:
39
47
* labels (usually exposed through textual or numerical representations, in the metadata as nominal values and/or extents)
40
48
* reference system / projection
41
49
* resolution / step size
42
-
* unit (either explicitly specified or implicitly given by the reference system)
50
+
* unit for the labels (either explicitly specified or implicitly provided by the reference system)
43
51
* additional information specific to the dimension type (e.g. the geometry types for a dimension containing geometries)
44
52
53
+
All of this information is usually provided through the datacube metadata.
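
To make the dimension properties more tangible, here is a minimal Python sketch of a record that holds them. The class `Dimension`, its field names, the helper `infer_step`, and the example values are all assumptions made for illustration; they do not mirror any particular openEO client or back-end.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Dimension:
    """Hypothetical container for the dimension properties listed above."""
    name: str
    type: str
    labels: list
    reference_system: Optional[str] = None
    step: Optional[float] = None
    unit: Optional[str] = None                 # unit for the labels
    extra: dict = field(default_factory=dict)  # type-specific information


def infer_step(labels):
    """Return the step size if the labels are regularly spaced, else None."""
    steps = {b - a for a, b in zip(labels, labels[1:])}
    return steps.pop() if len(steps) == 1 else None


# Made-up example values.
x = Dimension("x", "spatial", labels=[0, 10, 20], reference_system="EPSG:32632", step=10, unit="m")
geometry = Dimension("geometry", "geometry", labels=[0, 1], extra={"geometry_types": ["Point", "MultiPoint"]})
```

For the spatial dimension `x`, the unit of the labels is implied by the reference system, while the geometry dimension carries its geometry types as additional, type-specific information.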
54
+
45
55
Here is an overview of the dimensions contained in our example raster datacube above:
46
56
47
57
| # | name | type | labels | resolution | reference system |
@@ -66,12 +76,6 @@ A dimension with geometries can consist of points, linestrings, polygons, multi
66
76
It is not possible to mix geometry types, but a single geometry type can be combined with its corresponding multi type in a dimension (e.g. points and multi points).
67
77
Empty geometries (such as GeoJSON features with a `null` geometry or GeoJSON geometries with an empty coordinates array) are allowed and can sometimes also be the result of certain vector operations such as a negative buffer.
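
The two rules above can be sketched in a few lines of Python. The helper names `can_share_dimension` and `is_empty_geometry` are hypothetical, the geometries are plain GeoJSON-like dictionaries, and geometry collections are deliberately not handled; treat this as an illustration under those assumptions, not as how a back-end validates geometries.

```python
def can_share_dimension(geometry_types):
    """True if all types are one single type and/or its corresponding multi type."""
    base_types = {t[len("Multi"):] if t.startswith("Multi") else t for t in geometry_types}
    return len(base_types) <= 1


def is_empty_geometry(geometry):
    """True for a null geometry or a geometry with an empty coordinates array."""
    if geometry is None:
        return True
    return "coordinates" in geometry and len(geometry["coordinates"]) == 0


# A GeoJSON feature with a null geometry, e.g. what a negative buffer may leave behind.
feature = {"type": "Feature", "geometry": None, "properties": {}}

print(can_share_dimension(["Point", "MultiPoint"]))  # allowed combination
print(can_share_dimension(["Point", "Polygon"]))     # mixed types
print(is_empty_geometry(feature["geometry"]))
```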
68
78
69
-
openEO datacubes contain scalar values (e.g. strings, numbers or boolean values), with all other associated attributes stored in dimensions (e.g. coordinates or timestamps). Attributes such as the CRS or the sensor can also be turned into dimensions. Be advised that in such a case, the uniqueness of pixel coordinates may be affected. While `(x, y)` usually refers to a unique location, this changes to `(x, y, CRS)` when `(x, y)` values are reused in other coordinate reference systems (e.g. two neighboring UTM zones).
70
-
71
-
::: tip Be Careful with Data Types
72
-
As stated above, datacubes only contain scalar values. However, implementations may differ in their ability to handle or convert them. Implementations may also not allow mixing data types in a datacube. For example, returning a boolean value for a reducer on a numerical datacube may result in an error on some back-ends. The recommendation is to not change the data type of values in a datacube unless the back-end supports it explicitly.
73
-
:::
74
-
75
79
### Applying Processes on Dimensions
76
80
77
81
Some processes are typically applied "along a dimension". You can imagine said dimension as an arrow and whatever is happening as a parallel process to that arrow. It simply means: "we focus on _this_ dimension right now".
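
As a plain-Python analogy (not the openEO API), the following sketch applies a reducer along the temporal dimension of a tiny cube stored as nested lists in the order `(t, y, x)`. The function name `reduce_along_t` and the data are made up for illustration.

```python
from statistics import mean

# Hypothetical cube with dimensions (t, y, x): three time steps of a 2x2 grid.
cube = [
    [[1, 2], [3, 4]],
    [[3, 4], [5, 6]],
    [[5, 6], [7, 8]],
]


def reduce_along_t(cube, reducer):
    """Collapse the t dimension by applying the reducer to each pixel's time series."""
    n_y, n_x = len(cube[0]), len(cube[0][0])
    return [
        [reducer([time_step[y][x] for time_step in cube]) for x in range(n_x)]
        for y in range(n_y)
    ]


result = reduce_along_t(cube, mean)
print(result)
```

The reducer only ever sees the values along `t` for one `(y, x)` position, which is what "focusing on this dimension" means here; the result keeps the `y` and `x` dimensions and no longer has `t`.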
@@ -88,6 +92,19 @@ Resampling is however costly, involves (some) data loss, and is in general not r
88
92
89
93
On such a _crs-dimensioned datacube_, several operations make perfect sense, such as `apply` or `reduce_dimension` on spectral and/or temporal dimensions. A simple reduction over the `crs` dimension using _sum_ or _mean_ would typically not make sense. The "reduction" (removal) of the `crs` dimension that is meaningful involves the resampling/warping of all sub-cubes for the `crs` dimension to a single, common target coordinate reference system.
90
94
95
+
## Values in a Datacube
96
+
97
+
openEO datacubes contain scalar values (e.g. strings, numbers or boolean values), with all other associated attributes stored in dimensions (e.g. coordinates or timestamps). Attributes such as the CRS or the sensor can also be turned into dimensions. Be advised that in such a case, the uniqueness of pixel coordinates may be affected. While `(x, y)` usually refers to a unique location, this changes to `(x, y, CRS)` when `(x, y)` values are reused in other coordinate reference systems (e.g. two neighboring UTM zones).
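
The effect on coordinate uniqueness can be shown with a small Python sketch. The observations are invented: the same `(x, y)` pair occurs in two neighboring UTM zones (EPSG:32632 and EPSG:32633), and the values are made up.

```python
# Hypothetical observations as (x, y, crs, value).
observations = [
    (500000, 5000000, "EPSG:32632", 0.21),
    (500000, 5000000, "EPSG:32633", 0.35),
]

# Keyed by (x, y) alone, the second observation overwrites the first.
by_xy = {(x, y): value for x, y, crs, value in observations}

# Keyed by (x, y, CRS), both observations remain distinguishable.
by_xy_crs = {(x, y, crs): value for x, y, crs, value in observations}

print(len(by_xy), len(by_xy_crs))
```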
98
+
99
+
::: tip Be Careful with Data Types
100
+
As stated above, datacubes only contain scalar values. However, implementations may differ in their ability to handle or convert them. Implementations may also not allow mixing data types in a datacube. For example, returning a boolean value for a reducer on a numerical datacube may result in an error on some back-ends. The recommendation is to not change the data type of values in a datacube unless the back-end supports it explicitly.
101
+
:::
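
The data type issue from the tip can be illustrated in plain Python. The helpers `any_above` and `keeps_data_type` are hypothetical and only show how a reducer can silently change the type of the values; whether this causes an error depends on the back-end.

```python
values = [0.2, 0.5, 0.9]  # made-up numerical values of one pixel over time


def any_above(values, threshold):
    """A reducer that returns a boolean instead of a number."""
    return any(v > threshold for v in values)


def keeps_data_type(inputs, output):
    """True if the reducer result has the same type as the input values."""
    return type(output) is type(inputs[0])


print(keeps_data_type(values, max(values)))             # float stays float
print(keeps_data_type(values, any_above(values, 0.8)))  # float becomes bool
```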
102
+
103
+
Datacube values can be sampled in two different ways: as area samples or as point samples.
104
+
105
+
- Area sampling aggregates measurements over defined regions, i.e. the grid cells for raster data or polygons/lines for vector data.
106
+
- Point sampling records values at specific locations, so each value describes exactly one point.
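
The difference between the two sampling methods can be sketched in Python along a single axis. The measurements, the cell size, the function names, and the choice of the mean as the aggregation are all assumptions for illustration.

```python
from statistics import mean

# Hypothetical fine-grained measurements along one axis.
measurements = [1, 3, 5, 7, 9, 11, 13, 15]
cell_size = 4  # number of measurements covered by one grid cell


def area_samples(values, cell_size):
    """One value per cell, aggregated (here: averaged) over the whole cell."""
    return [mean(values[i:i + cell_size]) for i in range(0, len(values), cell_size)]


def point_samples(values, cell_size):
    """One value per cell, taken at a single location (here: the start of the cell)."""
    return [values[i] for i in range(0, len(values), cell_size)]


print(area_samples(measurements, cell_size))
print(point_samples(measurements, cell_size))
```

Both results have one value per cell, but an area sample summarises everything inside the cell while a point sample says nothing about the rest of it.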
107
+
91
108
## Processes on Datacubes
92
109
93
110
In the following part, the basic processes for manipulating datacubes are introduced.