Rendering Hints extension (WIP) #879

Closed

Contributor

@cholmes cholmes commented Aug 18, 2020

Related Issue(s): #807

Proposed Changes:

  1. Added rendering hints extension

PR Checklist:

  • This PR is made against the dev branch (all proposed changes except releases should be against dev, not master).
  • This PR has no breaking changes.
  • I have added my changes to the CHANGELOG or a CHANGELOG entry is not required.
  • This PR affects the STAC API spec, and I have opened issue/PR #XXX to track the change.

@cholmes
Contributor Author

cholmes commented Aug 18, 2020

Ok, first draft is up. Detailed review is appreciated since I am not sure that I explained everything right. Any more details on the 'why' would also be good to add.

Still need to make an example and provide a schema. But feedback on the core ideas would be great.

Also am thinking about making the schema check zoom levels between 0 and 25? I welcome ideas on what to put for the max, but would be good to check to make sure users aren't putting in like 500 for the zoom levels.

@cholmes cholmes marked this pull request as draft August 18, 2020 20:57

@geospatial-jeff geospatial-jeff left a comment

Should we include raster statistics (min/mean/max) as optional fields in this extension?

| Type Name | Description |
|-----------|-------------|
| `unknown` | Not known |
| `byte` | An unsigned 8-bit integer (common for 8-bit rgb png's) |

Contributor

as far as type names go, int8 is numpy's reference to the type:

In [1]: import numpy as np

In [2]: np.int8
Out[2]: numpy.int8

In [3]: np.byte
Out[3]: numpy.int8

And rasterio has a similar idea:

In [1]: import rasterio

In [2]: src = rasterio.open("<a tiff i have>")

In [3]: src.dtypes
Out[3]: ('uint8', 'uint8', 'uint8')

I don't think it's super important but it's not an uncommon name for the type

Contributor

Oh no I scrolled up in gitter and now see the numpy / rasterio discussion 🙃

Contributor Author

I really know very little about these things, so I'm happy for whatever you all think is best. I just want to have a clear list for people to pick from.

Contributor

Oh sorry; I thought byte was an alias for uint8, not int8. I see Numpy has ubyte as the alias for uint8. uint8 is the more common one for imagery as you saw
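
To double-check the aliases mentioned above, here is a short sketch (assuming NumPy is installed) that compares the dtypes and collects the value range of each, since the range is what a renderer ultimately cares about:

```python
import numpy as np

# np.byte is the signed 8-bit type; np.ubyte is the unsigned one.
assert np.dtype(np.byte) == np.dtype(np.int8)
assert np.dtype(np.ubyte) == np.dtype(np.uint8)

# Collect the representable range of each 8-bit integer type.
ranges = {}
for name in ("int8", "uint8"):
    info = np.iinfo(np.dtype(name))
    ranges[name] = (int(info.min), int(info.max))

print(ranges)
```

So listing both `uint8` and `int8` in the table is unambiguous, whereas `byte` alone could be read either way.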

Contributor Author

So I should change this to just be uint8?

Contributor

It doesn't hurt to include both I guess

| Type Name | Description |
|-----------|-------------|
| `uint8` | 8-bit unsigned integer |
| `int8` | 8-bit signed integer |


## Item Properties fields

| Field Name | Type | Description |

Contributor

(throwing this here because it seems like the most appropriate place to ask about included fields)

I don't know what the appropriate scope is for this extension, but I think the main hint I would want, if it's available, is a colormap from the tif. I'm a little bit confused about what I'd do with the min and max zoom hints.

Contributor Author

I've been interested in having that type of stuff in here, though I'm far from an expert on it. I think it can be in scope, and I was anticipating it'd grow to include similar things. But I'm more than happy to include it now. Just give me more details of what that actually looks like - what is the field, what's in it, how do you explain it. Or (preferred) feel free to add a 'committable suggestion' (or whatever it is called) on the PR and I'll add it in.

Contributor

@kylebarron kylebarron Aug 19, 2020

> I'm a little bit confused about what I'd do with the min and max zoom hints

Min and max zoom hints are helpful for on-demand rendering. The max_zoom is generally discoverable from the gsd separately, but the min_zoom depends on the number of overview levels in the file (usually a Cloud-Optimized GeoTIFF).

That gets me thinking... what if a different way to describe min/max zoom is the gsd of the full-resolution data vs the gsd of the smallest overview? the main gsd is of course already stored per band. Storing the min_overview_gsd [of each band?] would allow the dynamic tiler to derive the min/max zoom levels, and would additionally support non-mercator rendering by being able to derive those zoom levels from the gsd values

Collaborator

@m-mohr m-mohr Aug 19, 2020

Additionally, for visualization it's sometimes important to know the min/max values (not zoom, actual values) for the data. Sometimes that can't be well determined from the data_type. For example, normalized differences are usually between -1 and 1, but none of the data types really caters for that.

Additionally, I would like to use some of the fields from this extension, but don't always care about zoom levels. So having that required seems to make this less useful. But I'm not exactly sure what to require. Maybe just one of the properties? (Schema: minProperties: 1) Otherwise I'd probably just fill the min/max values for zoom. They should be mentioned in the docs, I think.

Contributor Author

min_overview_gsd does seem a bit cleaner. Is there an easy way to calculate that? Would want to give people some reference of how to get it.

min/max values sounds good to add. I'm for it. What should the field name be?

And minProperties: 1 sounds good - I only knew the one use case, but if you have a use case that involves not using them then I'm happy to not make that one required.

Contributor

Responded in main thread.

@kylebarron
Contributor

> Also am thinking about making the schema check zoom levels between 0 and 25?

@vincentsarago uses 30 as the max zoom in his mosaic code.
https://github.com/developmentseed/cogeo-mosaic/blob/157d31972398d8f1d8e0a13e8290e8ddb6f1a33a/cogeo_mosaic/mosaic.py#L61-L62

@m-mohr
Collaborator

m-mohr commented Aug 19, 2020

@DanielJDufour I think this one could be interesting for you and I think you have some expertise to share from your work on geotiff-stats, geotiff.io, georaster etc. Could you have a look, please?

@m-mohr
Collaborator

m-mohr commented Aug 19, 2020

@cholmes Let me know whenever you are finished and need a JSON Schema ;-)

@DanielJDufour

DanielJDufour commented Aug 19, 2020

Hello. I apologize if my suggestions are already covered by other extensions (I'm still learning about what is out there). I submit the following items for your consideration:

minimum and maximum pixel values for each band of each asset

This is useful for rendering single-band images without a color palette. It could also help render a single band from a Landsat scene. I should note that GDAL can embed this information as XML into a GeoTIFF's metadata. However, I'm not sure if you can do this for other types of gridded data (e.g., JPEG 2000). Maybe someone else can provide information on that. And JSON is easier to work with if you are doing anything client-side in the browser.

more projection information

It's probably because of my lack of knowledge about STAC that I suggest this. But is there currently a way to grab the projection information for an asset? Ideally we could have the WKT and maybe a proj4js string easily available. Rendering a GeoTIFF asset in a projection other than its own requires this information. My understanding is that this has already been solved in Python via GDAL/Rasterio, but the JS ecosystem doesn't yet have a solution for converting a GeoTIFF's keys to a proj4js or WKT string. I started a little of this work here, but there's a long way to go. This is also currently (imho) the biggest blocker to creating an OpenLayers plugin (ol-geotiff) that can handle and display any GeoTIFF (and doesn't rely on external services). It's also the most common issue that comes up with GeoRasterLayer.

color palette

(I think someone mentioned this already). The color palette is basically the mapping of a pixel value to an RGBA Color. GeoTIFFs often store color palette imagery in a somewhat compressed format where you don't store all the mappings, but instead the scaled step between pixel values. geotiff-palette basically expands this info into a simple array of RGBA values where the index number in the array refers to its pixel value. For 8-bit imagery, there would be 256 [R,G,B,A] arrays, but for imagery with a larger range of values (e.g. float/decimal values), you could have a very large palette, so it probably makes sense to store this information in the compressed format. Open to correction because I don't use palettes that much.

histograms

Similar to palettes, the size of the histogram depends on the number of unique values found in the raster. GDAL will bin/group this data for you, which will shrink the space required to store this information. However, it's often useful to have direct access to the raw un-binned counts for each pixel value, especially if the renderer would like to bin it with different bin sizes. The raw data could get very large if used on a raster with more than 8 bits per pixel.
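
To illustrate why raw counts are handy, here is a minimal sketch of re-binning them on the client with an arbitrary bin width. It assumes integer pixel values; `rebin` and the sample values are hypothetical, not part of any existing library or dataset:

```python
from collections import Counter


def rebin(raw_counts, bin_width):
    """Group per-value counts into bins labelled by their lower edge."""
    binned = Counter()
    for value, count in raw_counts.items():
        lower_edge = (value // bin_width) * bin_width
        binned[lower_edge] += count
    return dict(sorted(binned.items()))


# Made-up pixel values standing in for a raw (un-binned) histogram.
raw = Counter([0, 1, 1, 5, 9, 10, 255])
binned = rebin(raw, 10)
print(binned)
```

Going the other way (from coarse bins back to raw counts) is not possible, which is the argument for exposing the raw counts when they are small enough.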

tile-level information

It would be awesome if there was a standard around storing not just file-level statistics, but the stats for each tile. It could be helpful for customizing/stretching the rendering relative to tiles that are actually being displayed. Sometimes people will want to apply a threshold (i.e., only show pixel values over 300), so having access to the tile-level minimum, could help us avoid extra calls for tiles without this information. Here's an example of displaying a GeoTIFF that displays the areas where tropical fruit is grown in Puerto Rico: https://geotiff.github.io/georaster-layer-for-leaflet-example/examples/thresholding.html

mask

This might have been mentioned earlier and I've seen it used in some of the awesome tilers out there. It might be useful in some situations to have access to a polygon representing how the data should be masked (i.e. where the no-data values are). It can help avoid extra calls for pixels in an area that is only no-data values.

range of pixel values for each band

This could be optional or not included in the extension because it's basically derived by subtracting the minimum pixel value from the maximum pixel value. However, it could be useful and remove one (albeit simple) step. Here's an example of the range (along with min and max) being used: https://github.com/GeoTIFF/georaster-layer-for-leaflet/blob/master/georaster-layer-for-leaflet.js#L377

official rendering functions

I'm not sure about this one, but thought I should share it in case it sparks conversation. I see its usefulness but also its drawbacks. It could be interesting if there was a way that data holders can provide validated band arithmetic functions. I've personally found it difficult to come up with these equations because of all the caveats. For example, although NDVI is a rather simple equation on the face of it, in practice, one often has to add in caveats like making sure water displays correctly. We could also consider using a generic language like the one proposed by stac-expr, but I've heard it's more useful for people to have code in the language they would actually run the arithmetic in, like Python or JavaScript.
For example, we could have a JSON structure look like:

"expressions": {
    "ndvi": {
        "js": "result = (nir - red) / (nir + red); return result <= 0.1 ? 'blue' : result >= 0.8 ? 'black' : result",
        "python": ...
   }
}

Although these expressions can get long and complicated, that's precisely why I think it could be useful for people who simply want to display NDVI without having to do the Math themselves. However, it would increase the maintenance cost and what happens when an equation gets updated? Would this change any applications depending on the previous rendering equation? Is that a good or bad thing? What do you think @m-mohr ? Would love your thoughts on this.

Supervised Classification and AI model output

This is probably out of scope, but if we really wanted to go overboard, we could include the results of supervised classification. For example, we could store the range for water values:

"classification_results": {
    "corn": "0.4 < b01 < 0.9",
    "water": "(nir - red) / (nir + red) <= 0.1"
}

I'd also like to invite @rowanwins to offer some feedback. He's done a lot of good work with GeoTIFFs and might also have some good suggestions to add :-) He's mentioned standard deviation and quantiles being useful before (source).

Apologies again for the lengthy post. I think there's a clear use case for min/max, projection info, color palette, histogram, tile-level info, and mask. However, some of this might already be covered by other extensions. I'm also unsure about whether it makes sense to include range, band arithmetic, and classification output in the STAC extension because it would increase the scope of it and might make sense to separate concerns more.

Looking forward to your thoughts and feedback.

@cholmes
Contributor Author

cholmes commented Aug 19, 2020

Thanks for all the extensive comments @DanielJDufour! Really appreciate it. A few responses:

minimum and maximum pixel values for each band of each asset

@m-mohr suggested this too. Let's get it in. min_pixel_value and max_pixel_value as two fields? And what's the range of values we should accept here, and what type? For the json schema, and to explain.

more projection information

https://github.com/radiantearth/stac-spec/tree/master/extensions/projection is pretty extensive - let me know if there's anything that's needed for you, but I think it should be sufficient. We also recently added transform & shape, to enable things like VRT's without having to open the files.

color palette

This sounds like more of an asset than a property? Like it's a small file you reference (or even be embedded in the geotiff?), not something you'd embed in JSON? Or do I have it wrong? I think that'd just be an asset. I don't know that other extensions yet specify assets, but it seems like it would make sense to me.

histogram

This also sounds interesting to me, and also sounds like more of an asset to reference? I'd love an example with a color palette and a histogram, so we can show directly what it would look like for people, and provide a best practice for this stuff.

tile-level information

This one feels outside the scope of STAC to me? Unless maybe if you're using the tiled asset extension

mask

Feels like another asset? Again, if someone can get me an example mask I can probably make an example with it, and recommend it.

range of pixel values for each band

I'm inclined to just leave this for people to get by subtracting min/max. So that there's not a situation where the values don't agree with one another, like someone changes one but doesn't update the other.

official rendering functions

I'm into this idea in general - like getting to standard rendering functions. At Planet we're experimenting with this, and I would have loved a standard to point at. In the interest of 'small pieces loosely coupled' I'm inclined to aim for a standard on this that stands alone / isn't tied to STAC. But that could be referenced by STAC. Since I think there are people who would want to use this without having to understand STAC.

Supervised Classification and AI model output

Similar feelings to the official rendering functions - cool to have, let's do in its own spec.

@kylebarron
Contributor

A few thoughts

Min/Max gsd instead of Min/Max zoom

After some thought and discussion with @geospatial-jeff and @vincentsarago, I'd like to propose that instead of minzoom and maxzoom, we have maxgsd and mingsd, where gsd already exists in STAC and is defined as "Ground Sample Distance at the sensor", and the other is the gsd that corresponds to the smallest overview level.

Using gsd instead of zoom is more applicable to a wider range of applications because gsd is essentially using the "local projection" (i.e. UTM in meters) rather than Web Mercator at the equator. This means that for applications that want to visualize the STAC item not in Web Mercator, gsd will be more helpful.

Additionally, converting gsd to zoom is very simple (5 lines of code) in a range of projections. Web Mercator; Arbitrary TileMatrixSet.
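
As a rough sketch of that conversion for Web Mercator, assuming 256-pixel tiles and the spherical radius used by EPSG:3857: `zoom_for_gsd` is a hypothetical helper name, the two gsd values are illustrative, and whether to round the result up or down is left to the renderer.

```python
import math

EARTH_RADIUS_M = 6378137.0  # spherical radius used by Web Mercator (EPSG:3857)
TILE_SIZE_PX = 256          # assumption: 256-pixel tiles


def zoom_for_gsd(gsd_m, latitude_deg=0.0):
    """Return the fractional Web Mercator zoom whose pixel size matches gsd_m."""
    # Metres per pixel at zoom 0 on the equator.
    zoom0_res = 2 * math.pi * EARTH_RADIUS_M / TILE_SIZE_PX
    # A pixel covers less ground away from the equator.
    zoom0_res *= math.cos(math.radians(latitude_deg))
    return math.log2(zoom0_res / gsd_m)


max_zoom = zoom_for_gsd(0.6)   # illustrative full-resolution gsd
min_zoom = zoom_for_gsd(19.2)  # illustrative gsd of the smallest overview
print(round(max_zoom, 2), round(min_zoom, 2))
```

Each halving of the gsd adds one zoom level, so a 32x difference in gsd spans five zoom levels.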

minimum and maximum pixel values for each band of each asset

If you know the dtype is uint16; then you know the max value of each band is 65535, no? Are there real-world examples of the dtype set to uint16 but not all of the range is used?

more projection information

This is already covered with the proj extension I believe. In Sentinel 2 COG STAC this exists per band:

"proj:shape": [1830, 1830],
"proj:transform": [
  60.0,
  0.0,
  600000.0,
  0.0,
  -60.0,
  1800000.0,
  0.0,
  0.0,
  1.0
]
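
For anyone wondering how those two fields are enough to place the raster without opening the file, here is a small sketch. It assumes `proj:transform` is a row-major 3x3 affine matrix (the ordering rasterio uses) and that `proj:shape` is `[rows, cols]`; `pixel_to_crs` is a hypothetical helper.

```python
def pixel_to_crs(transform, col, row):
    """Map a pixel corner (col, row) to CRS coordinates with an affine transform."""
    a, b, c, d, e, f = transform[:6]
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


transform = [60.0, 0.0, 600000.0, 0.0, -60.0, 1800000.0, 0.0, 0.0, 1.0]
rows, cols = 1830, 1830  # proj:shape

upper_left = pixel_to_crs(transform, 0, 0)
lower_right = pixel_to_crs(transform, cols, rows)
print(upper_left, lower_right)
```

The absolute value of the first coefficient (60.0 here) is also the pixel size in CRS units, which is where a renderer can read the full-resolution gsd from.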

mask

At least with Sentinel 2 COG STAC, the primary STAC geometry already includes only the valid portion of the image.

official rendering functions

Supervised Classification and AI model output

IMO a black hole and out of scope for STAC. I would argue STAC should only be for objective data and these both seem very subjective. Hence this should be a client choice. Additionally, it seems to be collection-level information that doesn't vary at the item level.

tile-level information

If I understand you correctly, you're suggesting info for each internal tile of each overview? This seems like a ton of data and would make each STAC item huge. For a 640MB NAIP COG, it has 908 tiles, so storing tile-level information for each one would make the STAC json at least an order of magnitude larger. And it would make it much harder to store an entire collection of STACs with this info per tile.

@kylebarron
Contributor

> color palette
>
> This sounds like more of an asset than a property? Like it's a small file you reference (or even be embedded in the geotiff?), not something you'd embed in JSON? Or do I have it wrong? I think that'd just be an asset. I don't know that other extensions yet specify assets, but it seems like it would make sense to me.

I think these are generally embedded within the GeoTIFF, but I could be wrong.

@cholmes
Contributor Author

cholmes commented Aug 19, 2020

> Should we include raster statistics (min/mean/max) as optional fields in this extension?

@geospatial-jeff - I'm into this in theory, but I struggle with how to implement. I'm guessing you mean like the results of gdal with -stats:

STATISTICS_MAXIMUM=255
STATISTICS_MEAN=28.054506440187
STATISTICS_MINIMUM=0
STATISTICS_STDDEV=42.88762832552
STATISTICS_VALID_PERCENT=100

I could see it working if all data distributed one band per asset, since then you can just have those fields on each asset. But many have one asset with several bands, so we'd need like some nested json object to give the values for each band. And that just seems to get pretty messy.
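
To make the "nested json object" idea concrete, here is a sketch that parses GDAL-style `STATISTICS_*` lines into a per-band dictionary. The `bands`/`stats` layout and the band name are purely hypothetical, not something this extension defines:

```python
def parse_gdal_stats(lines):
    """Turn 'STATISTICS_KEY=value' lines into a {key: float} dictionary."""
    stats = {}
    for line in lines:
        key, _, value = line.strip().partition("=")
        if key.startswith("STATISTICS_"):
            stats[key[len("STATISTICS_"):].lower()] = float(value)
    return stats


band_1 = parse_gdal_stats([
    "STATISTICS_MAXIMUM=255",
    "STATISTICS_MEAN=28.054506440187",
    "STATISTICS_MINIMUM=0",
    "STATISTICS_STDDEV=42.88762832552",
    "STATISTICS_VALID_PERCENT=100",
])

# Hypothetical nesting: one stats object per band inside a multi-band asset.
asset = {"bands": [{"name": "band-1", "stats": band_1}]}
print(asset)
```

With several bands per asset the list simply grows by one entry per band, which is the messiness being weighed here.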

Seems like we should just get this written up as an 'extension' on the COG spec. Like it feels much cleaner at the tiff tag level. I've been meaning to do that since @mojodna came up with the idea.

@cholmes
Contributor Author

cholmes commented Aug 19, 2020

> Min/Max gsd instead of Min/Max zoom
>
> After some thought and discussion with @geospatial-jeff and @vincentsarago, I'd like to propose that instead of minzoom and maxzoom, we have maxgsd and mingsd, where gsd already exists in STAC and is defined as "Ground Sample Distance at the sensor", and the other is the gsd that corresponds to the smallest overview level.

I'm on board. Seems like we just need min_gsd for this extension then? And perhaps a recommendation on how to use gsd to calculate the max? I'll take a crack at it.

> minimum and maximum pixel values for each band of each asset
>
> If you know the dtype is uint16; then you know the max value of each band is 65535, no? Are there real-world examples of the dtype set to uint16 but not all of the range is used?

I had thought someone mentioned this in the thread, but now I can't seem to find it. But I think the real-world example is NDVI output, that is often just -1 to 1. Unless I'm off on that one. That'd obviously be a float, so I don't know of a uint16 example.

> mask
>
> At least with Sentinel 2 COG STAC, the primary STAC geometry already includes only the valid portion of the image.

Ah, true. We should probably call this out specifically. I think in the label extension we do, since often it just uses a subset of an overall valid image. But perhaps we call it out for EO as well, or even make it a general recommendation - the polygon you use should be the valid portion, not including black fill, etc. I think we'd want extensions to be able to change that behavior if there's some reason to, but could be good to make clear that should be the default. Definitely seems more useful.

@kylebarron
Contributor

kylebarron commented Aug 19, 2020

> > Min/Max gsd instead of Min/Max zoom
> >
> > After some thought and discussion with @geospatial-jeff and @vincentsarago, I'd like to propose that instead of minzoom and maxzoom, we have maxgsd and mingsd, where gsd already exists in STAC and is defined as "Ground Sample Distance at the sensor", and the other is the gsd that corresponds to the smallest overview level.
>
> I'm on board. Seems like we just need min_gsd for this extension then? And perhaps a recommendation on how to use gsd to calculate the max? I'll take a crack at it.

Overviews have associated decimation factors, which I believe give the factor by which each dimension of the overview is reduced relative to the full-resolution image.

Running rio cogeo info on a GeoTIFF from the naip-analytic S3 bucket shows:

    Id      Size           BlockSize     Decimation
    0       10478x12642    512x512       0
    1       5239x6321      128x128       2
    2       2620x3161      128x128       4
    3       1310x1581      128x128       8
    4       655x791        128x128       16
    5       328x396        128x128       32

Then given that the gsd for the full-resolution image is 0.6 meters, the gsd of the level-5 overview I presume would be 0.6 * 32 = 19.2 meters. min might not be the best prefix since the value is necessarily larger than gsd, because each pixel of the overview covers a larger area.
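
The arithmetic above can be sketched as follows. `overview_gsd` is a hypothetical helper; with rasterio, `src.overviews(1)` should return a list of decimation factors like the one used here for band 1.

```python
def overview_gsd(gsd, decimations):
    """Return the gsd of the coarsest overview, or gsd itself if there are none."""
    # Ignore the full-resolution level, which some tools report as 0 or 1.
    factors = [d for d in decimations if d > 1]
    return gsd * max(factors) if factors else gsd


# Decimation column from the rio cogeo info output above.
coarsest = overview_gsd(0.6, [0, 2, 4, 8, 16, 32])
print(coarsest)
```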

> > minimum and maximum pixel values for each band of each asset
> >
> > If you know the dtype is uint16; then you know the max value of each band is 65535, no? Are there real-world examples of the dtype set to uint16 but not all of the range is used?
>
> I had thought someone mentioned this in the thread, but now I can't seem to find it. But I think the real-world example is NDVI output, that is often just -1 to 1. Unless I'm off on that one. That'd obviously be a float, so I don't know of a uint16 example.

This seems strange to me. I understand that NDVI output ranges from -1 to 1, but why would you store the literal values in a GeoTIFF as -1 to 1? You'd be either losing precision or making a larger file size. If the source data comes as uint16, it would seem much better to store the NDVI also as uint16, where you wouldn't lose any precision and still store each pixel with 16 bits. Then 0 in the file maps to a logical -1 and 65535 in the file maps to a logical 1.
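
A minimal sketch of that mapping, assuming a plain linear scale where 0 stands for -1 and 65535 stands for 1 (the function names are hypothetical):

```python
UINT16_MAX = 65535


def encode_ndvi(ndvi):
    """Map a logical NDVI in [-1, 1] to a stored uint16 value."""
    clipped = max(-1.0, min(1.0, ndvi))
    return round((clipped + 1.0) / 2.0 * UINT16_MAX)


def decode_ndvi(stored):
    """Map a stored uint16 value back to a logical NDVI in [-1, 1]."""
    return stored / UINT16_MAX * 2.0 - 1.0


stored = encode_ndvi(0.25)
print(stored, decode_ndvi(stored))
```

A reader still needs to know the scale and offset to recover the logical values, so this approach does not remove the need for some metadata alongside the data type.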

In any case, this isn't something I feel strongly about either way, and it's not too much space to store min and max.

@cholmes
Contributor Author

cholmes commented Aug 19, 2020

> min might not be the best prefix since the value is necessarily larger than gsd, because each pixel of the overview covers a larger area.

Hrm good point. overview_gsd or something like that? max_overview_gsd?

@geospatial-jeff

> Are there real-world examples of the dtype set to uint16 but not all of the range is used?

Another good real-world example: satellite imagery is often captured at 11 to 14 bits but stored in a 16-bit image.
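
A small sketch of why this matters for rendering, assuming a simple linear stretch to 8 bits (`stretch_to_byte` is a hypothetical helper): the brightest possible 12-bit value comes out nearly black if the renderer stretches over the full uint16 range.

```python
def stretch_to_byte(value, vmin, vmax):
    """Linearly rescale value from [vmin, vmax] to the 0-255 display range."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    clipped = max(vmin, min(vmax, value))
    return round((clipped - vmin) / (vmax - vmin) * 255)


brightest_12bit = 4095
by_dtype = stretch_to_byte(brightest_12bit, 0, 65535)  # range implied by uint16
by_actual = stretch_to_byte(brightest_12bit, 0, 4095)  # range actually used
print(by_dtype, by_actual)
```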

@cholmes
Contributor Author

cholmes commented Aug 20, 2020

As I sat down to try to add the new things discussed it occurs to me that the min/max values per band is the same situation as the stats - we need some sort of per band construct.

Oh wait, we just decided that the 'bands' information should be included in every item. So I think we could just add on fields to the band. That would be an extension of an extension, which I think is ok? But not sure if that should be a new extension or done in this one.

I think for now I'm going to consider those out of scope, but file new issues for them, so that we can get this extension out the door.

@cholmes
Contributor Author

cholmes commented Aug 20, 2020

Ok, so I think this draft is getting close to ready. Somehow with all the great suggestions we ended up with fewer fields. I'll file a ticket on the min/max values, it'll just be trickier to integrate with eo bands.

@kylebarron @geospatial-jeff @vincentsarago - a final review would be great. Also, could one of you help me put together a real-world example? For instance, could you figure out the overview gsd of one of the Planet STAC sample images? And the data type?

@m-mohr - a schema would be great at this point too, hopefully not much will change.

@m-mohr
Collaborator

m-mohr commented Aug 24, 2020

@cholmes In general this looks good, but (1) it should have at least one example and (2) I'm not sure I follow you on min/max values. What's the exact reason to not include them here? It feels like it exactly fits into this extension and we named a couple of use cases already. min/max are likely closely bound to eo:bands, but could also apply to data without bands, like SAR or so. So I'd actually define them on the same level as data type, but also allow them in bands, I guess. We may re-use the Stats Objects from Collections, which is extensible for mean, median or whatever...

@DanielJDufour Thanks! Chris pretty much summarized my thoughts, too. I'd love to see a color palette extension though. Many of the other things are either already covered or will be covered soon. Great!

I'll come up with a Schema soon.

| render:overview_max_gsd | number | The maximum Ground Sample Distance represented in an overview. This should be the GSD of the highest level overview, generally of a [Cloud Optimized GeoTIFF](http://cogeo.org), but should work with any format. |
| render:data_type | string | The data `type` (float, int, complex, etc) to let the renderer apply any needed rescaling up front. The full set of options is listed below. |

**render:overview_max_gsd**: This field helps renderers of understand what zoom levels they can efficiently show. It is

Suggested change
**render:overview_max_gsd**: This field helps renderers of understand what zoom levels they can efficiently show. It is
**render:overview_max_gsd**: This field helps renderers understand what zoom levels they can efficiently show. It is

@cholmes
Contributor Author

cholmes commented Aug 25, 2020

@m-mohr: 1) I'll get an example in - wanted to get a real example, and just saw @kylebarron provided the info I need to do it. Will try to get it in within a week (on vacation and have a ton of sprint organizing to do). 2) Midway through writing I realized that it was not as hard as I thought, since we use the bands object. But I think the one thing I'm less clear on is 'extending an extension', since bands isn't in core. Just not sure if we do it all in one extension, where some of the fields don't need to extend eo.

@kylebarron - thanks! This is helpful.

It does feel like we should have a good 'best practices' section on this stuff, with how to figure out quadkeys, zoom levels, etc. And ideally those link to code in various languages eventually.

@m-mohr
Collaborator

m-mohr commented Aug 25, 2020

I think it shouldn't only live in bands, because there's more data than EO.

@cholmes
Contributor Author

cholmes commented Aug 31, 2020

@m-mohr - how do we specify it then? Just at the asset level? And then note that assets can use it in their bands objects if desired?

@m-mohr
Collaborator

m-mohr commented Aug 31, 2020

There's likely not an easy answer to this. It all depends on the underlying data structure / file format. I guess it needs to be available at eo:bands and in assets, assets being the primary place where it lives in, but in future extensions it could also live in other places. Like in data cubes for example it would also be useful and there's likely more...

Maybe we should just make this extension work for assets and then say in the EO and data cube extension that the rendering extension can be used in their Band / Dimension Objects?

@cholmes
Contributor Author

cholmes commented Aug 31, 2020

> Maybe we should just make this extension work for assets and then say in the EO and data cube extension that the rendering extension can be used in their Band / Dimension Objects?

Yeah, that's probably a good way to do it. Specify here, and then in EO it can mention that rendering hints extension can be used at the bands level.

Also should we specify at the item property level and say that it's usually just used at the asset level? So that items with only one asset can use it? Or is it easier to just keep it at the asset level?

@m-mohr
Collaborator

m-mohr commented Aug 31, 2020

I'd go with the properties level, I guess. For summary reasons, and because we did that for most fields recently and just allow everything in assets as well. But no strong preference from my side.

@m-mohr
Collaborator

m-mohr commented Sep 1, 2020

Should we add the min/max values per data type to the table? Then these would be default values for the min/max fields, if not provided, right?

@cholmes
Contributor Author

cholmes commented Sep 1, 2020

> Should we add the min/max values per data type to the table? Then these would be default values for the min/max fields, if not provided, right?

Yes, great idea.
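
As a sketch of how those defaults could be derived for the integer types rather than typed in by hand (`integer_range` is a hypothetical helper, and float or complex types would need a separate decision):

```python
import re


def integer_range(type_name):
    """Return (min, max) for names like 'uint8' or 'int16'."""
    match = re.fullmatch(r"(u?)int(8|16|32|64)", type_name)
    if match is None:
        raise ValueError(f"not an integer type name: {type_name!r}")
    unsigned = match.group(1) == "u"
    bits = int(match.group(2))
    if unsigned:
        return 0, 2**bits - 1
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


defaults = {name: integer_range(name) for name in ("uint8", "int8", "uint16", "int16")}
print(defaults)
```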

@m-mohr
Collaborator

m-mohr commented Dec 14, 2020

Any progress on this? I'd like to re-use it in CARD4L... Thinking whether the data_type would actually make sense in a "file" extension, see #921 (comment).

@m-mohr
Collaborator

m-mohr commented Dec 16, 2020

#934 also proposes to add the data type field, so we may remove it here. Instead we could add min/max values.

@cholmes
Contributor Author

cholmes commented Dec 16, 2020

Yeah, just noticed the data type field in the file extension. I agree it makes more sense there, it's more general. So let's remove it here.

As for progress, I'm hoping to work on it sometime soon - paused all my stac core spec work in favor of API.

@kylebarron
Contributor

Anything I can do to help move this forward? We're removing the data type from this ext since it's now in the file extension, and adding data range?

One question about data range: would it represent the "global" minimum and maximum across the dataset or the "local" minimum and maximum within that specific scene? For example, Sentinel 2 data that's 12 bits would presumably have a global min/max of 0-4095, but if the specific scene isn't fully saturated, the scene's min and max might be a smaller range.

@cholmes
Contributor Author

cholmes commented Jan 26, 2021

@kylebarron if you want to work on this that'd be awesome.

Though I think this particular extension ends up being pretty minimal. I think the path ahead is:

  • data type is in file extension, so no need to have it here.
  • min / max should go in 'stats' extension - raster stats metadata #906 - as it has uses other than just rendering. No one has started that.
  • This extension retains the overview_max_gsd field. And perhaps just that?

So the main work is probably on the 'stats' extension, getting it written up and into a PR, and making sure the structure it proposes works with the different ways people organize files.

> One question about data range: would it represent the "global" minimum and maximum across the dataset or the "local" minimum and maximum within that specific scene?

In my mind this is mostly per item, so yeah, it's the 'local' one. You might put the global in a 'summary' at the collection level.

@matthewhanson
Collaborator

As extensions now live in the stac-extensions organization, this PR is being closed. Follow instructions here to create a repository for the extension:
https://github.com/stac-extensions/stac-extensions.github.io#adding-a-new-extension

@emmanuelmathot
Collaborator

Progress on this at https://github.com/stac-extensions/raster

9 participants