Commit 67a5d9a

Add load_queryables function to pypgstac (#361)
* feat: add load_queryables function and support for collection IDs; update documentation and examples
* refactor: update load_queryables to accept a list for collection IDs; adjust example and tests accordingly
* feat: add delete_missing option to load_queryables; update examples and tests
* docs: update load_queryables documentation to reflect new syntax and delete_missing option
* feat: add index_fields option to load queryables for customizable indexing
* docs: update pypgstac documentation to include index_fields option for customizable indexing
1 parent 905e05b commit 67a5d9a

8 files changed

+993
-1
lines changed

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -11,3 +11,5 @@ src/pypgstac/target
 src/pypgstac/python/pypgstac/*.so
 .vscode
 .ipynb_checkpoints
+.venv
+.pytest_cache

CHANGELOG.md

Lines changed: 7 additions & 0 deletions
@@ -5,6 +5,13 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](http://keepachangelog.com/)
 and this project adheres to [Semantic Versioning](http://semver.org/).
 
+## [Unreleased]
+
+### Added
+
+- Add `load_queryables` function to pypgstac for loading queryables from a JSON file
+- Add support for specifying collection IDs when loading queryables
+
 ## [v0.9.5]
 
 ### Changed

docs/src/pypgstac.md

Lines changed: 74 additions & 0 deletions
@@ -85,6 +85,80 @@ To upsert any records, adding anything new and replacing anything with the same
 pypgstac load items --method upsert
 ```
 
+### Loading Queryables
+
+Queryables are a mechanism that allows clients to discover what terms are available for use when writing filter expressions in a STAC API. The Filter Extension enables clients to filter collections and items based on their properties using the Common Query Language (CQL2).
+
+To load queryables from a JSON file:
+
+```
+pypgstac load_queryables queryables.json
+```
+
+To load queryables for specific collections:
+
+```
+pypgstac load_queryables queryables.json --collection_ids [collection1,collection2]
+```
+
+To load queryables and delete properties not present in the file:
+
+```
+pypgstac load_queryables queryables.json --delete_missing
+```
+
+To load queryables and create indexes only for specific fields:
+
+```
+pypgstac load_queryables queryables.json --index_fields [field1,field2]
+```
+
+By default, no indexes are created when loading queryables. Using the `--index_fields` parameter allows you to selectively create indexes only for fields that require them. Creating too many indexes can degrade database performance, especially for write operations, so it's recommended to only index fields that are frequently used in queries.
+
+When using `--delete_missing` with specific collections, only properties for those collections will be deleted:
+
+```
+pypgstac load_queryables queryables.json --collection_ids [collection1,collection2] --delete_missing
+```
+
+You can combine all parameters as needed:
+
+```
+pypgstac load_queryables queryables.json --collection_ids [collection1,collection2] --delete_missing --index_fields [field1,field2]
+```
+
+The JSON file should follow the queryables schema as described in the [STAC API - Filter Extension](https://github.com/stac-api-extensions/filter#queryables). Here's an example:
+
+```json
+{
+  "$schema": "https://json-schema.org/draft/2019-09/schema",
+  "$id": "https://example.com/stac/queryables",
+  "type": "object",
+  "title": "Queryables for Example STAC API",
+  "description": "Queryable names for the Example STAC API",
+  "properties": {
+    "id": {
+      "description": "Item identifier",
+      "type": "string"
+    },
+    "datetime": {
+      "description": "Datetime",
+      "type": "string",
+      "format": "date-time"
+    },
+    "eo:cloud_cover": {
+      "description": "Cloud cover percentage",
+      "type": "number",
+      "minimum": 0,
+      "maximum": 100
+    }
+  },
+  "additionalProperties": true
+}
+```
+
+The command will extract the properties from the JSON file and create queryables in the database. It will also determine the appropriate property wrapper based on the type of each property and create the necessary indexes.
+
 ### Automated Collection Extent Updates
 
 By setting `pgstac.update_collection_extent` to `true`, a trigger is enabled to automatically adjust the spatial and temporal extents in collections when new items are ingested. This feature, while helpful, may increase overhead within data load transactions. To alleviate performance impact, combining this setting with `pgstac.use_queue` is beneficial. This approach necessitates a separate process, such as a scheduled task via the `pg_cron` extension, to periodically invoke `CALL run_queued_queries();`. Such asynchronous processing ensures efficient transactional performance and updated collection extents.
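
The documentation added in this hunk says the command derives a property wrapper from the type of each property. The sketch below illustrates one way such a mapping could look. It is not the code from this commit: the helper names `infer_wrapper` and `extract_queryables` are hypothetical, and the type-to-wrapper table is an assumption chosen for illustration.

```python
from typing import Any, Dict

# Assumed mapping from a JSON Schema "type" to a wrapper name (illustrative only).
TYPE_TO_WRAPPER = {
    "number": "to_float",
    "integer": "to_int",
    "array": "to_text_array",
}


def infer_wrapper(definition: Dict[str, Any]) -> str:
    """Pick a wrapper name for one property definition (hypothetical helper)."""
    json_type = definition.get("type")
    # JSON Schema also allows a list of types; this sketch treats that as text.
    if not isinstance(json_type, str):
        return "to_text"
    # Strings that carry a date-time format are treated as timestamps.
    if json_type == "string" and definition.get("format") == "date-time":
        return "to_tstz"
    # Any type missing from the table falls back to text.
    return TYPE_TO_WRAPPER.get(json_type, "to_text")


def extract_queryables(schema: Dict[str, Any]) -> Dict[str, str]:
    """Map each name under "properties" to its inferred wrapper."""
    return {
        name: infer_wrapper(definition)
        for name, definition in schema.get("properties", {}).items()
    }


# The three properties from the documentation example above.
schema = {
    "properties": {
        "id": {"type": "string"},
        "datetime": {"type": "string", "format": "date-time"},
        "eo:cloud_cover": {"type": "number", "minimum": 0, "maximum": 100},
    }
}
print(extract_queryables(schema))
```

Checking the `date-time` format before the table lookup keeps timestamps from being handled as plain text, which is the only ordering decision this sketch depends on.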
Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+#!/usr/bin/env python
+"""
+Example script demonstrating how to load queryables into PgSTAC.
+
+This script shows how to use the load_queryables function both from the command line
+and programmatically.
+"""
+
+import sys
+from pathlib import Path
+
+# Add the parent directory to the path so we can import pypgstac
+sys.path.append(str(Path(__file__).parent.parent))
+
+from pypgstac.pypgstac import PgstacCLI
+
+
+def load_for_specific_collections(
+    cli, sample_file, collection_ids, delete_missing=False,
+):
+    """Load queryables for specific collections.
+
+    Args:
+        cli: PgstacCLI instance
+        sample_file: Path to the queryables file
+        collection_ids: List of collection IDs to apply queryables to
+        delete_missing: If True, delete properties not present in the file
+    """
+    cli.load_queryables(
+        str(sample_file), collection_ids=collection_ids, delete_missing=delete_missing,
+    )
+
+
+def main():
+    """Demonstrate loading queryables into PgSTAC."""
+    # Get the path to the sample queryables file
+    sample_file = Path(__file__).parent / "sample_queryables.json"
+
+    # Check if the file exists
+    if not sample_file.exists():
+        return
+
+    # Create a PgstacCLI instance
+    # This will use the standard PostgreSQL environment variables for connection
+    cli = PgstacCLI()
+
+    # Load queryables for all collections
+    cli.load_queryables(str(sample_file))
+
+    # Example of loading for specific collections
+    load_for_specific_collections(cli, sample_file, ["landsat-8", "sentinel-2"])
+
+    # Example of loading queryables with delete_missing=True
+    # This will delete properties not present in the file
+    cli.load_queryables(str(sample_file), delete_missing=True)
+
+    # Example of loading for specific collections with delete_missing=True
+    # This will delete properties not present in the file, but only for the specified collections
+    load_for_specific_collections(
+        cli, sample_file, ["landsat-8", "sentinel-2"], delete_missing=True,
+    )
+
+
+if __name__ == "__main__":
+    main()
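
The example script's docstring mentions command-line use, but its body only makes programmatic calls. As a complement, here is a minimal sketch that assembles the equivalent command line using the flags shown in the documentation hunk (`--collection_ids`, `--delete_missing`, `--index_fields`). The helper names `format_list` and `build_load_queryables_command` are hypothetical and are not part of pypgstac.

```python
from typing import List, Optional, Sequence


def format_list(values: Sequence[str]) -> str:
    """Render values in the bracketed form used in the docs, e.g. [a,b]."""
    return "[" + ",".join(values) + "]"


def build_load_queryables_command(
    queryables_file: str,
    collection_ids: Optional[Sequence[str]] = None,
    delete_missing: bool = False,
    index_fields: Optional[Sequence[str]] = None,
) -> List[str]:
    """Build an argument list for `pypgstac load_queryables` (hypothetical helper)."""
    command = ["pypgstac", "load_queryables", queryables_file]
    # Each optional flag is appended only when the caller asked for it.
    if collection_ids:
        command += ["--collection_ids", format_list(collection_ids)]
    if delete_missing:
        command.append("--delete_missing")
    if index_fields:
        command += ["--index_fields", format_list(index_fields)]
    return command


print(" ".join(build_load_queryables_command(
    "queryables.json",
    collection_ids=["landsat-8", "sentinel-2"],
    delete_missing=True,
)))
```

Returning a list of arguments instead of a single string means the result can be passed to `subprocess.run` without a shell, so the square brackets are never interpreted as a shell pattern.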
Lines changed: 79 additions & 0 deletions
@@ -0,0 +1,79 @@
+{
+  "$schema": "https://json-schema.org/draft/2019-09/schema",
+  "$id": "https://example.com/stac/queryables",
+  "type": "object",
+  "title": "Queryables for Example STAC API",
+  "description": "Queryable names for the Example STAC API",
+  "properties": {
+    "id": {
+      "description": "Item identifier",
+      "type": "string"
+    },
+    "collection": {
+      "description": "Collection identifier",
+      "type": "string"
+    },
+    "datetime": {
+      "description": "Datetime",
+      "type": "string",
+      "format": "date-time"
+    },
+    "geometry": {
+      "description": "Geometry",
+      "type": "object"
+    },
+    "eo:cloud_cover": {
+      "description": "Cloud cover percentage",
+      "type": "number",
+      "minimum": 0,
+      "maximum": 100
+    },
+    "platform": {
+      "description": "Platform name",
+      "type": "string",
+      "enum": ["landsat-8", "sentinel-2"]
+    },
+    "instrument": {
+      "description": "Instrument name",
+      "type": "string"
+    },
+    "gsd": {
+      "description": "Ground sample distance in meters",
+      "type": "number"
+    },
+    "view:off_nadir": {
+      "description": "Off-nadir angle in degrees",
+      "type": "number"
+    },
+    "view:sun_azimuth": {
+      "description": "Sun azimuth angle in degrees",
+      "type": "number"
+    },
+    "view:sun_elevation": {
+      "description": "Sun elevation angle in degrees",
+      "type": "number"
+    },
+    "sci:doi": {
+      "description": "Digital Object Identifier",
+      "type": "string"
+    },
+    "created": {
+      "description": "Date and time the item was created",
+      "type": "string",
+      "format": "date-time"
+    },
+    "updated": {
+      "description": "Date and time the item was last updated",
+      "type": "string",
+      "format": "date-time"
+    },
+    "landcover:classes": {
+      "description": "Land cover classes",
+      "type": "array",
+      "items": {
+        "type": "string"
+      }
+    }
+  },
+  "additionalProperties": true
+}
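
Before passing `--index_fields`, it can be useful to confirm that every requested field is actually defined in the queryables file. The sketch below does that check against a small inline document shaped like the sample file above. The helper names `property_names` and `unknown_index_fields` are hypothetical; pypgstac is not confirmed to perform this validation itself.

```python
import json
from typing import List, Sequence


def property_names(queryables_text: str) -> List[str]:
    """Return the names declared under "properties" in a queryables document."""
    document = json.loads(queryables_text)
    return list(document.get("properties", {}))


def unknown_index_fields(queryables_text: str, index_fields: Sequence[str]) -> List[str]:
    """Return the requested index fields that the document does not define."""
    known = set(property_names(queryables_text))
    return [field for field in index_fields if field not in known]


# A trimmed-down document with the same shape as the sample file.
sample = json.dumps({
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "eo:cloud_cover": {"type": "number"},
        "platform": {"type": "string"},
    },
})

missing = unknown_index_fields(sample, ["eo:cloud_cover", "gsd"])
if missing:
    print("Not defined in the queryables file:", ", ".join(missing))
```

To check a real file, read it with `Path(...).read_text()` and pass the result in place of `sample`.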
