+ <!-- omit from toc -->
# STAC Task (stac-task)

[![Build Status](https://github.com/stac-utils/stac-task/workflows/CI/badge.svg?branch=main)](https://github.com/stac-utils/stac-task/actions/workflows/continuous-integration.yml)
[![codecov](https://codecov.io/gh/stac-utils/stac-task/branch/main/graph/badge.svg)](https://codecov.io/gh/stac-utils/stac-task)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

+ - [Quickstart for Creating New Tasks](#quickstart-for-creating-new-tasks)
+ - [Task Input](#task-input)
+ - [ProcessDefinition Object](#processdefinition-object)
+ - [UploadOptions Object](#uploadoptions-object)
+ - [path\_template](#path_template)
+ - [collections](#collections)
+ - [tasks](#tasks)
+ - [TaskConfig Object](#taskconfig-object)
+ - [Full Process Definition Example](#full-process-definition-example)
+ - [Migration](#migration)
+ - [0.4.x -\> 0.5.x](#04x---05x)
+ - [Development](#development)
+ - [Contributing](#contributing)
+
This Python library consists of the Task class, which is used to create custom tasks based
on a "STAC In, STAC Out" approach. The Task class acts as a wrapper around custom code and provides
several convenience methods for modifying STAC Items, creating derived Items, and providing a CLI.
@@ -17,7 +32,7 @@ This library is based on a [branch of cirrus-lib](https://github.com/cirrus-geo/
```python
from typing import Any

- from stactask import Task
+ from stactask import Task, DownloadConfig

class MyTask(Task):
    name = "my-task"
@@ -30,7 +45,10 @@ class MyTask(Task):
        item = self.items[0]

        # download a datafile
-         item = self.download_item_assets(item, assets=['data'])
+         item = self.download_item_assets(
+             item,
+             config=DownloadConfig(include=['data'])
+         )

        # operate on the local file to create a new asset
        item = self.upload_item_assets_to_s3(item)
@@ -41,32 +59,32 @@ class MyTask(Task):

## Task Input

- | Field Name | Type | Description |
- | ------------- | ---- | ----------- |
- | type | string | Must be FeatureCollection |
- | features | [Item] | A list of STAC `Item` |
- | process | ProcessDefinition | A Process Definition |
+ | Field Name | Type | Description |
+ | ---------- | ----------------- | ------------------------- |
+ | type | string | Must be FeatureCollection |
+ | features | [Item] | A list of STAC `Item` |
+ | process | ProcessDefinition | A Process Definition |

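As a rough illustration of the three fields in the table above, the sketch below checks that a payload has the expected top-level shape. The function `check_task_input` and the placeholder Item are hypothetical and not part of stac-task; the library's own validation may differ.

```python
from typing import Any


def check_task_input(payload: dict[str, Any]) -> list[str]:
    """Return a list of problems with the payload's top-level shape (empty if none).

    Hypothetical helper for illustration only; not a stac-task API.
    """
    problems: list[str] = []
    if payload.get("type") != "FeatureCollection":
        problems.append("type must be 'FeatureCollection'")
    if not isinstance(payload.get("features"), list):
        problems.append("features must be a list of STAC Items")
    if not isinstance(payload.get("process"), dict):
        problems.append("process must be a ProcessDefinition object")
    return problems


# A minimal payload; the Item here is a placeholder, not a valid STAC Item.
payload = {
    "type": "FeatureCollection",
    "features": [{"type": "Feature", "id": "example-item"}],
    "process": {"description": "example process", "tasks": {}},
}

print(check_task_input(payload))
```
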
### ProcessDefinition Object

A STAC task can be provided additional configuration via the 'process' field in the input
ItemCollection.

- | Field Name | Type | Description |
- | ------------- | ---- | ----------- |
- | description | string | Optional description of the process configuration |
- | upload_options | UploadOptions | Options used when uploading assets to a remote server |
- | tasks | Map<str, Map> | Dictionary of task configurations. A List of [task configurations](#taskconfig-object) is supported for backwards compatibility reasons, but a dictionary should be preferred. |
+ | Field Name | Type | Description |
+ | -------------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
+ | description | string | Optional description of the process configuration |
+ | upload_options | UploadOptions | Options used when uploading assets to a remote server |
+ | tasks | Map<str, Map> | Dictionary of task configurations. A list of [task configurations](#taskconfig-object) is supported for backwards compatibility reasons, but a dictionary should be preferred. |

#### UploadOptions Object

- | Field Name | Type | Description |
- | ------------- | ---- | ----------- |
- | path_template | string | **REQUIRED** A string template for specifying the location of uploaded assets |
- | public_assets | [str] | A list of asset keys that should be marked as public when uploaded |
- | headers | Map<str, str> | A set of key, value headers to send when uploading data to s3 |
- | collections | Map<str, str> | A mapping of output collection name to a JSONPath pattern (for matching Items) |
- | s3_urls | bool | Controls if the final published URLs should be an s3 (s3://*bucket*/*key*) or https URL |
+ | Field Name | Type | Description |
+ | ------------- | ------------- | --------------------------------------------------------------------------------------- |
+ | path_template | string | **REQUIRED** A string template for specifying the location of uploaded assets |
+ | public_assets | [str] | A list of asset keys that should be marked as public when uploaded |
+ | headers | Map<str, str> | A set of key, value headers to send when uploading data to s3 |
+ | collections | Map<str, str> | A mapping of output collection name to a JSONPath pattern (for matching Items) |
+ | s3_urls | bool | Controls if the final published URLs should be an s3 (s3://*bucket*/*key*) or https URL |

##### path_template

@@ -121,10 +139,10 @@ would have `param2=value2` passed. If there were a `task-b` to be run it would n

A Task Configuration contains information for running a specific task.

- | Field Name | Type | Description |
- | ------------- | ---- | ----------- |
- | name | str | **REQUIRED** Name of the task |
- | parameters | Map<str, str> | Dictionary of keyword parameters that will be passed to the Tasks `process` function |
+ | Field Name | Type | Description |
+ | ---------- | ------------- | ------------------------------------------------------------------------------------ |
+ | name | str | **REQUIRED** Name of the task |
+ | parameters | Map<str, str> | Dictionary of keyword parameters that will be passed to the Tasks `process` function |

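The ProcessDefinition table says `tasks` may be either a dictionary or, for backwards compatibility, a list of TaskConfig objects. A minimal sketch of converting the list form into the preferred dictionary form follows. It assumes the dictionary is keyed by task name with each value holding that task's parameters; `tasks_as_dict` is a hypothetical helper, not a stac-task function.

```python
from typing import Any, Union

TaskList = list[dict[str, Any]]
TaskDict = dict[str, dict[str, Any]]


def tasks_as_dict(tasks: Union[TaskDict, TaskList]) -> TaskDict:
    """Normalize the `tasks` field to a dictionary keyed by task name.

    Hypothetical helper: assumes each list entry is a TaskConfig object with a
    required `name` and an optional `parameters` mapping.
    """
    if isinstance(tasks, dict):
        # Already in the preferred form; return a shallow copy.
        return dict(tasks)
    return {config["name"]: dict(config.get("parameters", {})) for config in tasks}


# Legacy list form: one entry per task.
legacy = [
    {"name": "task-a", "parameters": {"param1": "value1"}},
    {"name": "task-b"},
]

print(tasks_as_dict(legacy))
```
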
## Full Process Definition Example

@@ -147,6 +165,83 @@ Process definitions are sometimes called "Payloads":
}
```

+ ## Migration
+
+ ### 0.4.x -> 0.5.x
+
+ In 0.5.0, the previous use of fsspec to download Item Assets has been replaced with
+ the stac-asset library. This has necessitated a change in the parameters
+ that the download methods accept.
+
+ The primary change is that the Task methods `download_item_assets` and
+ `download_items_assets` (items plural) now accept fewer explicit and implicit
+ (kwargs) parameters.
+
+ Previously, the methods looked like:
+
+ ```python
+ def download_item_assets(
+     self,
+     item: Item,
+     path_template: str = "${collection}/${id}",
+     keep_original_filenames: bool = False,
+     **kwargs: Any,
+ ) -> Item:
+ ```
+
+ but now look like:
+
+ ```python
+ def download_item_assets(
+     self,
+     item: Item,
+     path_template: str = "${collection}/${id}",
+     config: Optional[DownloadConfig] = None,
+ ) -> Item:
+ ```
+
+ Similarly, the `asset_io` package methods were previously:
+
+ ```python
+ async def download_item_assets(
+     item: Item,
+     assets: Optional[list[str]] = None,
+     save_item: bool = True,
+     overwrite: bool = False,
+     path_template: str = "${collection}/${id}",
+     absolute_path: bool = False,
+     keep_original_filenames: bool = False,
+     **kwargs: Any,
+ ) -> Item:
+ ```
+
+ and are now:
+
+ ```python
+ async def download_item_assets(
+     item: Item,
+     path_template: str = "${collection}/${id}",
+     config: Optional[DownloadConfig] = None,
+ ) -> Item:
+ ```
+
+ Additionally, `kwargs` keys were previously passed through to fsspec as configuration. The most common
+ parameter was `requester_pays`, which set the Requester Pays flag in AWS S3 requests.
+
+ Many of these parameters can be directly translated into configuration passed in a
+ `DownloadConfig` object, which is just a wrapper over the `stac_asset.Config` object.
+
+ These parameters map to `DownloadConfig` as follows:
+
+ - `assets`: set `include`
+ - `requester_pays`: set `s3_requester_pays` to `True`
+ - `keep_original_filenames`: set `file_name_strategy` to
+   `FileNameStrategy.FILE_NAME` if True or `FileNameStrategy.KEY` if False
+ - `overwrite`: set `overwrite`
+ - `save_item`: no equivalent; the Item is always saved
+ - `absolute_path`: no equivalent. To create or retrieve the Asset hrefs as absolute paths, use either
+   `Item#make_all_asset_hrefs_absolute()` or `Asset#get_absolute_href()`
+
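The mapping in the list above can be sketched as a small translation function. This is an illustrative, standard-library-only sketch: `migrate_download_kwargs` is a hypothetical helper, and it returns the `DownloadConfig` field names with the file name strategy given as a string rather than importing the real enum from stac-asset.

```python
from typing import Any


def migrate_download_kwargs(old: dict[str, Any]) -> dict[str, Any]:
    """Translate 0.4.x download arguments into 0.5.x DownloadConfig field values.

    Hypothetical helper for illustration. The strategy is returned as the name
    of the enum member (a string) so the sketch runs without stac-asset.
    """
    new: dict[str, Any] = {}
    if old.get("assets") is not None:
        new["include"] = list(old["assets"])
    if "requester_pays" in old:
        new["s3_requester_pays"] = bool(old["requester_pays"])
    if "keep_original_filenames" in old:
        new["file_name_strategy"] = (
            "FileNameStrategy.FILE_NAME"
            if old["keep_original_filenames"]
            else "FileNameStrategy.KEY"
        )
    if "overwrite" in old:
        new["overwrite"] = bool(old["overwrite"])
    # `save_item` and `absolute_path` have no equivalent and are dropped.
    return new


old_arguments = {
    "assets": ["data"],
    "requester_pays": True,
    "keep_original_filenames": False,
    "overwrite": True,
    "save_item": True,
    "absolute_path": False,
}

print(migrate_download_kwargs(old_arguments))
```

In real code the resulting values would presumably be passed as keyword arguments to `DownloadConfig`, with the strategy string replaced by the actual `FileNameStrategy` member.
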
## Development

Clone, install in editable mode with development requirements, and install the **pre-commit** hooks: