When doing HTTP requests, the connector expects the records to be part of the response JSON body. The "Record selector" field of the stream needs to be set to the property of the response object that holds the records.
23
+
When doing HTTP requests, the connector expects the records to be part of the response JSON body. The "Record Selector" component of the stream can be used to configure how records should be extracted from the response body.
24
+
25
+
The Record Selector component contains a few different levers to configure this extraction:
26
+
- Field Path
27
+
- Record Filter
28
+
- Cast Record Fields to Schema Types
29
+
30
+
These will be explained below.
24
31
32
+
### Field Path
33
+
The Field Path feature lets you define a path into the fields of the response to point to the part of the response which should be treated as the record(s).
34
+
35
+
Below are a few different examples of what this can look like depending on the API.
36
+
37
+
#### Top-level key pointing to array
25
38
Very often, the response body contains an array of records along with some supplementary information (for example, metadata for pagination).
26
39
27
40
For example the ["Most popular" NY Times API](https://developer.nytimes.com/docs/most-popular-product/1/overview) returns the following response body:
@@ -50,9 +63,9 @@ For example the ["Most popular" NY Times API](https://developer.nytimes.com/docs
50
63
}`}
51
64
</pre>
52
65
53
-
**Setting the record selector to `results`** selects the array with the actual records, everything else is discarded.
66
+
In this case, **setting the Field Path to `results`** selects the array with the actual records; everything else is discarded.
54
67
55
-
### Nested objects
68
+
#### Nested array
56
69
57
70
In some cases the array of actual records is nested multiple levels deep in the response, like for the ["Archive" NY Times API](https://developer.nytimes.com/docs/archive-product/1/overview):
58
71
@@ -77,9 +90,9 @@ In some cases the array of actual records is nested multiple levels deep in the
77
90
}`}
78
91
</pre>
79
92
80
-
**Setting the record selector needs to be set to "`response`,`docs`"** selects the nested array.
93
+
In this case, **setting the Field Path to `response`,`docs`** selects the nested array.
81
94
82
-
### Root array
95
+
#### Root array
83
96
84
97
In some cases, the response body itself is an array of records, like in the [CoinAPI API](https://docs.coinapi.io/market-data/rest-api/quotes):
85
98
@@ -103,11 +116,11 @@ In some cases, the response body itself is an array of records, like in the [Coi
103
116
<b>{`]`}</b>
104
117
</pre>
105
118
106
-
In this case, **the record selector can be omitted** and the whole response becomes the list of records.
119
+
In this case, **the Field Path can be omitted** and the whole response becomes the list of records.
107
120
108
-
### Single object
121
+
#### Single object
109
122
110
-
Sometimes, there is only one record returned per request from the API. In this case, the record selector can also point to an object instead of an array which will be handled as the only record, like in the case of the [Exchange Rates API](https://exchangeratesapi.io/documentation/#historicalrates):
123
+
Sometimes the API returns only one record per request. In this case, the Field Path can point to an object instead of an array, and that object is handled as the only record, as with the [Exchange Rates API](https://exchangeratesapi.io/documentation/#historicalrates):
111
124
112
125
<pre>
113
126
{`{
@@ -128,11 +141,11 @@ Sometimes, there is only one record returned per request from the API. In this c
128
141
}`}
129
142
</pre>
130
143
131
-
In this case, a record selector of `rates` will yield a single record which contains all the exchange rates in a single object.
144
+
In this case, **setting the Field Path to `rates`** will yield a single record which contains all the exchange rates in a single object.
132
145
133
-
### Fields nested in arrays
146
+
#### Fields nested in arrays
134
147
135
-
In some cases, records are selected in multiple branches of the response object (for example within each item of an array):
148
+
In some cases, records are located in multiple branches of the response object (for example within each item of an array):
136
149
137
150
```
138
151
@@ -153,7 +166,7 @@ In some cases, records are selected in multiple branches of the response object
153
166
154
167
```
155
168
156
-
In this case a record selector with a placeholder `*` selects all children at the current position in the path, in this case **`data`,`*`,`record`** will return the following records:
169
+
A Field Path with a placeholder `*` selects all children at the current position in the path, so in this case **setting Field Path to `data`,`*`,`record`** will return the following records:
157
170
158
171
```
159
172
[
@@ -166,6 +179,87 @@ In this case a record selector with a placeholder `*` selects all children at th
166
179
]
167
180
```
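To make the Field Path cases above concrete, here is a minimal Python sketch of how such a path could be resolved against a parsed response body. It is an illustration only, not the connector's actual implementation; the function name `select_records` and the sample responses in the usage lines are hypothetical.

```python
from typing import Any, List


def select_records(body: Any, field_path: List[str]) -> List[Any]:
    """Resolve a field path against a parsed JSON body (illustrative sketch only)."""
    nodes = [body]
    for key in field_path:
        next_nodes = []
        for node in nodes:
            if key == "*":
                # The placeholder selects every child at the current position.
                if isinstance(node, dict):
                    next_nodes.extend(node.values())
                elif isinstance(node, list):
                    next_nodes.extend(node)
            elif isinstance(node, dict) and key in node:
                next_nodes.append(node[key])
        nodes = next_nodes

    records: List[Any] = []
    for node in nodes:
        if isinstance(node, list):
            records.extend(node)   # an array yields one record per item
        else:
            records.append(node)   # a single object is the only record
    return records


# Hypothetical responses covering the cases described above.
top_level = select_records({"num_results": 1, "results": [{"id": 1}]}, ["results"])
root_array = select_records([{"id": 1}, {"id": 2}], [])
single_object = select_records({"base": "EUR", "rates": {"USD": 1.1}}, ["rates"])
wildcard = select_records(
    {"data": [{"record": {"id": 1}}, {"record": {"id": 2}}]},
    ["data", "*", "record"],
)
```

An empty path corresponds to omitting the Field Path, so the whole response becomes the list of records.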
168
181
182
+
### Record Filter
183
+
In some cases, certain records should be excluded from the final output of the connector, which can be accomplished through the Record Filter feature within the Record Selector component.
184
+
185
+
For example, say your API response looks like this:
186
+
```
[
  {
    "id": 1,
    "status": "pending"
  },
  {
    "id": 2,
    "status": "active"
  },
  {
    "id": 3,
    "status": "expired"
  }
]
```
202
+
and you only want to sync records for which the status is not `expired`.
203
+
204
+
You can accomplish this by setting the Record Filter to `{{ record.status != 'expired' }}`
205
+
206
+
Any records for which this expression evaluates to `true` will be emitted by the connector, and any for which it evaluates to `false` will be excluded from the output.
207
+
208
+
Note that the Record Filter value must be an [interpolated string](/connector-development/config-based/advanced-topics#string-interpolation) with the filtering condition placed inside double curly braces `{{ }}`.
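As a rough mental model, the filter condition behaves like a predicate applied to each record. The sketch below expresses the `expired` example as plain Python rather than as an interpolated string; the names `keep` and `emitted` are hypothetical, and this is not how the connector evaluates the expression internally.

```python
records = [
    {"id": 1, "status": "pending"},
    {"id": 2, "status": "active"},
    {"id": 3, "status": "expired"},
]


def keep(record: dict) -> bool:
    # Plain-Python equivalent of the condition record.status != 'expired'
    return record["status"] != "expired"


# Records for which the condition is true are emitted; the rest are dropped.
emitted = [record for record in records if keep(record)]
```

With the sample response above, only the records with `id` 1 and 2 remain in `emitted`.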
209
+
210
+
### Cast Record Fields to Schema Types
211
+
Sometimes the type of a field in the record is not the desired type. If the existing field type can be simply cast to the desired type, this can be solved by setting the stream's declared schema to the desired type and enabling `Cast Record Fields to Schema Types`.
212
+
213
+
For example, say the API response looks like this:
214
+
```
[
  {
    "street": "Kulas Light",
    "city": "Gwenborough",
    "geo": {
      "lat": "-37.3159",
      "lng": "81.1496"
    }
  },
  {
    "street": "Victor Plains",
    "city": "Wisokyburgh",
    "geo": {
      "lat": "-43.9509",
      "lng": "-34.4618"
    }
  }
]
```
234
+
Notice that the `lat` and `lng` values are strings despite them all being numeric. If you would rather have these fields contain raw number values in your output records, you can do the following:
235
+
- In the Declared Schema tab, disable `Automatically import detected schema`
236
+
- Change the `type` of the `lat` and `lng` fields from `string` to `number`
237
+
- Enable `Cast Record Fields to Schema Types` in the Record Selector component
238
+
239
+
This will cause those fields in the output records to be cast to the type declared in the schema, so the output records will now look like this:
240
+
```
[
  {
    "street": "Kulas Light",
    "city": "Gwenborough",
    "geo": {
      "lat": -37.3159,
      "lng": 81.1496
    }
  },
  {
    "street": "Victor Plains",
    "city": "Wisokyburgh",
    "geo": {
      "lat": -43.9509,
      "lng": -34.4618
    }
  }
]
```
260
+
Note that this casting is performed on a best-effort basis; if you tried to set the `city` field's type to `number` in the schema, for example, it would remain unchanged because those string values cannot be cast to numbers.
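The best-effort behaviour can be sketched in Python as follows. This is an assumption-laden illustration, not the connector's casting code: the helper `cast_to_schema` is hypothetical, it only handles `object` and `number` types, and the schema fragment is a simplified version of what a declared schema might contain.

```python
from typing import Any, Dict


def cast_to_schema(value: Any, schema: Dict[str, Any]) -> Any:
    """Best-effort cast of a value to its declared schema type (sketch only)."""
    declared = schema.get("type")
    if declared == "object" and isinstance(value, dict):
        properties = schema.get("properties", {})
        return {
            key: cast_to_schema(item, properties[key]) if key in properties else item
            for key, item in value.items()
        }
    if declared == "number" and isinstance(value, str):
        try:
            return float(value)
        except ValueError:
            return value  # cannot be cast, so the original value is kept
    return value


# Simplified, hypothetical schema; "city" is deliberately declared as a number.
schema = {
    "type": "object",
    "properties": {
        "city": {"type": "number"},
        "geo": {
            "type": "object",
            "properties": {"lat": {"type": "number"}, "lng": {"type": "number"}},
        },
    },
}
record = {
    "street": "Kulas Light",
    "city": "Gwenborough",
    "geo": {"lat": "-37.3159", "lng": "81.1496"},
}
casted = cast_to_schema(record, schema)
```

Here `lat` and `lng` become numbers, while `city` stays a string because `"Gwenborough"` cannot be parsed as a number.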
261
+
262
+
169
263
## Transformations
170
264
171
265
It is recommended not to change records during the extraction process the connector is performing, but instead to load them into the downstream warehouse unchanged and perform necessary transformations there, in order to stay flexible in what data is required. However, there are some reasons that require modifying the fields of records before they are sent to the warehouse:
@@ -230,7 +324,7 @@ Setting the "Path" of the remove-transformation to `content` removes these field
230
324
}
231
325
```
232
326
233
-
Like in case of the record selector, properties of deeply nested objects can be removed as well by specifying the path of properties to the target field that should be removed.
327
+
As with the record selector's Field Path, properties of deeply nested objects can also be removed by specifying the path of properties to the target field that should be removed.