Commit 90162b6

Encoding is ISO-8859-1 (#12552)
* Encoding is ISO-8859-1
* rename test
* bump
* auto-bump connector version

Co-authored-by: Octavia Squidington III <[email protected]>
1 parent 3d41612 commit 90162b6

6 files changed, +48 -47 lines changed

airbyte-config/init/src/main/resources/seed/source_definitions.yaml

+1 -1
@@ -695,7 +695,7 @@
 - name: Salesforce
   sourceDefinitionId: b117307c-14b6-41aa-9422-947e34922962
   dockerRepository: airbyte/source-salesforce
-  dockerImageTag: 1.0.6
+  dockerImageTag: 1.0.7
   documentationUrl: https://docs.airbyte.io/integrations/sources/salesforce
   icon: salesforce.svg
   sourceType: api

airbyte-config/init/src/main/resources/seed/source_specs.yaml

+1 -1
@@ -7360,7 +7360,7 @@
   supportsNormalization: false
   supportsDBT: false
   supported_destination_sync_modes: []
-- dockerImage: "airbyte/source-salesforce:1.0.6"
+- dockerImage: "airbyte/source-salesforce:1.0.7"
   spec:
     documentationUrl: "https://docs.airbyte.com/integrations/sources/salesforce"
     connectionSpecification:

airbyte-integrations/connectors/source-salesforce/Dockerfile

+1 -1
@@ -13,5 +13,5 @@ RUN pip install .
 
 ENTRYPOINT ["python", "/airbyte/integration_code/main.py"]
 
-LABEL io.airbyte.version=1.0.6
+LABEL io.airbyte.version=1.0.7
 LABEL io.airbyte.name=airbyte/source-salesforce
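
The version bump touches three files (`source_definitions.yaml`, `source_specs.yaml`, and the `Dockerfile`), and all three have to carry the same tag. A minimal sketch of a consistency check follows. It is not part of the commit: `found_versions`, `versions_agree`, and the inline sample strings are hypothetical and only mirror the lines shown in the diffs above.

```python
import re

# Sample fragments that mirror the lines changed in this commit (illustrative only).
DOCKERFILE = "LABEL io.airbyte.version=1.0.7\nLABEL io.airbyte.name=airbyte/source-salesforce\n"
DEFINITIONS = "  dockerRepository: airbyte/source-salesforce\n  dockerImageTag: 1.0.7\n"
SPECS = '- dockerImage: "airbyte/source-salesforce:1.0.7"\n'


def found_versions(dockerfile: str, definitions: str, specs: str,
                   image: str = "airbyte/source-salesforce") -> dict:
    """Extract the connector version from each of the three places it is declared."""
    escaped = re.escape(image)
    matches = {
        "dockerfile": re.search(r"^LABEL io\.airbyte\.version=(\S+)$", dockerfile, re.M),
        "definitions": re.search(
            r"dockerRepository: " + escaped + r"\s+dockerImageTag: (\S+)", definitions
        ),
        "specs": re.search(r'dockerImage: "' + escaped + r':([^"]+)"', specs),
    }
    # A missing declaration is reported as None rather than raising.
    return {name: (m.group(1) if m else None) for name, m in matches.items()}


def versions_agree(versions: dict) -> bool:
    """True only when every location was found and all carry the same version."""
    values = list(versions.values())
    return None not in values and len(set(values)) == 1


print(found_versions(DOCKERFILE, DEFINITIONS, SPECS))
```

With the sample strings above, every location reports `1.0.7`. Changing one of them to `1.0.6` makes `versions_agree` return `False`.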

airbyte-integrations/connectors/source-salesforce/source_salesforce/streams.py

+3 -2
@@ -34,6 +34,7 @@
 class SalesforceStream(HttpStream, ABC):
     page_size = 2000
     transformer = TypeTransformer(TransformConfig.DefaultSchemaNormalization)
+    encoding = "ISO-8859-1"
 
     def __init__(
         self, sf_api: Salesforce, pk: str, stream_name: str, sobject_options: Mapping[str, Any] = None, schema: dict = None, **kwargs
@@ -274,7 +275,7 @@ def download_data(self, url: str, chunk_size: float = 1024) -> os.PathLike:
         with closing(self._send_http_request("GET", f"{url}/results", stream=True)) as response:
             with open(tmp_file, "w") as data_file:
                 for chunk in response.iter_content(chunk_size=chunk_size):
-                    data_file.writelines(self.filter_null_bytes(chunk.decode("utf-8")))
+                    data_file.writelines(self.filter_null_bytes(chunk.decode(self.encoding)))
         # check the file exists
         if os.path.isfile(tmp_file):
             return tmp_file
@@ -288,7 +289,7 @@ def read_with_chunks(self, path: str = None, chunk_size: int = 100) -> Iterable[
         @ chunk_size: int - the number of lines to read at a time, default: 100 lines / time.
         """
         try:
-            with open(path, "r", encoding="utf-8") as data:
+            with open(path, "r", encoding=self.encoding) as data:
                 chunks = pd.read_csv(data, chunksize=chunk_size, iterator=True, dialect="unix")
                 for chunk in chunks:
                     chunk = chunk.replace({nan: None}).to_dict(orient="records")
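
`download_data` decodes each chunk on its own, and one property of the switch is worth spelling out. The commit does not state its motivation, so treat the following as an illustration rather than the author's reasoning. With UTF-8, a multi-byte character that straddles a chunk boundary makes a per-chunk `decode` raise. ISO-8859-1 maps every byte to exactly one character, so per-chunk decoding cannot fail, but bytes that were really UTF-8 come out as mojibake. The names `chunks`, `decode_per_chunk`, and `decode_incrementally`, and the sample payload, are hypothetical.

```python
import codecs
from typing import Iterator

SAMPLE = "id,name\n1,Åsa\n"


def chunks(data: bytes, size: int) -> Iterator[bytes]:
    """Yield fixed-size byte chunks, similar in spirit to response.iter_content()."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def decode_per_chunk(data: bytes, size: int, encoding: str) -> str:
    """Decode each chunk independently, as the loop in download_data does."""
    return "".join(chunk.decode(encoding) for chunk in chunks(data, size))


def decode_incrementally(data: bytes, size: int, encoding: str) -> str:
    """Alternative: an incremental decoder buffers partial characters across chunks."""
    decoder = codecs.getincrementaldecoder(encoding)()
    parts = [decoder.decode(chunk) for chunk in chunks(data, size)]
    parts.append(decoder.decode(b"", final=True))
    return "".join(parts)


utf8_payload = SAMPLE.encode("utf-8")         # "Å" becomes two bytes: C3 85
latin1_payload = SAMPLE.encode("iso-8859-1")  # "Å" is the single byte C5

# A chunk size of 11 splits the two UTF-8 bytes of "Å" across two chunks.
try:
    decode_per_chunk(utf8_payload, 11, "utf-8")
except UnicodeDecodeError as err:
    print("per-chunk utf-8 failed:", err.reason)

print(decode_incrementally(utf8_payload, 11, "utf-8") == SAMPLE)
print(decode_per_chunk(latin1_payload, 11, "iso-8859-1") == SAMPLE)
# Decoding UTF-8 bytes as ISO-8859-1 never raises, but does not recover "Å".
print(decode_per_chunk(utf8_payload, 11, "iso-8859-1") == SAMPLE)
```

Whether ISO-8859-1 is the right choice therefore depends on what the API actually sends. If the payload is UTF-8, an incremental decoder is one way to keep chunked decoding safe.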

airbyte-integrations/connectors/source-salesforce/unit_tests/api_test.py

+4
@@ -525,3 +525,7 @@ def test_convert_to_standard_instance(stream_config, stream_api):
     bulk_stream = generate_stream("Account", stream_config, stream_api)
     rest_stream = bulk_stream.get_standard_instance()
     assert isinstance(rest_stream, IncrementalSalesforceStream)
+
+
+def test_decoding():
+    assert b"0\xe5".decode(SalesforceStream.encoding) == "0å"
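
The new test covers only the `decode` call. In `streams.py` the decoded text is also written to a temporary file with `open(tmp_file, "w")`, which uses the platform's default encoding, and is then read back with `encoding=self.encoding`. A minimal sketch of a round-trip check that pins the encoding on both sides follows. It is not part of the commit: `round_trip` and `ENCODING` are hypothetical names, and `ENCODING` only mirrors the class attribute added in the diff.

```python
import os
import tempfile

ENCODING = "ISO-8859-1"  # assumed to mirror SalesforceStream.encoding from the diff


def round_trip(raw: bytes, encoding: str = ENCODING) -> str:
    """Decode bytes, write them to a temp file, and read them back with one encoding."""
    text = raw.decode(encoding)
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        # newline="" disables newline translation so the text comes back unchanged.
        with open(path, "w", encoding=encoding, newline="") as data_file:
            data_file.write(text)
        with open(path, "r", encoding=encoding, newline="") as data_file:
            return data_file.read()
    finally:
        os.remove(path)


print(round_trip(b"0\xe5") == "0å")
```

The same byte string is rejected by `bytes.decode("utf-8")`, because `0xE5` starts a multi-byte sequence that is never completed. That is the case the test in the diff exercises.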

docs/integrations/sources/salesforce.md

+38 -42
@@ -14,7 +14,6 @@ This page guides you through the process of setting up the Salesforce source con
 
 While you can set up the Salesforce connector using any Salesforce user with read permission, we recommend creating a dedicated read-only user for Airbyte. This allows you to granularly control the data Airbyte can read.
 
-
 To create a dedicated read only Salesforce user:
 
 1. [Log into Salesforce](https://login.salesforce.com/) with an admin account.
@@ -25,12 +24,12 @@ To create a dedicated read only Salesforce user:
 6. Scroll down to the **Standard Object Permissions** and **Custom Object Permissions** and enable the **Read** checkbox for objects that you want to replicate via Airbyte.
 7. Scroll to the top and click **Save**.
 8. On the left side, under Administration, click **Users** > **Users**. The All Users page is displayed. Click **New User**.
-9. Fill out the required fields:
-    1. For License, select **Salesforce**.
-    2. For Profile, select **Airbyte Read Only User**.
+9. Fill out the required fields:
+    1. For License, select **Salesforce**.
+    2. For Profile, select **Airbyte Read Only User**.
     3. For Email, make sure to use an email address that you can access.
 10. Click **Save**.
-11. Copy the Username and keep it accessible.
+11. Copy the Username and keep it accessible.
 12. Log into the email you used above and verify your new Salesforce account user. You'll need to set a password as part of this process. Keep this password accessible.
 
 ## Step 2: Set up Salesforce as a Source in Airbyte
@@ -72,20 +71,17 @@ The Salesforce source connector supports the following sync modes:
 **Incremental Deletes Sync**
 <br/>The Salesforce connector retrieves deleted records from Salesforce. For the streams which support it, a deleted record will be marked with the field `isDeleted=true` value.
 
-
 ## Performance considerations
 
-The Salesforce connector is restricted by Salesforce’s [Daily Rate Limits](https://developer.salesforce.com/docs/atlas.en-us.salesforce_app_limits_cheatsheet.meta/salesforce_app_limits_cheatsheet/salesforce_app_limits_platform_api.htm). The connector syncs data until it hits the daily rate limit, then ends the sync early with success status, and starts the next sync from where it left off. Note that picking up from where it ends will work only for incremental sync, which is why we recommend using the [Incremental Sync - Deduped History](https://docs.airbyte.com/understanding-airbyte/connections/incremental-deduped-history) sync mode.
-
+The Salesforce connector is restricted by Salesforce’s [Daily Rate Limits](https://developer.salesforce.com/docs/atlas.en-us.salesforce_app_limits_cheatsheet.meta/salesforce_app_limits_cheatsheet/salesforce_app_limits_platform_api.htm). The connector syncs data until it hits the daily rate limit, then ends the sync early with success status, and starts the next sync from where it left off. Note that picking up from where it ends will work only for incremental sync, which is why we recommend using the [Incremental Sync - Deduped History](https://docs.airbyte.com/understanding-airbyte/connections/incremental-deduped-history) sync mode.
 
 ## Supported Objects
 
 The Salesforce connector supports reading both Standard Objects and Custom Objects from Salesforce. Each object is read as a separate stream. See a list of all Salesforce Standard Objects [here](https://developer.salesforce.com/docs/atlas.en-us.object_reference.meta/object_reference/sforce_api_objects_list.htm).
 
-
 Airbyte fetches and handles all the possible and available streams dynamically based on:
 
-* If the authenticated Salesforce user has the Role and Permissions to read and fetch objects
+* If the authenticated Salesforce user has the Role and Permissions to read and fetch objects
 
 * If the stream has the queryable property set to true. Airbyte can fetch only queryable streams via the API. If you don’t see your object available via Airbyte, check if it is API-accessible to the Salesforce user you authenticated with in Step 2.
 
@@ -117,37 +113,37 @@ Now that you have set up the Salesforce source connector, check out the followin
 * [Replicate Salesforce data to BigQuery](https://airbyte.com/tutorials/replicate-salesforce-data-to-bigquery)
 * [Replicate Salesforce and Zendesk data to Keen for unified analytics](https://airbyte.com/tutorials/salesforce-zendesk-analytics)
 
-
 ## Changelog
 
-| Version | Date | Pull Request | Subject |
-|:--------|:-----------|:---|:---------------------------------------------------------------------------------------------------------------------------------|
-| 1.0.4 | 2022-04-27 | [12335](https://github.com/airbytehq/airbyte/pull/12335) | Adding fixtures to mock time.sleep for connectors that explicitly sleep |
-| 1.0.3 | 2022-04-04 | [11692](https://github.com/airbytehq/airbyte/pull/11692) | Optimised memory usage for `BULK` API calls |
-| 1.0.2 | 2022-03-01 | [10751](https://github.com/airbytehq/airbyte/pull/10751) | Fix broken link anchor in connector configuration |
-| 1.0.1 | 2022-02-27 | [10679](https://github.com/airbytehq/airbyte/pull/10679) | Reorganize input parameter order on the UI |
-| 1.0.0 | 2022-02-27 | [10516](https://github.com/airbytehq/airbyte/pull/10516) | Speed up schema discovery by using parallelism |
-| 0.1.23 | 2022-02-10 | [10141](https://github.com/airbytehq/airbyte/pull/10141) | Processing of failed jobs |
-| 0.1.22 | 2022-02-02 | [10012](https://github.com/airbytehq/airbyte/pull/10012) | Increase CSV field_size_limit |
-| 0.1.21 | 2022-01-28 | [9499](https://github.com/airbytehq/airbyte/pull/9499) | If a sync reaches daily rate limit it ends the sync early with success status. Read more in `Performance considerations` section |
-| 0.1.20 | 2022-01-26 | [9757](https://github.com/airbytehq/airbyte/pull/9757) | Parse CSV with "unix" dialect |
-| 0.1.19 | 2022-01-25 | [8617](https://github.com/airbytehq/airbyte/pull/8617) | Update connector fields title/description |
-| 0.1.18 | 2022-01-20 | [9478](https://github.com/airbytehq/airbyte/pull/9478) | Add available stream filtering by `queryable` flag |
-| 0.1.17 | 2022-01-19 | [9302](https://github.com/airbytehq/airbyte/pull/9302) | Deprecate API Type parameter |
-| 0.1.16 | 2022-01-18 | [9151](https://github.com/airbytehq/airbyte/pull/9151) | Fix pagination in REST API streams |
-| 0.1.15 | 2022-01-11 | [9409](https://github.com/airbytehq/airbyte/pull/9409) | Correcting the presence of an extra `else` handler in the error handling |
-| 0.1.14 | 2022-01-11 | [9386](https://github.com/airbytehq/airbyte/pull/9386) | Handling 400 error, while `sobject` doesn't support `query` or `queryAll` requests |
-| 0.1.13 | 2022-01-11 | [8797](https://github.com/airbytehq/airbyte/pull/8797) | Switched from authSpecification to advanced_auth in specefication |
-| 0.1.12 | 2021-12-23 | [8871](https://github.com/airbytehq/airbyte/pull/8871) | Fix `examples` for new field in specification |
-| 0.1.11 | 2021-12-23 | [8871](https://github.com/airbytehq/airbyte/pull/8871) | Add the ability to filter streams by user |
-| 0.1.10 | 2021-12-23 | [9005](https://github.com/airbytehq/airbyte/pull/9005) | Handling 400 error when a stream is not queryable |
-| 0.1.9 | 2021-12-07 | [8405](https://github.com/airbytehq/airbyte/pull/8405) | Filter 'null' byte(s) in HTTP responses |
-| 0.1.8 | 2021-11-30 | [8191](https://github.com/airbytehq/airbyte/pull/8191) | Make `start_date` optional and change its format to `YYYY-MM-DD` |
-| 0.1.7 | 2021-11-24 | [8206](https://github.com/airbytehq/airbyte/pull/8206) | Handling 400 error when trying to create a job for sync using Bulk API. |
-| 0.1.6 | 2021-11-16 | [8009](https://github.com/airbytehq/airbyte/pull/8009) | Fix retring of BULK jobs |
-| 0.1.5 | 2021-11-15 | [7885](https://github.com/airbytehq/airbyte/pull/7885) | Add `Transform` for output records |
-| 0.1.4 | 2021-11-09 | [7778](https://github.com/airbytehq/airbyte/pull/7778) | Fix types for `anyType` fields |
-| 0.1.3 | 2021-11-06 | [7592](https://github.com/airbytehq/airbyte/pull/7592) | Fix getting `anyType` fields using BULK API |
-| 0.1.2 | 2021-09-30 | [6438](https://github.com/airbytehq/airbyte/pull/6438) | Annotate Oauth2 flow initialization parameters in connector specification |
-| 0.1.1 | 2021-09-21 | [6209](https://github.com/airbytehq/airbyte/pull/6209) | Fix bug with pagination for BULK API |
-| 0.1.0 | 2021-09-08 | [5619](https://github.com/airbytehq/airbyte/pull/5619) | Salesforce Aitbyte-Native Connector |
+| Version | Date | Pull Request | Subject |
+|:--------|:-----------|:-------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------|
+| 1.0.7 | 2022-04-27 | [12552](https://github.com/airbytehq/airbyte/pull/12552) | Decode responses as ISO-8859-1 instead of utf-8 |
+| 1.0.4 | 2022-04-27 | [12335](https://github.com/airbytehq/airbyte/pull/12335) | Adding fixtures to mock time.sleep for connectors that explicitly sleep |
+| 1.0.3 | 2022-04-04 | [11692](https://github.com/airbytehq/airbyte/pull/11692) | Optimised memory usage for `BULK` API calls |
+| 1.0.2 | 2022-03-01 | [10751](https://github.com/airbytehq/airbyte/pull/10751) | Fix broken link anchor in connector configuration |
+| 1.0.1 | 2022-02-27 | [10679](https://github.com/airbytehq/airbyte/pull/10679) | Reorganize input parameter order on the UI |
+| 1.0.0 | 2022-02-27 | [10516](https://github.com/airbytehq/airbyte/pull/10516) | Speed up schema discovery by using parallelism |
+| 0.1.23 | 2022-02-10 | [10141](https://github.com/airbytehq/airbyte/pull/10141) | Processing of failed jobs |
+| 0.1.22 | 2022-02-02 | [10012](https://github.com/airbytehq/airbyte/pull/10012) | Increase CSV field_size_limit |
+| 0.1.21 | 2022-01-28 | [9499](https://github.com/airbytehq/airbyte/pull/9499) | If a sync reaches daily rate limit it ends the sync early with success status. Read more in `Performance considerations` section |
+| 0.1.20 | 2022-01-26 | [9757](https://github.com/airbytehq/airbyte/pull/9757) | Parse CSV with "unix" dialect |
+| 0.1.19 | 2022-01-25 | [8617](https://github.com/airbytehq/airbyte/pull/8617) | Update connector fields title/description |
+| 0.1.18 | 2022-01-20 | [9478](https://github.com/airbytehq/airbyte/pull/9478) | Add available stream filtering by `queryable` flag |
+| 0.1.17 | 2022-01-19 | [9302](https://github.com/airbytehq/airbyte/pull/9302) | Deprecate API Type parameter |
+| 0.1.16 | 2022-01-18 | [9151](https://github.com/airbytehq/airbyte/pull/9151) | Fix pagination in REST API streams |
+| 0.1.15 | 2022-01-11 | [9409](https://github.com/airbytehq/airbyte/pull/9409) | Correcting the presence of an extra `else` handler in the error handling |
+| 0.1.14 | 2022-01-11 | [9386](https://github.com/airbytehq/airbyte/pull/9386) | Handling 400 error, while `sobject` doesn't support `query` or `queryAll` requests |
+| 0.1.13 | 2022-01-11 | [8797](https://github.com/airbytehq/airbyte/pull/8797) | Switched from authSpecification to advanced_auth in specefication |
+| 0.1.12 | 2021-12-23 | [8871](https://github.com/airbytehq/airbyte/pull/8871) | Fix `examples` for new field in specification |
+| 0.1.11 | 2021-12-23 | [8871](https://github.com/airbytehq/airbyte/pull/8871) | Add the ability to filter streams by user |
+| 0.1.10 | 2021-12-23 | [9005](https://github.com/airbytehq/airbyte/pull/9005) | Handling 400 error when a stream is not queryable |
+| 0.1.9 | 2021-12-07 | [8405](https://github.com/airbytehq/airbyte/pull/8405) | Filter 'null' byte(s) in HTTP responses |
+| 0.1.8 | 2021-11-30 | [8191](https://github.com/airbytehq/airbyte/pull/8191) | Make `start_date` optional and change its format to `YYYY-MM-DD` |
+| 0.1.7 | 2021-11-24 | [8206](https://github.com/airbytehq/airbyte/pull/8206) | Handling 400 error when trying to create a job for sync using Bulk API. |
+| 0.1.6 | 2021-11-16 | [8009](https://github.com/airbytehq/airbyte/pull/8009) | Fix retring of BULK jobs |
+| 0.1.5 | 2021-11-15 | [7885](https://github.com/airbytehq/airbyte/pull/7885) | Add `Transform` for output records |
+| 0.1.4 | 2021-11-09 | [7778](https://github.com/airbytehq/airbyte/pull/7778) | Fix types for `anyType` fields |
+| 0.1.3 | 2021-11-06 | [7592](https://github.com/airbytehq/airbyte/pull/7592) | Fix getting `anyType` fields using BULK API |
+| 0.1.2 | 2021-09-30 | [6438](https://github.com/airbytehq/airbyte/pull/6438) | Annotate Oauth2 flow initialization parameters in connector specification |
+| 0.1.1 | 2021-09-21 | [6209](https://github.com/airbytehq/airbyte/pull/6209) | Fix bug with pagination for BULK API |
+| 0.1.0 | 2021-09-08 | [5619](https://github.com/airbytehq/airbyte/pull/5619) | Salesforce Aitbyte-Native Connector |
