Source Salesforce: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc3 in position 8379: unexpected end of data #12372

Closed
@marcosmarxm

Description

Is this your first time deploying Airbyte?: No
OS Version: debian-10
Memory / Disk: 16GB
Deployment: Docker
Airbyte Version: 0.36.1-alpha
Source name/version: Salesforce 1.0.3
Destination name/version: BigQuery 1.1.1
Step: Salesforce fails to sync
Description:
A Salesforce incremental sync ran fine for two weeks until yesterday, when I hit the same error in both production and staging. After retrying, the error resolved itself in staging, but not in production.
I have retried production more than 10 times with different sync modes (incremental, full replication, etc.), but nothing worked. I also reset the incremental sync and started from scratch, but that didn't work either. As a last resort, I deleted the Airbyte connection and all BigQuery data and ran everything from scratch, but the error was still there.
Please note that production and staging fetch from the same Salesforce environment and only write to different BigQuery projects. Also, I have one Airbyte instance for both staging and production.
See the logs below.

logs-529.txt

From Discourse: https://discuss.airbyte.io/t/salesforce-unicodedecodeerror/697

From user in Discourse:

In airbyte-integrations/connectors/source-salesforce/source_salesforce/source.py, line 276, I found that the chunk ends with \xc3, which is only the first byte of the ü character (\xc3\xbc) it should represent. So I think chunking the response and then decoding each chunk as UTF-8 causes the issue.
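
To illustrate the diagnosis above, here is a minimal sketch of how decoding each chunk on its own fails when a multi-byte character straddles a chunk boundary, and how an incremental decoder from Python's standard `codecs` module avoids it. The function names and the sample data are hypothetical and are not taken from the connector's code; this is one possible approach, not necessarily the fix the connector adopted.

```python
import codecs


def decode_chunks_naive(chunks):
    # Decodes every chunk independently; fails if a chunk ends mid-character.
    return [chunk.decode("utf-8") for chunk in chunks]


def decode_chunks_incremental(chunks):
    # The incremental decoder buffers a trailing partial byte sequence
    # and completes it when the next chunk arrives.
    decoder = codecs.getincrementaldecoder("utf-8")()
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:
            yield text
    # final=True raises if the stream itself ends with an incomplete character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Hypothetical sample: "ü" is encoded as b"\xc3\xbc"; split between the two bytes.
data = "Müller".encode("utf-8")
chunks = [data[:2], data[2:]]  # first chunk ends with b"\xc3"

try:
    decode_chunks_naive(chunks)
except UnicodeDecodeError as err:
    print("naive decoding failed:", err.reason)

print("".join(decode_chunks_incremental(chunks)))
```

The same effect can be had by accumulating raw bytes and decoding only once the full response has arrived, at the cost of holding the whole payload in memory.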
