Description
Connector Name
source-pipedrive
Connector Version
0.1.18
What step the error happened?
During the sync
Relevant information
Airbyte version: 0.44.12
Pipedrive connector version: 0.1.18
We are missing data from Pipedrive, specifically organizations and persons, but the problem likely affects other streams as well.
Below is a step-by-step description showing that more than 500 organizations in Pipedrive were updated after 2023-06-20 03:14:34, but only 300 of them are synced by Airbyte. We strongly suspect that it has something to do with pagination and/or the cursor field.
- Stream state in Airbyte:
{
"streamDescriptor": {
"name": "organizations"
},
"streamState": {
"update_time": "2023-06-20 03:14:34"
}
}
- Call via Postman with since_timestamp=2023-06-20 03:14:34:
https://api.pipedrive.com/v1/recents?since_timestamp=2023-06-20 03:14:34&items=organization&start=0&limit=500
- Response includes 500 organizations and the following metadata:
"additional_data": {
"since_timestamp": "2023-06-20 03:14:34",
"last_timestamp_on_page": "2023-06-20 09:09:29",
"pagination": {
"start": 0,
"limit": 500,
"more_items_in_collection": true,
"next_start": 500
}
}
Thus there are more than 500 organizations. Indeed, calling with start=500&limit=500 returns more organizations.
- Now, when we start the Airbyte Pipedrive connection sync for the organizations stream with the stream state shown above, we get:
2023-06-20 10:07:05 destination > Starting a new buffer for stream pipedrive__organizations (current state: 848 KB in 5 buffers)
2023-06-20 10:07:05 destination > Default schema.
2023-06-20 10:07:06 source > Read 300 records from organizations stream
2023-06-20 10:07:06 source > Marking stream organizations as STOPPED
2023-06-20 10:07:06 source > Finished syncing organizations
...
{
"streamName" : "pipedrive__organizations",
"stats" : {
"bytesCommitted" : 1576429,
"bytesEmitted" : 1576429,
"recordsEmitted" : 300,
"recordsCommitted" : 300
}
}
Thus fewer than the 500+ we found when calling the endpoint ourselves. It is also suspicious that it is exactly 300...
Furthermore, looking at some other streams the numbers are also suspicious:
{
"streamName" : "pipedrive__deals",
"stats" : {
"bytesCommitted" : 88918,
"bytesEmitted" : 88918,
"recordsEmitted" : 50,
"recordsCommitted" : 50
}
}
{
"streamName" : "pipedrive__persons",
"stats" : {
"bytesCommitted" : 234142,
"bytesEmitted" : 234142,
"recordsEmitted" : 100,
"recordsCommitted" : 100
}
}
PS. Note that if you call the recents endpoint with limit=x where x>500, Pipedrive will ignore that value and just use limit=500.
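For reference, here is a minimal Python sketch of the pagination behaviour we would expect from the connector. It is not the connector's actual code. It follows `additional_data.pagination.next_start` until `more_items_in_collection` is false, and clamps `limit` to 500 as described above. The names `iter_recents`, `fetch_page` and `make_fake_fetch` are hypothetical, and the sketch assumes that each response carries its records under a `data` key; the fake fetcher only simulates the metadata shape shown in the Postman response.

```python
from typing import Any, Callable, Dict, Iterator

# Pipedrive silently caps `limit` at 500 on the recents endpoint.
MAX_LIMIT = 500

Page = Dict[str, Any]


def iter_recents(fetch_page: Callable[[int, int], Page],
                 limit: int = MAX_LIMIT) -> Iterator[Any]:
    """Yield items from every page, following pagination.next_start.

    `fetch_page(start, limit)` is a hypothetical callable that returns
    one parsed JSON response for the given offset and page size.
    """
    limit = min(limit, MAX_LIMIT)
    start = 0
    while True:
        page = fetch_page(start, limit)
        # Assumption: records are returned under the "data" key.
        yield from page.get("data") or []
        pagination = (page.get("additional_data") or {}).get("pagination") or {}
        if not pagination.get("more_items_in_collection"):
            break
        start = pagination["next_start"]


def make_fake_fetch(total: int) -> Callable[[int, int], Page]:
    """Build an in-memory stand-in for the API that serves `total` items."""
    def fetch(start: int, limit: int) -> Page:
        end = min(start + limit, total)
        more = end < total
        pagination = {"start": start, "limit": limit,
                      "more_items_in_collection": more}
        if more:
            pagination["next_start"] = end
        return {
            "data": [{"id": i} for i in range(start, end)],
            "additional_data": {"pagination": pagination},
        }
    return fetch


if __name__ == "__main__":
    items = list(iter_recents(make_fake_fetch(1200)))
    print(len(items))
```

With a real `fetch_page` that issues the GET request shown above (passing `start` and `limit` as query parameters), the number of yielded items could be compared directly against the `recordsEmitted` value reported by the sync.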
Relevant log output
No response
Contribute
- Yes, I want to contribute