-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Source PostHog: Fix events stream pagination #44016
Source PostHog: Fix events stream pagination #44016
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
|
type: RequestPath | ||
$parameters: | ||
url_base: "#/definitions/requester/url_base" | ||
type: RequestOption |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't particularly mind, but generally cursor should be better than offset/increment. Can you link me to posthog docs explaining whether that stream does / does not have the cursor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hypothetically, the response.next should be there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok I understand the change and it doesn't break any new tests.
type: RequestPath | ||
$parameters: | ||
url_base: "#/definitions/requester/url_base" | ||
type: RequestOption |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hypothetically, the response.next should be there.
type: RequestPath | ||
$parameters: | ||
url_base: "#/definitions/requester/url_base" | ||
type: RequestOption |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see.
@airbytehq/dev-tooling tagging you to check out a PR where regression tests can't run with "NotGitRepo" — I think that's because it's from a fork? |
What
This PR contains two changes:
events
, that was raised through issue [source-posthog] Connector historical sync #33915. Indeed, the stream was using a cursor based pagination. However, the pagination in PostHog API is broken (see issue): the cursor (fieldnext
in API response) returns the same URL that was called, causing Airbyte stream to loop infinitely over the same page. There is no plan on PostHog end to fix this as the API endpoint is deprecated and kept only for backward compatibility (see documentation).README.md
according to the poetry update.How
For the issue on the
events
stream, there is a possible workaround, by using an offset based pagination. PostHog's API haslimit
andoffset
query parameters that are working. Once the pagination returns no results, the result isnull
.Therefore I updated the
manifest.yaml
keydefinitions.events_stream.retriever.paginator.pagination_strategy
to anOffsetIncrement
.Review guide
airbyte-integrations/connectors/source-posthog/source_posthog/manifest.yaml
airbyte-integrations/connectors/source-posthog/README.md
docs/integrations/sources/posthog.md
User Impact
Instead of an infinite loop and a failing stream, the user will get all events.
Can this PR be safely reverted and rolled back?