CDK: Add support for streams with state attribute (state v2) #9723

Closed
keu opened this issue Jan 23, 2022 · 0 comments · Fixed by #9746
Labels: CDK (Connector Development Kit), needs-triage, type/enhancement (New feature or request)


keu commented Jan 23, 2022

Tell us about the problem you're trying to solve

Facebook Marketing is a good example of why the state should be an explicit attribute of the stream.
An incremental stream is a cursor slicing through the remote data, so it is wrong to assume that the state (the cursor position) depends only on the latest record read. We should therefore use explicit state getter and setter constructions (the approach we used in base python before introducing the CDK). This allows the state to be set explicitly before any read operation; for example, stream_slices depends on the state value, because in FB Marketing we save the state for each slice individually.
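
To illustrate why stream_slices needs the state up front, here is a minimal sketch in plain Python. It does not use the CDK; the class name `SlicedStream`, the slice ids, and the per-slice state layout are assumptions made for the example, not the FB Marketing implementation.

```python
from datetime import date


class SlicedStream:
    """Hypothetical stream that keeps one cursor value per slice."""

    def __init__(self, slice_ids, start_date):
        self._slice_ids = list(slice_ids)
        self._start_date = start_date
        self._state = {}  # slice id -> last synced date (ISO string)

    @property
    def state(self):
        return dict(self._state)

    @state.setter
    def state(self, value):
        self._state = dict(value or {})

    def stream_slices(self):
        # Each slice resumes from its own saved cursor, so the state
        # has to be set before this method is called.
        for slice_id in self._slice_ids:
            saved = self._state.get(slice_id)
            since = date.fromisoformat(saved) if saved else self._start_date
            yield {"id": slice_id, "since": since}


stream = SlicedStream(["act_1", "act_2"], start_date=date(2022, 1, 1))
stream.state = {"act_1": "2022-01-20"}
slices = list(stream.stream_slices())
```

With the state set first, the slice for `act_1` resumes from its saved date, while `act_2` falls back to the start date.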

Another example:
Say we have a reversed incremental stream that updates its state only once, at the end of reading.
Because get_updated_state is only called with the latest record, there is no way to update the state AFTER the last record has been emitted.
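
A rough sketch of such a stream, again in plain Python rather than the CDK. `ReversedStream`, its in-memory record list, and the string-comparable `updated_at` cursor are all assumptions for illustration. The point is that the state changes only after the generator is exhausted, which a per-record hook cannot express.

```python
class ReversedStream:
    """Hypothetical stream whose API returns the newest records first."""

    cursor_field = "updated_at"

    def __init__(self, records):
        self._records = records  # assumed sorted newest -> oldest
        self._state = {}

    @property
    def state(self):
        return dict(self._state)

    @state.setter
    def state(self, value):
        self._state = dict(value or {})

    def read_records(self):
        previous = self._state.get(self.cursor_field, "")
        newest_seen = previous
        for record in self._records:
            if record[self.cursor_field] <= previous:
                break  # everything older was synced already
            newest_seen = max(newest_seen, record[self.cursor_field])
            yield record
        # Reached only once the consumer has drained the generator,
        # i.e. after the last record was emitted.
        if newest_seen:
            self._state = {self.cursor_field: newest_seen}


records = [
    {"updated_at": "2022-01-03"},
    {"updated_at": "2022-01-02"},
    {"updated_at": "2022-01-01"},
]
stream = ReversedStream(records)
stream.state = {"updated_at": "2022-01-01"}
reader = stream.read_records()
first = next(reader)
state_mid_read = stream.state
rest = list(reader)
```

Saving the cursor of the first (newest) record mid-read would be unsafe here: if the sync were interrupted, the older unread records would be skipped on the next run.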

Describe the solution you’d like

class MyStream:

    @property
    def state(self):
        return {self.cursor_value: "some_date", ...}

    @state.setter
    def state(self, value):
        self._state = parse(value, ...)

In the source:

    if hasattr(stream, "state"):
        stream.state = stream_state_value

    ...
    state_checkpoint(stream.state)
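
The two snippets above can be combined into a runnable sketch. Everything here is hypothetical and independent of the CDK: `FakeStream`, `read_stream`, and the `checkpoint` callback stand in for the real stream, the source read loop, and state_checkpoint.

```python
class FakeStream:
    """Hypothetical incremental stream exposing an explicit state attribute."""

    name = "ads"
    cursor_field = "updated_at"

    def __init__(self):
        self._state = {}

    @property
    def state(self):
        return dict(self._state)

    @state.setter
    def state(self, value):
        self._state = dict(value or {})

    def read_records(self):
        # The stream decides when its own cursor moves.
        for day in ("2022-01-21", "2022-01-22"):
            self._state = {self.cursor_field: day}
            yield {"id": day, self.cursor_field: day}


def read_stream(stream, saved_state, checkpoint):
    """Restore the state, read, then checkpoint whatever the stream reports."""
    if hasattr(stream, "state"):
        stream.state = saved_state.get(stream.name, {})
    records = list(stream.read_records())
    if hasattr(stream, "state"):
        checkpoint(stream.name, stream.state)
    return records


checkpoints = []
records = read_stream(
    FakeStream(),
    saved_state={"ads": {"updated_at": "2022-01-20"}},
    checkpoint=lambda name, state: checkpoints.append((name, state)),
)
```

The source never inspects records to work out the cursor; it only sets the state before reading and asks the stream for it when checkpointing.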

Later, the state attribute could also be used to detect unambiguously whether a stream is incremental (the current way is to check cursor_field, which is rather awkward).
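
As a small sketch of that detection, assuming full-refresh streams simply do not define the attribute (the class names and the `is_incremental` helper are made up for the example):

```python
class FullRefreshStream:
    """Hypothetical stream with no cursor and no state attribute."""

    cursor_field = []


class IncrementalStream:
    """Hypothetical stream that declares an explicit state attribute."""

    cursor_field = "updated_at"

    def __init__(self):
        self._state = {}

    @property
    def state(self):
        return self._state

    @state.setter
    def state(self, value):
        self._state = value


def is_incremental(stream):
    # The presence of the attribute is the signal, not the cursor_field value.
    return hasattr(stream, "state")


incremental = is_incremental(IncrementalStream())
full_refresh = is_incremental(FullRefreshStream())
```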

keu added the type/enhancement, CDK, and needs-triage labels on Jan 23, 2022
keu linked a pull request on Jan 24, 2022 that will close this issue
keu closed this as completed in #9746 on Feb 16, 2022