Support aliased Destinations or Sources (for State Caches) #570

Open
Guilherme-B opened this issue Jan 6, 2025 · 3 comments

Comments

@Guilherme-B

The current implementation of the DestinationStreamStateModel assumes a one-to-one relationship between a Source name and a Destination name. This is fine in principle, as long as the Connection only has one instance of a given Source type. However, that is not always the case. Let's imagine we have multiple instances of a Google Sheets Source, and we want to track the state of each one via state_cache when writing to a destination. How can we achieve this?

In Airbyte, we achieve this by aliasing the Sources and Destinations:
Image
Image

If the ORM model wrote state keyed on the aliased Source and Destination/Cache names, we could support multiple Sources and Destinations of the same type and still keep a separate state cache for each of them.
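
To make the collision concrete, here is a minimal sketch that does not use the real DestinationStreamStateModel. The `StateKey` class, its field names, the `write_state` helper, and the connector and alias strings are all assumptions made up for illustration; a plain dict stands in for the state table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateKey:
    """Hypothetical key for one cached stream state (field names are assumptions)."""

    source_name: str
    destination_name: str
    stream_name: str


def write_state(store: dict, key: StateKey, state: dict) -> None:
    """Upsert: a later write with the same key replaces the earlier one."""
    store[key] = state


# Keyed on connector name: both Google Sheets sources produce the same key,
# so the second write overwrites the first.
by_name: dict = {}
write_state(by_name, StateKey("source-google-sheets", "destination-duckdb", "Sheet1"), {"cursor": "budget"})
write_state(by_name, StateKey("source-google-sheets", "destination-duckdb", "Sheet1"), {"cursor": "sales"})

# Keyed on a user-chosen alias: each source instance keeps its own row.
by_alias: dict = {}
write_state(by_alias, StateKey("budget-sheet", "warehouse", "Sheet1"), {"cursor": "budget"})
write_state(by_alias, StateKey("sales-sheet", "warehouse", "Sheet1"), {"cursor": "sales"})

print(len(by_name), len(by_alias))
```

In this sketch `by_name` ends up holding a single entry (the "sales" state), while `by_alias` keeps both.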

@aaronsteers
Contributor

aaronsteers commented Jan 24, 2025

@Guilherme-B Yes, I think this makes a lot of sense. Thanks for logging this.

I could see get_source() and get_destination() accepting an optional 'alias' keyword arg. That would be a fairly straightforward implementation, I think. The cache would then use the connector alias when provided, and fall back to the connector name otherwise. Wdyt?
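
A rough sketch of the fallback described above, under stated assumptions: the `Connector` class, its `state_key` property, and this `get_source` are hypothetical stand-ins written for illustration, not the library's actual classes or signatures.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Connector:
    """Hypothetical stand-in for a source or destination object."""

    name: str
    alias: str | None = None

    @property
    def state_key(self) -> str:
        # Use the alias when provided, otherwise fall back to the connector name.
        return self.alias or self.name


def get_source(name: str, *, alias: str | None = None) -> Connector:
    """Stand-in for the proposed signature; not the real library function."""
    return Connector(name=name, alias=alias)


unaliased = get_source("source-google-sheets")
budget = get_source("source-google-sheets", alias="budget-sheet")
sales = get_source("source-google-sheets", alias="sales-sheet")

print(unaliased.state_key, budget.state_key, sales.state_key)
```

Because the alias is keyword-only and optional, existing calls without it would keep writing state under the connector name, so the change would be backward compatible in this sketch.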

@aaronsteers
Contributor

@Guilherme-B - Do you by any chance have cycles to pick this up? I've added accepting_pull_requests label to indicate I think it is ready to move forward (assuming the spec above sounds approximately correct).

@Guilherme-B
Author

Hey @aaronsteers, I have created a workaround for this by renaming both the source and the destination on the fly. It works because the cache uses those names when writing the state. However, I don't feel like this is a proper solution.

Will investigate the usage of an alias property and create a PR for it!
