postgres is a bad datawarehouse - column size limitations #36453

Merged
evantahler merged 1 commit into master from evantahler-patch-1
Mar 25, 2024

Conversation

evantahler
Contributor

No description provided.

vercel bot commented Mar 25, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
airbyte-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Mar 25, 2024 5:08pm

@octavia-squidington-iii added the area/documentation label (Improvements or additions to documentation) Mar 25, 2024
@evantahler changed the title from "Update postgres.md" to "postgres is a bad datawarehouse - column size limitations" Mar 25, 2024
@evantahler marked this pull request as ready for review March 25, 2024 16:55
@evantahler requested a review from gisripa March 25, 2024 16:55
@evantahler enabled auto-merge (squash) March 25, 2024 16:55
@evantahler merged commit 101bd43 into master Mar 25, 2024
31 checks passed
@evantahler deleted the evantahler-patch-1 branch March 25, 2024 17:08
Postgres, while an excellent relational database, is not a data warehouse.

1. Postgres is likely to perform poorly with large data volumes. Even postgres-compatible destinations (e.g. AWS Aurora) are not immune to slowdowns when dealing with large writes or updates over ~500GB. Especially when using normalization with `destination-postgres`, be sure to monitor your database's memory and CPU usage during your syncs. It is possible for your destination to 'lock up', and incur high usage costs with large sync volumes.
2. Postgres column size limitations are likley to cause colisions when used as a destination reciving data from highly-nested and flattened sources.

Suggested change
2. Postgres column size limitations are likley to cause colisions when used as a destination reciving data from highly-nested and flattened sources.
2. Postgres column [name length limitations](https://www.postgresql.org/docs/current/limits.html) are likely to cause collisions when used as a destination receiving data from highly nested and flattened sources, e.g. `{63 byte name}_a` and `{63 byte name}_b` will both be truncated to `{63 byte name}`, which causes Postgres to throw an error that a duplicate column name was specified.
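
To make the truncation collision concrete, here is a minimal sketch in Python. It assumes the default Postgres identifier limit of 63 bytes and models only the truncation step. The helper names `truncate_identifier` and `find_collisions` are hypothetical, and this is not how `destination-postgres` itself generates column names.

```python
from collections import defaultdict

# Default Postgres identifier limit (NAMEDATALEN - 1), in bytes.
MAX_IDENTIFIER_BYTES = 63


def truncate_identifier(name: str, limit: int = MAX_IDENTIFIER_BYTES) -> str:
    """Approximate Postgres truncation: keep at most `limit` UTF-8 bytes."""
    encoded = name.encode("utf-8")
    if len(encoded) <= limit:
        return name
    # errors="ignore" drops a trailing multi-byte character that was cut in half.
    return encoded[:limit].decode("utf-8", errors="ignore")


def find_collisions(column_names):
    """Group column names that map to the same truncated identifier."""
    groups = defaultdict(list)
    for name in column_names:
        groups[truncate_identifier(name)].append(name)
    # Only groups with more than one source name are collisions.
    return {short: names for short, names in groups.items() if len(names) > 1}


prefix = "x" * 63  # stand-in for a 63-byte flattened column name
collisions = find_collisions([prefix + "_a", prefix + "_b", "id"])
```

Here `collisions` has a single entry keyed by the 63-byte prefix, listing both original names. Running a check like this over a source's flattened field names before a sync is one way to spot columns that would clash.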

Added a little context; feel free to reword.

Labels
area/documentation Improvements or additions to documentation

4 participants