Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

metadata-service [orchestrator]: improve stale metadata detection #42962

Conversation

alafanechere
Copy link
Contributor

@alafanechere alafanechere commented Aug 2, 2024

What

Closes to https://github.com/airbytehq/airbyte-internal-issues/issues/8985

Problem

Since the orchestrator now adds generated field to metadata file the MD5 hash / etag comparison of master vs GCS does not work anymore.

Solution

  • Read all latest metadata from GCS
  • Read all metadata from Github on our master branch
  • Add a connector to the stale metadata report if its root dockerImageTag on GCS is different from what we have on master for more than 2 hours.

Other changes

The daily frequency (and no message on success) of the stale metadata detection jobs give the false impression that it's not working.
I'd like to increase its frequency (hourly) and make it send a message to a slack even when no stale metadata was detected. This will make sure we have reassuring ping that this alerting is working.

Copy link

vercel bot commented Aug 2, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

1 Skipped Deployment
Name Status Preview Comments Updated (UTC)
airbyte-docs ⬜️ Ignored (Inspect) Visit Preview Aug 9, 2024 6:47am

Copy link
Contributor Author

alafanechere commented Aug 2, 2024

This stack of pull requests is managed by Graphite. Learn more about stacking.

Join @alafanechere and the rest of your teammates on Graphite Graphite

@alafanechere alafanechere marked this pull request as ready for review August 2, 2024 13:12
@alafanechere alafanechere requested review from a team and bnchrch August 2, 2024 13:12
@alafanechere alafanechere force-pushed the augustin/08-02-metadata-service_orchestrator_improve_stale_metadata_detection branch from 6537e9b to cbeab52 Compare August 2, 2024 13:13
@alafanechere alafanechere marked this pull request as draft August 2, 2024 14:54
Base automatically changed from augustin/08-02-metadata-service_orchestrator_fix_unit_tests to master August 2, 2024 23:17
@alafanechere alafanechere force-pushed the augustin/08-02-metadata-service_orchestrator_improve_stale_metadata_detection branch 2 times, most recently from 44186a8 to 6cf4490 Compare August 8, 2024 15:46
@alafanechere alafanechere marked this pull request as ready for review August 8, 2024 18:34
@alafanechere alafanechere force-pushed the augustin/08-02-metadata-service_orchestrator_improve_stale_metadata_detection branch from 6cf4490 to 259a837 Compare August 8, 2024 18:34
@alafanechere alafanechere force-pushed the augustin/08-02-metadata-service_orchestrator_improve_stale_metadata_detection branch from 259a837 to 5f21fc9 Compare August 8, 2024 18:37
Copy link
Contributor

@natikgadzhi natikgadzhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't tested, but the code looks solid

@@ -189,7 +189,7 @@
),
ScheduleDefinition(job=generate_connector_test_summary_reports, cron_schedule="@hourly"),
ScheduleDefinition(
cron_schedule="0 8 * * *", # Daily at 8am US/Pacific
cron_schedule="0 * * * *", # Every hour
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is nice!

@@ -1,6 +1,6 @@
[tool.poetry]
name = "orchestrator"
version = "0.1.4"
version = "0.1.5"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: That's prob a 0.2 at least in my book.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I consider it's a fix of something which was working before, this is why I made a patch bump.

@@ -130,7 +130,7 @@ def metadata_definitions(context: OpExecutionContext) -> List[LatestMetadataEntr
metadata_entry = LatestMetadataEntry(
metadata_definition=metadata_def,
icon_url=icon_url,
last_modified=blob.last_modified,
last_modified=blob.updated.isoformat(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious, non-blocking: is this for formatting purposes mostly here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the LastMetadataEntry.last_modified attribute takes a string - I did not bother changing it to a datetime.

latest_versions_on_gcs = {
metadata_entry.metadata_definition.data.dockerRepository: metadata_entry.metadata_definition.data.dockerImageTag
for metadata_entry in metadata_definitions
if metadata_entry.metadata_definition.data.supportLevel != "archived"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice touch with archived connectors.

@alafanechere alafanechere force-pushed the augustin/08-02-metadata-service_orchestrator_improve_stale_metadata_detection branch from 5f21fc9 to 71e2ef6 Compare August 9, 2024 06:47
@alafanechere alafanechere enabled auto-merge (squash) August 9, 2024 06:48
@alafanechere alafanechere merged commit 6a6e54b into master Aug 9, 2024
31 checks passed
@alafanechere alafanechere deleted the augustin/08-02-metadata-service_orchestrator_improve_stale_metadata_detection branch August 9, 2024 06:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants