Development: Add Helios push based lifecycle monitoring with manual DB migration status events #10873

Merged
krusche merged 21 commits into develop from feature/helios-status-updates on Jul 11, 2025

Conversation

egekocabas
Member

@egekocabas egekocabas commented May 18, 2025

Checklist

General

Motivation and Context

Note: Before diving into this PR’s details, please take a moment to read the library’s README

Helios currently pulls each environment’s status; this library flips that to a push model:

  • Services actively send their lifecycle state to one or more Helios endpoints.
  • We cover STARTING_UP, RUNNING heartbeats, SHUTTING_DOWN, and FAILED.
  • By default the starter does NOT watch your database migrations. This PR adds explicit, manual pushes for:
    • DB_MIGRATION_STARTED
    • DB_MIGRATION_FINISHED
    • DB_MIGRATION_FAILED
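
The three manual pushes listed above bracket a migration run. The following is a minimal sketch of that pattern, not the code in this PR: `MigrationStatusClient` is a hypothetical stand-in for the starter's client (only the three `pushDbMigration…` method names come from this description), and `ReportedMigration` is an illustrative wrapper that does not exist in Artemis.

```java
import java.util.Optional;

/** Hypothetical stand-in for the starter's client; only the three method names come from this PR. */
interface MigrationStatusClient {

    void pushDbMigrationStarted();

    void pushDbMigrationFinished();

    void pushDbMigrationFailed();
}

/** Illustrative wrapper (an assumption, not Artemis code) that brackets a migration with status pushes. */
final class ReportedMigration {

    // Empty when the status feature is disabled, so every push becomes a no-op.
    private final Optional<MigrationStatusClient> helios;

    ReportedMigration(Optional<MigrationStatusClient> helios) {
        this.helios = helios;
    }

    void run(Runnable migration) {
        helios.ifPresent(MigrationStatusClient::pushDbMigrationStarted);
        try {
            migration.run();
        }
        catch (RuntimeException e) {
            // Report the failure, then let the original exception propagate unchanged.
            helios.ifPresent(MigrationStatusClient::pushDbMigrationFailed);
            throw e;
        }
        // Only reached when the migration completed without throwing.
        helios.ifPresent(MigrationStatusClient::pushDbMigrationFinished);
    }
}
```

Keeping the client in an `Optional` means the migration code path stays identical whether or not the feature is enabled; with an empty `Optional`, the migration simply runs without any pushes.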

Description

  • Added dependency de.tum.cit.aet:helios-status-spring-starter (~22 KB)

  • New helios.status.* YAML block (see example below).

  • Runtime behaviour

    • BootLifecycleListener inside the starter pushes Spring lifecycle events; HeartbeatScheduler sends RUNNING every 30s.
  • DB migration integration

    • Since Liquibase doesn’t expose events, we inject an Optional<HeliosClient> into DatabaseMigration and LiquibaseConfiguration to call:
      helios.pushDbMigrationStarted();
      // … run migration …
      helios.pushDbMigrationFinished();
      // on exception:
      helios.pushDbMigrationFailed();
  • Example application.yml (this YAML is populated on the nodes as part of the deployment)

helios:
  status:
    enabled: false         # ← master switch. false = do nothing (default)
    environment-name: ""   # ← MUST equal the exact GitHub Actions environment name
    endpoints:             # ← one or more Helios instances to push to
      - url: https://helios.aet.cit.tum.de/api/environments/status   #     instance REST URL (…/api/environments/status)
        secret-key: ${HELIOS_PROD_SECRET_KEY}                        #     repo-specific secret; generated in Helios UI
      # - url: …           #   add a second entry for staging, etc.
| key | required? | purpose & rules |
| --- | --- | --- |
| enabled | no (default false) | Turns the feature on. Keep it false everywhere until the repo has a secret and the environment in Helios is configured for Push Update. |
| environment-name | yes when enabled | Must match the exact name of the GitHub environment (not the display label shown in Helios). Examples: artemis-test1.artemis.cit.tum.de, artemis.tum.de, … |
| endpoints | yes when enabled | At least one entry. url = Helios REST endpoint, ending in /api/environments/status. secret-key = repo-specific token generated in the Helios UI. Add a second entry if you also want to push to staging, etc. |
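
The rules for the helios.status.* keys can be expressed as a small check. This is a sketch under stated assumptions: `Endpoint` and `HeliosStatusSettings` are hypothetical records that mirror the YAML keys, not the starter's real properties classes, and the starter may validate differently or not at all.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical mirror of one `endpoints` entry; not the starter's real properties class. */
record Endpoint(String url, String secretKey) {
}

/** Hypothetical mirror of the `helios.status` block, with the documented rules as a check. */
record HeliosStatusSettings(boolean enabled, String environmentName, List<Endpoint> endpoints) {

    static final String REQUIRED_SUFFIX = "/api/environments/status";

    /** Returns human-readable problems; an empty list means the documented rules are satisfied. */
    List<String> problems() {
        List<String> problems = new ArrayList<>();
        if (!enabled) {
            return problems; // disabled: nothing is pushed, so nothing else is required
        }
        if (environmentName == null || environmentName.isBlank()) {
            problems.add("environment-name must be set when enabled");
        }
        if (endpoints == null || endpoints.isEmpty()) {
            problems.add("at least one endpoint is required when enabled");
            return problems;
        }
        for (Endpoint endpoint : endpoints) {
            if (endpoint.url() == null || !endpoint.url().endsWith(REQUIRED_SUFFIX)) {
                problems.add("endpoint url must end in " + REQUIRED_SUFFIX + ": " + endpoint.url());
            }
            if (endpoint.secretKey() == null || endpoint.secretKey().isBlank()) {
                problems.add("secret-key is missing for " + endpoint.url());
            }
        }
        return problems;
    }
}
```

Note that this check cannot verify the most error-prone rule, namely that environment-name equals the exact GitHub environment name; that still has to be compared by hand.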

Steps for Testing

Prerequisites:

  • 1 user with WRITE access to the Artemis GitHub repository

Steps

  • Open https://helios.aet.cit.tum.de/ and select Artemis.
  • Find this PR and deploy it to one of the test servers.
  • Watch the status badge change while deploying (Shutting down, Migrating DB, Migration finished, Running).
  • Open the test server and verify that it is up and functional.
  • (Optional) Open the test server logs and search for helios.

Screenshots

  • A few points to note:
    • The environment list view in Helios is updated at regular intervals. This means that if multiple status updates are pushed within a short time frame, only the most recent one may be visible. For example, during application startup, the Starting up event might be immediately followed by Migrating DB, and only the latter will appear in the UI.
    • Additionally, the manual dispatch of database-related events may occur too close to the application's startup lifecycle event. Since the library hooks into Spring lifecycle events, it's possible that both events are sent almost simultaneously, causing the Starting up status to be overwritten before it can be displayed.
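
The effect described in these two points can be modelled in a few lines. This is only an illustrative model of "latest status wins between refreshes"; `LatestStatusBoard` is a hypothetical class and says nothing about how Helios actually stores or renders statuses.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative model only: each pushed status overwrites the previous one. */
final class LatestStatusBoard {

    private final AtomicReference<String> latest = new AtomicReference<>("UNKNOWN");

    /** Called whenever a service pushes a status. */
    void push(String status) {
        latest.set(status);
    }

    /** Called by the UI on its refresh interval; statuses overwritten in between are never seen. */
    String poll() {
        return latest.get();
    }
}
```

If STARTING_UP and DB_MIGRATION_STARTED are both pushed before the next `poll()`, the poll returns only DB_MIGRATION_STARTED, which matches the behaviour seen in the environment list view.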
Screen.Recording.2025-05-19.at.22.47.25.mov

Summary by CodeRabbit


  • New Features

    • Added integration with Helios for push-based database migration status updates, providing real-time notifications on migration progress and failures.
  • Chores

    • Updated dependencies and configuration to support Helios integration.

@github-project-automation github-project-automation bot moved this to Work In Progress in Artemis Development May 18, 2025
@egekocabas egekocabas changed the title add helios status updates library Development: Add Helios status updates library May 18, 2025
@github-actions github-actions bot added the config-change Pull requests that change the config in a way that they require a deployment via Ansible. label May 18, 2025
@helios-aet helios-aet bot temporarily deployed to artemis-test7.artemis.cit.tum.de May 18, 2025 23:28 Inactive
@helios-aet helios-aet bot temporarily deployed to artemis-test6.artemis.cit.tum.de May 18, 2025 23:28 Inactive
@egekocabas egekocabas changed the title Development: Add Helios status updates library Development: Add Helios status updates May 19, 2025
@github-actions github-actions bot added server Pull requests that update Java code. (Added Automatically!) core Pull requests that affect the corresponding module labels May 19, 2025
@egekocabas egekocabas linked an issue May 29, 2025 that may be closed by this pull request


@ls1intum ls1intum deleted a comment from github-actions bot May 30, 2025
coderabbitai[bot]
coderabbitai bot previously approved these changes Jul 7, 2025


@helios-aet helios-aet bot temporarily deployed to artemis-test2.artemis.cit.tum.de July 7, 2025 22:45 Inactive
@helios-aet helios-aet bot temporarily deployed to artemis-test4.artemis.cit.tum.de July 7, 2025 22:46 Inactive
@helios-aet helios-aet bot temporarily deployed to artemis-test3.artemis.cit.tum.de July 7, 2025 22:46 Inactive
@helios-aet helios-aet bot temporarily deployed to artemis-test5.artemis.cit.tum.de July 7, 2025 22:48 Inactive
@helios-aet helios-aet bot temporarily deployed to artemis-test6.artemis.cit.tum.de July 7, 2025 22:48 Inactive
@helios-aet helios-aet bot temporarily deployed to artemis-test1.artemis.cit.tum.de July 7, 2025 23:01 Inactive
@egekocabas
Member Author

Thank you everybody for testing the PR 🙏🏻

> Code changes in Artemis are looking very good 👍 I have just noticed that the Helios Status Client depends on okhttp3 which is another external framework. As we like to avoid adding too many different dependencies (which are hard to maintain from a security point of view), I would highly prefer when the Helios Status Client could use Spring's RestClient which is integrated into Spring anyway.

I have refactored the library and removed the okhttp3 dependency. It now uses Spring's RestClient. See the ls1intum/Helios#803 PR for the changes 👍🏻 (I also created a CHANGELOG.md file to keep track of the library changes.)

Tested on TS 1,2,3,4,5,6 👍🏻

@helios-aet helios-aet bot temporarily deployed to artemis-test4.artemis.cit.tum.de July 7, 2025 23:13 Inactive
@helios-aet helios-aet bot temporarily deployed to artemis-test5.artemis.cit.tum.de July 7, 2025 23:13 Inactive

github-actions bot commented Jul 7, 2025

End-to-End (E2E) Test Results Summary

|  | Tests | Passed ☑️ | Skipped ⚠️ | Failed ❌️ | Time ⏱ |
| --- | --- | --- | --- | --- | --- |
| End-to-End (E2E) Test Report | 201 ran | 195 passed | 3 skipped | 3 failed | 57m 8s 172ms |

| Test | Result | Time ⏱ |
| --- | --- | --- |
| End-to-End (E2E) Test Report |  |  |
| e2e/exercise/quiz-exercise/QuizExerciseParticipation.spec.ts |  |  |
| ts.Quiz Exercise Participation › DnD Quiz participation › Student can participate in DnD Quiz | ❌ failure | 2m 2s 988ms |
| e2e/exercise/programming/ProgrammingExerciseParticipation.spec.ts |  |  |
| ts.Programming exercise participation › Programming exercise team participation › Team members make git submissions | ❌ failure | 45s 601ms |
| e2e/exercise/programming/ProgrammingExerciseStaticCodeAnalysis.spec.ts |  |  |
| ts.Static code analysis tests › Configures SCA grading and makes a successful submission with SCA errors | ❌ failure | 2m 9s 14ms |

@helios-aet helios-aet bot temporarily deployed to artemis-test2.artemis.cit.tum.de July 7, 2025 23:16 Inactive
Contributor

@eylulnc eylulnc left a comment

Reapprove manual testing, mentioned states are observed.
Screenshot 2025-07-08 at 01 14 37

Member

@TurkerKoc TurkerKoc left a comment

Reapprove, tested on Helios, mentioned states observed 👍🏻

@egekocabas egekocabas requested a review from krusche July 7, 2025 23:21
@helios-aet helios-aet bot temporarily deployed to artemis-test2.artemis.cit.tum.de July 8, 2025 08:55 Inactive
Contributor

@az108 az108 left a comment

Tested on Helios with TS2, badges appear as described
image

@krusche krusche merged commit c2fbb9c into develop Jul 11, 2025
84 of 87 checks passed
@krusche krusche deleted the feature/helios-status-updates branch July 11, 2025 11:04
@github-project-automation github-project-automation bot moved this from Ready For Review to Merged in Artemis Development Jul 11, 2025
Labels
config-change Pull requests that change the config in a way that they require a deployment via Ansible. core Pull requests that affect the corresponding module ready to merge server Pull requests that update Java code. (Added Automatically!)
Projects
Status: Merged
Development

Successfully merging this pull request may close these issues.

Artemis status updates library