Add support for running fetcher in docker & publishing image #422
Walkthrough
This pull request introduces several new assets for streamlined GCP deployment. A new script checks repository state, builds required JARs via Bazel, and constructs a Docker image using Docker Buildx. Additionally, a dedicated Dockerfile for the fetcher service is provided, along with logging configurations and a startup script that verifies environment variables and selects runtime parameters based on the cloud provider. The changes collectively automate artifact building, container preparation, and service initialization for a robust GCP deployment.
Sequence Diagram(s)
sequenceDiagram
participant U as User
participant S as Publish Script
participant G as Git
participant B as Bazel
participant D as Docker Engine
U->>S: Execute publish_gcp_docker_images.sh
S->>G: Check for uncommitted changes & branch sync
G-->>S: Return branch status
S->>B: Build cloud_gcp_lib_deploy.jar & service_assembly_deploy.jar
B-->>S: Provide JAR artifacts
S->>S: Copy JARs to build_output directory
S->>D: Login and build Docker image (buildx)
D-->>S: Image built successfully
S->>S: Cleanup build_output directory
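The staging and cleanup steps in the diagram above can be sketched roughly as follows. This is a minimal illustration, not the script from this PR: the function names `stage_jars` and `cleanup_stage` are hypothetical, and the only details taken from the walkthrough are that JARs are checked, copied into a `build_output` directory, and removed afterwards.

```shell
#!/bin/sh
set -eu

# stage_jars OUT_DIR JAR...
# Hypothetical helper: verify every jar exists before copying anything,
# so a missing artifact fails the run before the Docker build starts.
stage_jars() {
  out_dir=$1
  shift
  for jar in "$@"; do
    if [ ! -f "$jar" ]; then
      echo "expected jar not found: $jar" >&2
      return 1
    fi
  done
  mkdir -p "$out_dir"
  cp "$@" "$out_dir"/
}

# cleanup_stage OUT_DIR
# Hypothetical helper: remove the staging directory once the image is built.
cleanup_stage() {
  rm -rf "$1"
}
```

Checking all paths before the first copy keeps the staging directory from ending up half populated when one Bazel target did not produce its JAR.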
sequenceDiagram
participant U as User
participant S as Start Script
participant Env as Environment
participant JVM as Java Process
U->>S: Run start.sh
S->>Env: Validate required variables (FETCHER_JAR, STATSD_HOST, FETCHER_PORT)
Env-->>S: Return environment values
S->>S: Determine JAR selection based on USE_AWS flag
S->>JVM: Launch Java process with proper options and parameters
JVM-->>S: Service starts running
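The validation and selection steps in the diagram above might look something like the sketch below. The variable names `FETCHER_JAR`, `STATSD_HOST`, `FETCHER_PORT`, and `USE_AWS` come from the walkthrough; the function names, the `AWS_JAR` and `GCP_JAR` variables, and the default jar paths are assumptions made for illustration and are not taken from `start.sh`.

```shell
#!/bin/sh
set -eu

# require_vars NAME...
# Hypothetical helper: report every unset or empty variable, then fail once.
require_vars() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# select_cloud_jar
# Hypothetical helper: print the cloud-specific jar path based on USE_AWS.
# The default paths below are placeholders, not the image's real layout.
select_cloud_jar() {
  if [ "${USE_AWS:-false}" = "true" ]; then
    echo "${AWS_JAR:-/app/cloud_aws.jar}"
  else
    echo "${GCP_JAR:-/app/cloud_gcp.jar}"
  fi
}
```

Reporting all missing variables in one pass, rather than exiting on the first, tends to save a restart cycle when a container is launched with several `-e` flags omitted.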
Actionable comments posted: 1
🧹 Nitpick comments (1)
distribution/publish_gcp_docker_images.sh (1)
58-64: Quote git substitution in image tag.
-  -t ziplineai/chronon-fetcher:$(git rev-parse --short HEAD) \
+  -t "ziplineai/chronon-fetcher:$(git rev-parse --short HEAD)" \
🧰 Tools
🪛 Shellcheck (0.10.0)
[warning] 61-61: Quote this to prevent word splitting.
(SC2046)
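To see what SC2046 guards against, here is a small self-contained demonstration of word splitting. `fake_rev` is a hypothetical stand-in for `git rev-parse --short HEAD` that deliberately prints a value containing a space; a real short hash would not, so the quoting in the suggestion is defensive rather than a fix for an observed failure.

```shell
#!/bin/sh

# count_args prints how many arguments it received.
count_args() { echo "$#"; }

# Hypothetical stand-in for a command whose output contains whitespace.
fake_rev() { echo "abc 123"; }

# Unquoted: the substitution is split on whitespace into two words.
unquoted=$(count_args -t repo:$(fake_rev))

# Quoted: the substitution stays inside a single argument.
quoted=$(count_args -t "repo:$(fake_rev)")

echo "unquoted: $unquoted, quoted: $quoted"
# prints "unquoted: 3, quoted: 2"
```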
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro (Legacy)
📒 Files selected for processing (4)
- distribution/publish_gcp_docker_images.sh (1 hunks)
- docker/fetcher/Dockerfile (1 hunks)
- docker/fetcher/logging.properties (1 hunks)
- docker/fetcher/start.sh (1 hunks)
✅ Files skipped from review due to trivial changes (1)
- docker/fetcher/logging.properties
🧰 Additional context used
🪛 Shellcheck (0.10.0)
distribution/publish_gcp_docker_images.sh
[warning] 61-61: Quote this to prevent word splitting.
(SC2046)
🪛 Checkov (3.2.334)
docker/fetcher/Dockerfile
[HIGH] 26-28: Ensure that certificate validation isn't disabled with curl
(CKV2_DOCKER_2)
🔇 Additional comments (25)
docker/fetcher/start.sh (5)
1-2: Shebang and error handling OK.
4-11: Env var check is concise.
13-19: AWS/GCP branching looks good.
21-24: JMX opts and startup echo are fine.
25-35: Java command: verify JVM opts order.
distribution/publish_gcp_docker_images.sh (10)
4-7: Uncommitted changes check works.
9-11: Branch name detection is clear.
12-20: Branch sync check is valid.
22-29: Project root setup is straightforward.
30-34: Bazel jar builds are proper.
35-37: JAR path vars set correctly.
38-46: Robust existence checks for jars.
48-54: Copying jars to build_output is clear.
55-57: Docker login step is standard.
66-67: Cleanup command is fine.
docker/fetcher/Dockerfile (10)
1-3: Base image selection is solid.
4-12: ENV vars for jar paths and classes look good.
13-23: Dependency installation is concise.
30-32: Scala env vars configured.
33-33: WORKDIR is set properly.
35-39: App ENV defaults are clear.
40-44: COPY commands appear correct.
46-46: Port env var is fine.
48-52: HEALTHCHECK and log dir setup are solid.
53-54: Launch command invokes start script.
ENV SCALA_VERSION 2.12.18

RUN curl https://downloads.lightbend.com/scala/${SCALA_VERSION}/scala-${SCALA_VERSION}.deb -k -o scala.deb && \
    apt install -y ./scala.deb && \
    rm -rf scala.deb /var/lib/apt/lists/*
Scala setup: remove '-k' to enforce cert validation.
- RUN curl https://downloads.lightbend.com/scala/${SCALA_VERSION}/scala-${SCALA_VERSION}.deb -k -o scala.deb && \
+ RUN curl https://downloads.lightbend.com/scala/${SCALA_VERSION}/scala-${SCALA_VERSION}.deb -o scala.deb && \
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
ENV SCALA_VERSION 2.12.18
RUN curl https://downloads.lightbend.com/scala/${SCALA_VERSION}/scala-${SCALA_VERSION}.deb -k -o scala.deb && \
    apt install -y ./scala.deb && \
    rm -rf scala.deb /var/lib/apt/lists/*

ENV SCALA_VERSION 2.12.18
RUN curl https://downloads.lightbend.com/scala/${SCALA_VERSION}/scala-${SCALA_VERSION}.deb -o scala.deb && \
    apt install -y ./scala.deb && \
    rm -rf scala.deb /var/lib/apt/lists/*
🧰 Tools
🪛 Checkov (3.2.334)
[HIGH] 26-28: Ensure that certificate validation isn't disabled with curl
(CKV2_DOCKER_2)
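A quick local check for this class of issue could be sketched as below. This is a simplified grep-based approximation, not how Checkov implements CKV2_DOCKER_2: the function name `has_insecure_curl` is hypothetical, and the pattern only catches `-k` or `--insecure` written as a standalone flag on the same line as `curl` (combined short flags such as `-kL` would be missed).

```shell
#!/bin/sh

# has_insecure_curl FILE
# Hypothetical helper: succeed if any line passes -k or --insecure to curl
# as a standalone flag within the same command segment.
has_insecure_curl() {
  grep -Eq 'curl[^&|;]*[[:space:]](-k|--insecure)([[:space:]]|$)' "$1"
}
```

For example, `has_insecure_curl docker/fetcher/Dockerfile && echo "curl disables certificate validation"` would flag the original `RUN` line and stay silent once `-k` is removed.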
## Summary

Add support to run the fetcher service in docker. Also add rails to publish to docker hub as a private image - [ziplineai/chronon-fetcher](https://hub.docker.com/repository/docker/ziplineai/chronon-fetcher)

I wasn't able to sort out logback / log4j2 logging as there's a lot of deps messing things up - Vert.x supports JUL configs and that is seemingly working so starting with that for now.

Tested with:
```
docker run -v ~/.config/gcloud/application_default_credentials.json:/gcp/credentials.json \
  -p 9000:9000 \
  -e "GCP_PROJECT_ID=canary-443022" \
  -e "GOOGLE_CLOUD_PROJECT=canary-443022" \
  -e "GCP_BIGTABLE_INSTANCE_ID=zipline-canary-instance" \
  -e "STATSD_HOST=127.0.0.1" \
  -e GOOGLE_APPLICATION_CREDENTIALS=/gcp/credentials.json \
  ziplineai/chronon-fetcher
```
And then you can `curl http://localhost:9000/ping`

On Etsy side just need to swap out the project and bt instance id and then can curl the actual join:
```
curl -X POST http://localhost:9000/v1/fetch/join/search.ranking.v1_web_zipline_cdc_and_beacon_external -H 'Content-Type: application/json' -d '[{"listing_id":"632126370","shop_id":"53908089","shipping_profile_id":"235561688531"}]'
{"results":[{"status":"Success","entityKeys":{"listing_id":"632126370","shop_id":"53908089","shipping_profile_id":"235561688531"},"features":{...
```

## Checklist
- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [X] Integration tested
- [ ] Documentation update

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

- **New Features**
  - Added an automation script that streamlines the container image build and publication process with improved error handling.
  - Introduced a new container configuration that installs essential dependencies, sets environment variables, and incorporates a health check for enhanced reliability.
  - Implemented a robust logging setup that standardizes console and file outputs with log rotation.
  - Provided a startup script for the service that verifies required settings and applies platform-specific options for seamless execution.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->