Commit 4251fca

Upgrade Quickstart And Push to Cloud (#116)
## Summary

Updates versions to get the Quickstart image working, and pushes it to our cloud repos for canary testing.

## Checklist

- [ ] Added Unit Tests
- [ ] Covered by existing CI
- [x] Integration tested
- [ ] Documentation update

## Summary by CodeRabbit

- **New Features**
  - Updated the Dockerfile to use a newer base image and modified the JAR file paths for building the application.
  - Upgraded the Spark version in the `main` service of the docker-compose configuration.
  - Introduced a new GitHub Actions workflow to automate the process of pushing Docker images to AWS and GCP.
- **Bug Fixes**
  - Ensured environment variables for Spark and Chronon are correctly retained in the updated configurations.
1 parent c46f767 commit 4251fca

File tree

3 files changed: +203 -11 lines changed

.github/workflows/push_to_canary.yaml

Lines changed: 171 additions & 0 deletions
```yaml
name: Push To Canary

on:
  push:
    branches:
      - 'main'

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

env:
  AWS_ACCOUNT_ID: ${{secrets.AWS_ACCOUNT_ID}}
  AWS_QUICKSTART_REPOSITORY: zipline-ai/canary-quickstart
  AWS_REGION: ${{secrets.AWS_REGION}}
  GCP_PROJECT_ID: ${{secrets.GCP_PROJECT_ID}}
  GCP_REGION: ${{secrets.GCP_REGION}}

jobs:
  push_to_cloud:
    runs-on: ubuntu-latest

    permissions:
      id-token: write
      contents: read

    steps:
      - uses: actions/checkout@v4

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v3

      - name: Setup JDK
        uses: actions/setup-java@v4
        with:
          distribution: corretto
          java-version: 17

      - name: Install Thrift
        env:
          THRIFT_VERSION: 0.21.0
        run: |
          sudo apt-get install automake bison flex g++ git libboost-all-dev libevent-dev libssl-dev libtool make pkg-config && \
          curl -LSs https://archive.apache.org/dist/thrift/${{env.THRIFT_VERSION}}/thrift-${{env.THRIFT_VERSION}}.tar.gz -o thrift-${{env.THRIFT_VERSION}}.tar.gz && \
          tar -xzf thrift-${{env.THRIFT_VERSION}}.tar.gz && \
          cd thrift-${{env.THRIFT_VERSION}} && \
          sudo ./configure --without-python --without-cpp --without-nodejs --without-java --disable-debug --disable-tests --disable-libs && \
          sudo make && \
          sudo make install && \
          cd .. && \
          sudo rm -rf thrift-${{env.THRIFT_VERSION}} thrift-${{env.THRIFT_VERSION}}.tar.gz

      - name: Build SBT Project
        id: sbt-assembly
        run: |
          sbt clean && sbt assembly

      - name: Build AWS Quickstart Image
        id: build-aws-app
        shell: bash
        env:
          USER: root
          SPARK_SUBMIT_PATH: spark-submit
          PYTHONPATH: /srv/chronon
          SPARK_VERSION: 3.1.1
          JOB_MODE: local[*]
          PARALLELISM: 2
          EXECUTOR_MEMORY: 2G
          EXECUTOR_CORES: 4
          DRIVER_MEMORY: 1G
          CHRONON_LOG_TABLE: default.chronon_log_table
          CHRONON_ONLINE_CLASS: ai.chronon.integrations.aws.AwsApiImpl
          AWS_DEFAULT_REGION: ${{env.AWS_REGION}}
          DYNAMO_ENDPOINT: https://dynamodb.${{env.AWS_REGION}}.amazonaws.com
          JAVA_OPTS: "-Xms1g -Xmx1g"
          CLOUD_AWS_JAR: /app/cli/cloud_aws.jar
        run:
          docker build "." -f "./Dockerfile" -t "aws-quickstart-image:latest"

      - name: Build GCP Quickstart Image
        id: build-gcp-app
        shell: bash
        env:
          USER: root
          SPARK_SUBMIT_PATH: spark-submit
          PYTHONPATH: /srv/chronon
          SPARK_VERSION: 3.1.1
          JOB_MODE: local[*]
          PARALLELISM: 2
          EXECUTOR_MEMORY: 2G
          EXECUTOR_CORES: 4
          DRIVER_MEMORY: 1G
          CHRONON_LOG_TABLE: default.chronon_log_table
          CHRONON_ONLINE_CLASS: ai.chronon.integrations.cloud_gcp.GcpApiImpl
          GCP_DEFAULT_REGION: ${{env.GCP_REGION}}
          BIGTABLE_ENDPOINT: https://${{env.GCP_REGION}}-bigtable.googleapis.com
          JAVA_OPTS: "-Xms1g -Xmx1g"
          CLOUD_GCP_JAR: /app/cli/cloud_gcp.jar
        run:
          docker build "." -f "./Dockerfile" -t "gcp-quickstart-image:latest"

      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::${{env.AWS_ACCOUNT_ID}}:role/github-canary-updater
          aws-region: ${{env.AWS_REGION}}

      - name: Login to Amazon ECR
        id: login-ecr
        uses: aws-actions/amazon-ecr-login@v2
        with:
          registries: ${{env.AWS_ACCOUNT_ID}}

      - name: Tag, and push quickstart image to Amazon ECR
        env:
          ECR_REPOSITORY: ${{steps.login-ecr.outputs.registry}}/${{env.AWS_QUICKSTART_REPOSITORY}}
          IMAGE_TAG: main
        shell: bash
        run: |
          set -eo pipefail
          docker tag "aws-quickstart-image:latest" "${{env.ECR_REPOSITORY}}:$IMAGE_TAG"
          docker push "${{env.ECR_REPOSITORY}}:$IMAGE_TAG" || {
            echo "Failed to push canary tag"
            exit 1
          }
          docker tag "${{env.ECR_REPOSITORY}}:$IMAGE_TAG" "${{env.ECR_REPOSITORY}}:${{github.sha}}"
          docker push "${{env.ECR_REPOSITORY}}:${{github.sha}}" || {
            echo "Failed to push sha tag"
            exit 1
          }
          echo "IMAGE $IMAGE_TAG is pushed to ${{env.ECR_REPOSITORY}}"
          echo "image_tag=$IMAGE_TAG"
          echo "full_image=${{env.ECR_REPOSITORY}}:$IMAGE_TAG"

      - name: Configure GCP Credentials
        uses: google-github-actions/auth@v2
        with:
          project_id: ${{env.GCP_PROJECT_ID}}
          workload_identity_provider: projects/703996152583/locations/global/workloadIdentityPools/github-actions/providers/github
          service_account: [email protected]

      - name: Set up Google Cloud SDK
        uses: google-github-actions/setup-gcloud@v2

      - name: Google Cloud Docker Auth
        shell: bash
        run: |-
          gcloud auth configure-docker ${{env.GCP_REGION}}-docker.pkg.dev --quiet

      - name: Push Quickstart to Artifact Registry
        shell: bash
        env:
          GAR_QUICKSTART_REPOSITORY: ${{env.GCP_REGION}}-docker.pkg.dev/${{env.GCP_PROJECT_ID}}/canary-images/quickstart
          IMAGE_TAG: main
        run: |
          set -eo pipefail
          docker tag "gcp-quickstart-image:latest" "${{env.GAR_QUICKSTART_REPOSITORY}}:$IMAGE_TAG"
          docker push "${{env.GAR_QUICKSTART_REPOSITORY}}:$IMAGE_TAG" || {
            echo "Failed to push canary tag"
            exit 1
          }
          docker tag "${{env.GAR_QUICKSTART_REPOSITORY}}:$IMAGE_TAG" "${{env.GAR_QUICKSTART_REPOSITORY}}:${{github.sha}}"
          docker push "${{env.GAR_QUICKSTART_REPOSITORY}}:${{github.sha}}" || {
            echo "Failed to push sha tag"
            exit 1
          }
          echo "IMAGE $IMAGE_TAG is pushed to ${{env.GAR_QUICKSTART_REPOSITORY}}"
          echo "image_tag=$IMAGE_TAG"
          echo "full_image=${{env.GAR_QUICKSTART_REPOSITORY}}:$IMAGE_TAG"
```

Dockerfile

Lines changed: 31 additions & 8 deletions
```diff
@@ -1,9 +1,11 @@
 # Start from a Debian base image
-FROM openjdk:8-jre-slim
+FROM openjdk:17-jdk-slim
 
 # Set this manually before building the image, requires a local build of the jar
 
-ENV CHRONON_JAR_PATH=spark/target-embedded/scala-2.12/your_build.jar
+ENV CHRONON_JAR_PATH=spark/target/scala-2.12/spark-assembly-0.1.0-SNAPSHOT.jar
+ENV CLOUD_AWS_JAR_PATH=cloud_aws/target/scala-2.12/cloud_aws-assembly-0.1.0-SNAPSHOT.jar
+ENV CLOUD_GCP_JAR_PATH=cloud_gcp/target/scala-2.12/cloud_gcp-assembly-0.1.0-SNAPSHOT.jar
 
 # Update package lists and install necessary tools
 RUN apt-get update && apt-get install -y \
@@ -16,8 +18,8 @@ RUN apt-get update && apt-get install -y \
     procps \
     python3-pip
 
-ENV THRIFT_VERSION 0.13.0
-ENV SCALA_VERSION 2.12.12
+ENV THRIFT_VERSION 0.21.0
+ENV SCALA_VERSION 2.12.18
 
 # Install thrift
 RUN curl -sSL "http://archive.apache.org/dist/thrift/$THRIFT_VERSION/thrift-$THRIFT_VERSION.tar.gz" -o thrift.tar.gz \
@@ -43,8 +45,8 @@ ENV PATH=${PATH}:${SCALA_HOME}/bin
 # Optional env variables
 ENV SPARK_HOME=${SPARK_HOME:-"/opt/spark"}
 ENV HADOOP_HOME=${HADOOP_HOME:-"/opt/hadoop"}
-ENV SPARK_VERSION=${SPARK_VERSION:-"3.1.1"}
-ENV HADOOP_VERSION=${HADOOP_VERSION:-"3.2"}
+ENV SPARK_VERSION=${SPARK_VERSION:-"3.5.1"}
+ENV HADOOP_VERSION=${HADOOP_VERSION:-"3"}
 RUN mkdir -p ${HADOOP_HOME} && mkdir -p ${SPARK_HOME}
 RUN mkdir -p /opt/spark/spark-events
 WORKDIR ${SPARK_HOME}
@@ -54,12 +56,10 @@ RUN curl https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SP
     && tar xvzf spark.tgz --directory /opt/spark --strip-components 1 \
     && rm -rf spark.tgz
 
-
 # Install python deps
 COPY quickstart/requirements.txt .
 RUN pip3 install -r requirements.txt
 
-
 ENV PATH="/opt/spark/sbin:/opt/spark/bin:${PATH}"
 ENV SPARK_HOME="/opt/spark"
 
@@ -76,9 +76,32 @@ WORKDIR ${SPARK_HOME}
 WORKDIR /srv/chronon
 
 ENV DRIVER_JAR_PATH="/srv/spark/spark_embedded.jar"
+ENV CLOUD_AWS_JAR=${CLOUD_AWS_JAR:-"/srv/cloud_aws/cloud_aws.jar"}
+ENV CLOUD_GCP_JAR=${CLOUD_GCP_JAR:-"/srv/cloud_gcp/cloud_gcp.jar"}
 
 COPY api/py/test/sample ./
 COPY quickstart/mongo-online-impl /srv/onlineImpl
 COPY $CHRONON_JAR_PATH "$DRIVER_JAR_PATH"
+COPY $CLOUD_AWS_JAR_PATH "$CLOUD_AWS_JAR"
+COPY $CLOUD_GCP_JAR_PATH "$CLOUD_GCP_JAR"
 
 ENV CHRONON_DRIVER_JAR="$DRIVER_JAR_PATH"
+
+ENV SPARK_SUBMIT_OPTS="\
+    -XX:MaxMetaspaceSize=1024m \
+    --add-opens=java.base/java.lang=ALL-UNNAMED \
+    --add-opens=java.base/java.lang.invoke=ALL-UNNAMED \
+    --add-opens=java.base/java.lang.reflect=ALL-UNNAMED \
+    --add-opens=java.base/java.io=ALL-UNNAMED \
+    --add-opens=java.base/java.net=ALL-UNNAMED \
+    --add-opens=java.base/java.nio=ALL-UNNAMED \
+    --add-opens=java.base/java.util=ALL-UNNAMED \
+    --add-opens=java.base/java.util.concurrent=ALL-UNNAMED \
+    --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED \
+    --add-opens=java.base/sun.nio.ch=ALL-UNNAMED \
+    --add-opens=java.base/sun.nio.cs=ALL-UNNAMED \
+    --add-opens=java.base/sun.security.action=ALL-UNNAMED \
+    --add-opens=java.base/sun.util.calendar=ALL-UNNAMED \
+    --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED"
+
+CMD tail -f /dev/null
```
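Building this image outside CI should follow the same sequence the workflow uses: run the sbt assembly first so the jar paths referenced by the new `ENV` defaults exist, then build. A minimal sketch (the `quickstart-image:local` tag is just an example):

```bash
# Rough local equivalent of the CI steps: build the assemblies, verify the jar
# paths referenced by the Dockerfile's ENV defaults exist, then build the image.
sbt clean && sbt assembly

ls spark/target/scala-2.12/spark-assembly-0.1.0-SNAPSHOT.jar \
   cloud_aws/target/scala-2.12/cloud_aws-assembly-0.1.0-SNAPSHOT.jar \
   cloud_gcp/target/scala-2.12/cloud_gcp-assembly-0.1.0-SNAPSHOT.jar

docker build . -f ./Dockerfile -t quickstart-image:local
```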

docker-compose.yml

Lines changed: 1 addition & 3 deletions
```diff
@@ -1,6 +1,4 @@
 # Quickstart Docker containers to run chronon commands with MongoDB as the KV Store.
-version: '3.8'
-
 services:
 
   mongodb:
@@ -44,7 +42,7 @@ services:
       - USER=root
       - SPARK_SUBMIT_PATH=spark-submit
       - PYTHONPATH=/srv/chronon
-      - SPARK_VERSION=3.1.1
+      - SPARK_VERSION=3.5.1
       - JOB_MODE=local[*]
       - PARALLELISM=2
       - EXECUTOR_MEMORY=2G
```
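A quick way to sanity-check the version bump locally, assuming the service is named `main` as described in the summary above:

```bash
# Rough sketch: bring up the quickstart stack and confirm the Spark bump
# inside the `main` service, then tear it down.
docker compose up -d --build
docker compose exec main spark-submit --version   # expected to report Spark 3.5.1
docker compose down
```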

0 commit comments