-
Notifications
You must be signed in to change notification settings - Fork 0
Tweak obs demo docker init image to pull out sbt steps outside docker build #102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThe pull request introduces significant changes to the Docker setup for the Chronon Observability project. It simplifies the Dockerfile by removing the initial build stage, allowing for direct copying of application artifacts from local paths. A new build script ( Changes
Possibly related PRs
Suggested reviewers
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
💤 Files with no reviewable changes (1)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (9)
docker-init/frontend/Dockerfile (2)
3-4
: Consider adding validation for frontend build artifactsThe direct copy of the frontend directory might fail silently if the build artifacts aren't present. Consider adding a validation step:
# Copy the built frontend to the new container +RUN [ -d "/app/frontend/build" ] || (echo "Frontend build artifacts not found. Run ./docker-init/build.sh --frontend first" && exit 1) COPY ./frontend /app/frontend
4-4
: Optimize frontend artifacts copyingConsider being more specific about which files to copy to reduce image size:
-COPY ./frontend /app/frontend +COPY ./frontend/build /app/frontend/build +COPY ./frontend/package.json /app/frontend/docker-init/README.md (3)
17-18
: Improve readability with proper punctuationAdd appropriate commas for better readability:
-The build script will build the relevant Chronon modules (Spark, Frontend, Hub) and then trigger the docker container build. While iterating, you can selectively -build modules you care about to speed your dev loop. +The build script will build the relevant Chronon modules (Spark, Frontend, Hub) and then trigger the docker container build. While iterating, you can selectively +build modules you care about to speed up your dev loop.🧰 Tools
🪛 LanguageTool
[typographical] ~18-~18: Consider inserting a comma for improved readability.
Context: ...ating, you can selectively build modules you care about to speed your dev loop. For ...(MISSING_COMMAS)
[typographical] ~18-~18: After the expression ‘for example’ a comma is usually used.
Context: ... care about to speed your dev loop. For example to only rebuild Spark modules: ```bash ...(COMMA_FOR_EXAMPLE)
17-21
: Add more examples of selective buildsConsider adding examples for other modules to make it clearer:
For example to only rebuild Spark modules: ```bash $ ./docker-init/build.sh --spark
+Other available options:
+bash +$ ./docker-init/build.sh --frontend # Build only frontend +$ ./docker-init/build.sh --hub # Build only hub +
<details> <summary>🧰 Tools</summary> <details> <summary>🪛 LanguageTool</summary> [typographical] ~18-~18: Consider inserting a comma for improved readability. Context: ...ating, you can selectively build modules you care about to speed your dev loop. For ... (MISSING_COMMAS) --- [typographical] ~18-~18: After the expression ‘for example’ a comma is usually used. Context: ... care about to speed your dev loop. For example to only rebuild Spark modules: ```bash ... (COMMA_FOR_EXAMPLE) </details> </details> --- `27-30`: **Add troubleshooting section** Consider adding a troubleshooting section to help users with common issues: ```markdown ## Troubleshooting - If you encounter "Module not found" errors, ensure you've run `./docker-init/build.sh` with the appropriate flags - For permission issues, ensure Docker daemon is running and you have necessary permissions - To clean all build artifacts: `./docker-init/build.sh --clean`
🧰 Tools
🪛 Markdownlint (0.35.0)
29-29: null
Dollar signs used before commands without showing output(MD014, commands-show-output)
docker-init/Dockerfile (2)
Line range hint
8-13
: Consider using multi-stage build for Python dependenciesTo reduce the final image size and potential security vulnerabilities, consider using a multi-stage build for Python dependencies:
-COPY docker-init/requirements.txt . -# Python3.8 to allow for numpy==1.22.0 due to security concern -RUN amazon-linux-extras enable python3.8 -RUN yum clean metadata; yum install -y python38 unzip -RUN pip3.8 install --upgrade pip; pip3.8 install -r requirements.txt -ENV PYSPARK_PYTHON=python3.8 +# Use multi-stage build for Python dependencies +FROM python:3.8-slim as python-deps +COPY docker-init/requirements.txt . +RUN pip install --no-cache-dir -r requirements.txt + +FROM amazoncorretto:17 +COPY --from=python-deps /usr/local/lib/python3.8 /usr/local/lib/python3.8 +ENV PYSPARK_PYTHON=python3.8
27-29
: Consider using environment variables for version numbersTo make version updates easier and consistent:
+ARG SCALA_VERSION=2.12 +ARG PROJECT_VERSION=0.1.0-SNAPSHOT -COPY ./spark/target/scala-2.12/spark-assembly-0.1.0-SNAPSHOT.jar /app/cli/spark.jar -COPY ./cloud_aws/target/scala-2.12/cloud_aws-assembly-0.1.0-SNAPSHOT.jar /app/cli/cloud_aws.jar +COPY ./spark/target/scala-${SCALA_VERSION}/spark-assembly-${PROJECT_VERSION}.jar /app/cli/spark.jar +COPY ./cloud_aws/target/scala-${SCALA_VERSION}/cloud_aws-assembly-${PROJECT_VERSION}.jar /app/cli/cloud_aws.jardocker-init/build.sh (2)
1-12
: Add shell options and enhance usage documentationConsider adding shell options for better error handling and including example commands in the usage.
#!/bin/bash +set -euo pipefail function print_usage() { echo "Usage: $0 [OPTIONS]" echo "Options:" echo " --all Build all modules (Spark, Frontend, Hub)" echo " --clean Clean and build all modules" echo " --spark Build Spark modules only" echo " --frontend Build frontend modules only" echo " --hub Build Hub modules only" echo " -h, --help Show this help message" + echo + echo "Examples:" + echo " $0 --hub # Build only the hub web service" + echo " $0 --spark # Build only the Spark assemblies" + echo " $0 --clean --all # Clean and rebuild everything" }
1-88
: Consider build performance optimizationsTo improve the build system's efficiency, consider implementing:
- Build caching using sbt's built-in caching mechanisms
- Parallel execution where possible
- Selective rebuilds based on changed files
Example improvements:
- Add
.sbtopts
file to configure JVM memory and parallel execution- Implement file watching for development mode
- Consider using sbt-revolver for faster iteration cycles
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
docker-init/Dockerfile
(3 hunks)docker-init/README.md
(1 hunks)docker-init/build.sh
(1 hunks)docker-init/frontend/Dockerfile
(1 hunks)
🧰 Additional context used
🪛 LanguageTool
docker-init/README.md
[typographical] ~18-~18: Consider inserting a comma for improved readability.
Context: ...ating, you can selectively build modules you care about to speed your dev loop. For ...
(MISSING_COMMAS)
[typographical] ~18-~18: After the expression ‘for example’ a comma is usually used.
Context: ... care about to speed your dev loop. For example to only rebuild Spark modules: ```bash ...
(COMMA_FOR_EXAMPLE)
🪛 Markdownlint (0.35.0)
docker-init/README.md
29-29: null
Dollar signs used before commands without showing output
(MD014, commands-show-output)
# Use the bitnami spark image to pull the relevan Spark jars for the CLI | ||
FROM bitnami/spark:3.5.2 AS spark-source | ||
|
||
# Start a new stage for the runtime image | ||
# We expect some build artifacts to be pre-populated by running the docker-init/build.sh (or manually triggered sbt commands) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add validation for required build artifacts
Add checks for required build artifacts to fail fast with clear error messages:
# We expect some build artifacts to be pre-populated by running the docker-init/build.sh (or manually triggered sbt commands)
+ARG REQUIRED_FILES="spark/target/scala-2.12/spark-assembly-0.1.0-SNAPSHOT.jar cloud_aws/target/scala-2.12/cloud_aws-assembly-0.1.0-SNAPSHOT.jar hub/target/universal/hub-0.1.0-SNAPSHOT.zip"
+
+# Add a validation step
+COPY validate_artifacts.sh /
+RUN chmod +x /validate_artifacts.sh
+RUN /validate_artifacts.sh $REQUIRED_FILES
Create a new file validate_artifacts.sh
:
#!/bin/bash
for file in "$@"; do
if [ ! -f "$file" ]; then
echo "Error: Required build artifact not found: $file"
echo "Please run './docker-init/build.sh --all' first"
exit 1
fi
done
# No arguments provided | ||
if [ $# -eq 0 ]; then | ||
print_usage | ||
exit 1 | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add directory validation
Ensure the script is being run from the correct directory by checking for essential project files.
+# Validate working directory
+if [[ ! -f "build.sbt" ]]; then
+ echo "Error: build.sbt not found. Please run this script from the project root directory."
+ exit 1
+fi
+
# No arguments provided
if [ $# -eq 0 ]; then
print_usage
exit 1
fi
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# No arguments provided | |
if [ $# -eq 0 ]; then | |
print_usage | |
exit 1 | |
fi | |
# Validate working directory | |
if [[ ! -f "build.sbt" ]]; then | |
echo "Error: build.sbt not found. Please run this script from the project root directory." | |
exit 1 | |
fi | |
# No arguments provided | |
if [ $# -eq 0 ]; then | |
print_usage | |
exit 1 | |
fi |
# Process command line arguments | ||
while [[ $# -gt 0 ]]; do | ||
case $1 in | ||
--all) | ||
BUILD_ALL=true | ||
shift | ||
;; | ||
--clean) | ||
CLEAN=true | ||
BUILD_ALL=true | ||
shift | ||
;; | ||
--spark) | ||
BUILD_SPARK=true | ||
shift | ||
;; | ||
--frontend) | ||
BUILD_FRONTEND=true | ||
shift | ||
;; | ||
--hub) | ||
BUILD_HUB=true | ||
shift | ||
;; | ||
-h|--help) | ||
print_usage | ||
exit 0 | ||
;; | ||
*) | ||
echo "Unknown option: $1" | ||
print_usage | ||
exit 1 | ||
;; | ||
esac | ||
done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Initialize variables and reconsider --clean behavior
- Variables should be initialized before use
- The --clean option automatically setting BUILD_ALL might be unexpected
+# Initialize variables
+BUILD_ALL=false
+CLEAN=false
+BUILD_SPARK=false
+BUILD_FRONTEND=false
+BUILD_HUB=false
+
# Process command line arguments
while [[ $# -gt 0 ]]; do
case $1 in
--all)
BUILD_ALL=true
shift
;;
--clean)
CLEAN=true
- BUILD_ALL=true
shift
;;
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Process command line arguments | |
while [[ $# -gt 0 ]]; do | |
case $1 in | |
--all) | |
BUILD_ALL=true | |
shift | |
;; | |
--clean) | |
CLEAN=true | |
BUILD_ALL=true | |
shift | |
;; | |
--spark) | |
BUILD_SPARK=true | |
shift | |
;; | |
--frontend) | |
BUILD_FRONTEND=true | |
shift | |
;; | |
--hub) | |
BUILD_HUB=true | |
shift | |
;; | |
-h|--help) | |
print_usage | |
exit 0 | |
;; | |
*) | |
echo "Unknown option: $1" | |
print_usage | |
exit 1 | |
;; | |
esac | |
done | |
# Initialize variables | |
BUILD_ALL=false | |
CLEAN=false | |
BUILD_SPARK=false | |
BUILD_FRONTEND=false | |
BUILD_HUB=false | |
# Process command line arguments | |
while [[ $# -gt 0 ]]; do | |
case $1 in | |
--all) | |
BUILD_ALL=true | |
shift | |
;; | |
--clean) | |
CLEAN=true | |
shift | |
;; | |
--spark) | |
BUILD_SPARK=true | |
shift | |
;; | |
--frontend) | |
BUILD_FRONTEND=true | |
shift | |
;; | |
--hub) | |
BUILD_HUB=true | |
shift | |
;; | |
-h|--help) | |
print_usage | |
exit 0 | |
;; | |
*) | |
echo "Unknown option: $1" | |
print_usage | |
exit 1 | |
;; | |
esac | |
done |
echo "Wrapped up Chronon build. Triggering docker compose..." | ||
docker compose -f docker-init/compose.yaml up --build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance Docker Compose execution
The Docker Compose execution needs several improvements:
- Validate Docker installation
- Validate compose file existence
- Add option to skip Docker Compose
- Add error handling
+# Validate Docker installation
+if ! command -v docker >/dev/null 2>&1; then
+ echo "Error: docker is not installed or not in PATH"
+ exit 1
+fi
+
+# Validate compose file
+COMPOSE_FILE="docker-init/compose.yaml"
+if [ ! -f "$COMPOSE_FILE" ]; then
+ echo "Error: Docker Compose file not found at $COMPOSE_FILE"
+ exit 1
+fi
+
+# Add option to skip docker compose
+if [ "${SKIP_DOCKER:-false}" != "true" ]; then
echo "Wrapped up Chronon build. Triggering docker compose..."
- docker compose -f docker-init/compose.yaml up --build
+ if ! docker compose -f "$COMPOSE_FILE" up --build; then
+ echo "Error: Docker Compose failed"
+ exit 1
+ fi
+else
+ echo "Skipping Docker Compose as SKIP_DOCKER=true"
+fi
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
echo "Wrapped up Chronon build. Triggering docker compose..." | |
docker compose -f docker-init/compose.yaml up --build | |
# Validate Docker installation | |
if ! command -v docker >/dev/null 2>&1; then | |
echo "Error: docker is not installed or not in PATH" | |
exit 1 | |
fi | |
# Validate compose file | |
COMPOSE_FILE="docker-init/compose.yaml" | |
if [ ! -f "$COMPOSE_FILE" ]; then | |
echo "Error: Docker Compose file not found at $COMPOSE_FILE" | |
exit 1 | |
fi | |
# Add option to skip docker compose | |
if [ "${SKIP_DOCKER:-false}" != "true" ]; then | |
echo "Wrapped up Chronon build. Triggering docker compose..." | |
if ! docker compose -f "$COMPOSE_FILE" up --build; then | |
echo "Error: Docker Compose failed" | |
exit 1 | |
fi | |
else | |
echo "Skipping Docker Compose as SKIP_DOCKER=true" | |
fi |
# Clean if requested | ||
if [ "$CLEAN" = true ]; then | ||
echo "Cleaning project..." | ||
sbt clean | ||
fi | ||
|
||
# Build all modules | ||
if [ "$BUILD_ALL" = true ]; then | ||
echo "Building all modules..." | ||
sbt assembly | ||
sbt dist | ||
fi | ||
|
||
# Build Spark modules | ||
if [ "$BUILD_SPARK" = true ]; then | ||
echo "Building Spark modules..." | ||
sbt assembly | ||
fi | ||
|
||
# Build Hub modules | ||
if [ "$BUILD_HUB" = true ]; then | ||
echo "Building Hub distribution..." | ||
sbt dist | ||
fi | ||
|
||
# Build frontend | ||
if [ "$BUILD_FRONTEND" = true ]; then | ||
echo "Building frontend distribution..." | ||
sbt "project frontend" buildFrontend | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Enhance build execution robustness and user feedback
Several improvements needed for the build execution:
- Add error handling for sbt commands
- Avoid duplicate sbt commands
- Validate sbt installation
- Add progress indication
+# Validate sbt installation
+if ! command -v sbt >/dev/null 2>&1; then
+ echo "Error: sbt is not installed or not in PATH"
+ exit 1
+fi
+
+# Helper function for sbt commands with error handling
+run_sbt() {
+ echo "Running: sbt $1"
+ if ! sbt "$1"; then
+ echo "Error: sbt command failed"
+ exit 1
+ fi
+}
+
# Clean if requested
if [ "$CLEAN" = true ]; then
echo "Cleaning project..."
- sbt clean
+ run_sbt clean
fi
+# Track if any build was performed
+BUILD_PERFORMED=false
+
# Build all modules
if [ "$BUILD_ALL" = true ]; then
echo "Building all modules..."
- sbt assembly
- sbt dist
+ run_sbt "assembly; dist"
+ BUILD_PERFORMED=true
fi
# Build Spark modules
-if [ "$BUILD_SPARK" = true ]; then
+if [ "$BUILD_SPARK" = true ] && [ "$BUILD_ALL" = false ]; then
echo "Building Spark modules..."
- sbt assembly
+ run_sbt assembly
+ BUILD_PERFORMED=true
fi
# Build Hub modules
-if [ "$BUILD_HUB" = true ]; then
+if [ "$BUILD_HUB" = true ] && [ "$BUILD_ALL" = false ]; then
echo "Building Hub distribution..."
- sbt dist
+ run_sbt dist
+ BUILD_PERFORMED=true
fi
# Build frontend
-if [ "$BUILD_FRONTEND" = true ]; then
- echo "Building frontend distribution..."
- sbt "project frontend" buildFrontend
+if [ "$BUILD_FRONTEND" = true ] && [ "$BUILD_ALL" = false ]; then
+ echo "Building frontend distribution..."
+ run_sbt "project frontend" buildFrontend
+ BUILD_PERFORMED=true
fi
+
+if [ "$BUILD_PERFORMED" = false ] && [ "$CLEAN" = false ]; then
+ echo "Warning: No build targets were specified"
+ print_usage
+ exit 1
+fi
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
# Clean if requested | |
if [ "$CLEAN" = true ]; then | |
echo "Cleaning project..." | |
sbt clean | |
fi | |
# Build all modules | |
if [ "$BUILD_ALL" = true ]; then | |
echo "Building all modules..." | |
sbt assembly | |
sbt dist | |
fi | |
# Build Spark modules | |
if [ "$BUILD_SPARK" = true ]; then | |
echo "Building Spark modules..." | |
sbt assembly | |
fi | |
# Build Hub modules | |
if [ "$BUILD_HUB" = true ]; then | |
echo "Building Hub distribution..." | |
sbt dist | |
fi | |
# Build frontend | |
if [ "$BUILD_FRONTEND" = true ]; then | |
echo "Building frontend distribution..." | |
sbt "project frontend" buildFrontend | |
fi | |
# Validate sbt installation | |
if ! command -v sbt >/dev/null 2>&1; then | |
echo "Error: sbt is not installed or not in PATH" | |
exit 1 | |
fi | |
# Helper function for sbt commands with error handling | |
run_sbt() { | |
echo "Running: sbt $1" | |
if ! sbt "$1"; then | |
echo "Error: sbt command failed" | |
exit 1 | |
fi | |
} | |
# Clean if requested | |
if [ "$CLEAN" = true ]; then | |
echo "Cleaning project..." | |
run_sbt clean | |
fi | |
# Track if any build was performed | |
BUILD_PERFORMED=false | |
# Build all modules | |
if [ "$BUILD_ALL" = true ]; then | |
echo "Building all modules..." | |
run_sbt "assembly; dist" | |
BUILD_PERFORMED=true | |
fi | |
# Build Spark modules | |
if [ "$BUILD_SPARK" = true ] && [ "$BUILD_ALL" = false ]; then | |
echo "Building Spark modules..." | |
run_sbt assembly | |
BUILD_PERFORMED=true | |
fi | |
# Build Hub modules | |
if [ "$BUILD_HUB" = true ] && [ "$BUILD_ALL" = false ]; then | |
echo "Building Hub distribution..." | |
run_sbt dist | |
BUILD_PERFORMED=true | |
fi | |
# Build frontend | |
if [ "$BUILD_FRONTEND" = true ] && [ "$BUILD_ALL" = false ]; then | |
echo "Building frontend distribution..." | |
run_sbt "project frontend" buildFrontend | |
BUILD_PERFORMED=true | |
fi | |
if [ "$BUILD_PERFORMED" = false ] && [ "$CLEAN" = false ]; then | |
echo "Warning: No build targets were specified" | |
print_usage | |
exit 1 | |
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but will want a second set of 👁️ 's too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Summary
Speed up our local obs iteration flow by pulling all the forced sbt clean + build steps outside the docker compose build flow. We don't need to build the spark assembly, frontend and hub every time - we often just need to build one / two of these. This PR pulls the build piece out of the docker file so that we don't have to do it every time. Instead we wrap the build in a script and invoke the relevant build targets. The docker file copies the relevant artifacts over. This allows us to do things like:
Just build the hub webservice:
Just build the spark assemblies:
Checklist
Summary by CodeRabbit
build.sh
) for easier module building and management.