-
Notifications
You must be signed in to change notification settings - Fork 44
Packaging tools jar with the python package #1634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Sayed Bilal Bari <[email protected]>
Signed-off-by: Sayed Bilal Bari <[email protected]>
Signed-off-by: Sayed Bilal Bari <[email protected]>
Signed-off-by: Sayed Bilal Bari <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started to look at the changes.
Isn't supposed to be a change in the README.md
to update the instructions on how to start with the new change?
For example, what is the assumption? If we do a pip install -e .
and run, then the jar will be downloaded from mvn at runtime?
# prepackage_mgr.py file downloads the dependencies for the csp-related resources | ||
# in case of fat mode. | ||
# In case of non-fat mode, it just copies the tools jar into the tools-resources folder | ||
# --fetch_all_csp=True toggles the fat/non-fat mode for the script |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the name fetch_all_csp
is not good in that option. If this is a flag indicating Fat mode
, then why don't we use something more descriptive like fat_mode_enabled
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will update the comments to make this more clear. So previous behavior was -
- in case fat passed, then build_jar, download_all_dependencies, compress and add it to the wheel
- in case no, then just create the whl from the source files
For this ./build.sh fat
vs ./build.sh
triggered either cases
Now to replicate the same, script calling behavior , we still call either ./build.sh fat
and ./build.sh
. The difference now, fat mode packs all the csp related dependencies by compressing it into a tgz file and includes it in the resources.
fetch_all_csp
flag is for the internal prepackage_mgr to include the jar or the jar/csp resources. Hence --fetch_all_csp
as a param.
Let me know if this still seems wrong, will update it in that case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not get the new behavior.
for both build options (fat and non-fat), do the tools-jar go into tools-resources
or it depends on the build mode?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For both build modes, the tools-jar goes into the resources.
In the fat mode is that all csp dependencies are also packed with the jar.
The behavior is not correct after the change. Merging the current behavior will break miserably on all pipelines and sites that relies on running the spark_rapids gave me the following error:
The user-tools should do the following:
The above steps guarantee that the pipelines will keep working until we finish upgrading them. |
Signed-off-by: Sayed Bilal Bari <[email protected]>
Thanks @amahussein for the review. Have pushed changes to allow fallback to maven download in case of building from source. |
Yes, currently only in case of package built using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sayedbilalbari !
Can you merge latest dev branch to update the build-version. This might have impact on testing your branch especially when retrieving the jar from mvn.
# prepackage_mgr.py file downloads the dependencies for the csp-related resources | ||
# in case of fat mode. | ||
# In case of non-fat mode, it just copies the tools jar into the tools-resources folder | ||
# --fetch_all_csp=True toggles the fat/non-fat mode for the script |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not get the new behavior.
for both build options (fat and non-fat), do the tools-jar go into tools-resources
or it depends on the build mode?
@@ -37,6 +38,8 @@ TOOLS_JAR_FILE="" | |||
|
|||
|
|||
# Function to run mvn command to build the tools jar | |||
# This function skips the test cases and builds the jar file and only | |||
# picks the jar file without sources/javadoc/tests.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to pass arguments to the build_jar function.
This probably should come from build.sh
so that we can have different builds for different jobs.
For example, in #1639 I added a profile-id release
. We should be pass that argument down to the mvn command in order to create the specific build we need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @amahussein . I am evaluating the final behavior of this script on the basis of the internal build jobs and post having finalised the design, will update the build.sh file to take appropriate params.
Signed-off-by: Sayed Bilal Bari <[email protected]>
Signed-off-by: Sayed Bilal Bari <[email protected]>
Signed-off-by: Sayed Bilal Bari <[email protected]>
Marking this as draft as per offline discussion that the changes are being evaluated by CI/CD internal team first. |
The changes have been tested internally. Waiting on the final check for it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sayedbilalbari !
Lets take a second look from @parthosa because he is familiar with the build script.
Signed-off-by: Sayed Bilal Bari <[email protected]>
Thanks @parthosa . Have updated with the changes |
Signed-off-by: Sayed Bilal Bari <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sayedbilalbari. LGTME.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sayedbilalbari !
LGTME
The build script has been tested for the fat/non-fat internal CI/CD jobs. Looks good. Merging |
Fixes #1496
This PR enables changes for packaging the tools jar with the python whl file. This avoids downloading the latest jar from mvn each time any cli command is run.
This packages the jar from source and bundles it in the wheel in the folder tools_resources.
This changes the build process by packaging the jar as part of the wheel in all build scenarios. In case of a fat build, it also packages the csp-resources, csp-resources.tgz along with the tools jar.
The user_tools expects the tools jar now as part of the resources. In case not available, it still falls back to downloading it from maven
The changes have been tested locally using the --platform onprem. Since there are no platform specific changes needed, it should be sufficient.
This pull request includes changes to the
user_tools/build.sh
script and related files to add support for a "non-fat" mode, which only downloads necessary resources instead of all dependencies. Additionally, it introduces a new folder structure to separate tools resources from CSP resources.Build script enhancements:
user_tools/build.sh
: Added a new variableTOOLS_RESOURCE_FOLDER
to define the tools resources folder and updated thedownload_web_dependencies
function to handle both fat and non-fat modes. [1] [2]user_tools/build.sh
: Modified thebuild
function to incorporate the new build modes and ensure the appropriate resources are downloaded and organized.Python tool context updates:
user_tools/src/spark_rapids_pytools/rapids/tool_ctxt.py
: Introduced a new class variabletools_resource_path
to manage tools resources and updated methods to handle the new folder structure and non-fat mode. [1] [2] [3] [4] [5] [6]Resource manager script changes:
user_tools/src/spark_rapids_pytools/resources/dev/prepackage_mgr.py
: Added support for thefetch_all_csp
flag to control whether all CSP resources are downloaded or just the tools jar. Updated the script to handle the new tools resources directory. [1] [2] [3] [4] [5] [6]Other minor changes:
user_tools/pyproject.toml
: Added*.jar
to the package data to ensure the tools jar is included in the package.