What is the problem this feature will solve?
It will enable users to run PySpark applications in the Docker container.
What is the feature you are proposing to solve the problem?
py4j is not in the PYTHONPATH, so one cannot run or submit a Python Spark application in the Docker container. We should add py4j to PYTHONPATH.
What alternatives have you considered?
As a workaround, I am currently setting PYTHONPATH=$(ZIPS=(/opt/bitnami/spark/python/lib/*.zip); IFS=:; echo "${ZIPS[*]}"):$PYTHONPATH so that every zip under python/lib, including py4j, ends up on PYTHONPATH.
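For readers who want the workaround in a form that could sit in a startup script, here is a minimal sketch. It is not the change made in the PR. The `join_zips` helper and the `zip_path` variable are hypothetical names used only for illustration, and the `/opt/bitnami/spark` default is taken from the path quoted above. It avoids bash arrays so it should also run under a POSIX shell.

```shell
#!/bin/sh
# Hypothetical helper: print every .zip in the given directory as one
# colon-separated string. The body runs in a subshell, so its variables
# do not leak into the caller.
join_zips() (
    joined=""
    for f in "$1"/*.zip; do
        # An unmatched glob is left as the literal pattern; skip it.
        [ -e "$f" ] || continue
        joined="${joined:+$joined:}$f"
    done
    printf '%s\n' "$joined"
)

# Assumption: Spark lives at the path quoted in this issue unless SPARK_HOME says otherwise.
SPARK_HOME="${SPARK_HOME:-/opt/bitnami/spark}"
zip_path="$(join_zips "$SPARK_HOME/python/lib")"

# Prepend only if something was found, and avoid a trailing colon
# when PYTHONPATH was previously unset or empty.
if [ -n "$zip_path" ]; then
    export PYTHONPATH="${zip_path}${PYTHONPATH:+:$PYTHONPATH}"
fi
```

Compared with the one-liner, this leaves PYTHONPATH untouched when no zip files are present and does not append an empty entry when PYTHONPATH starts out unset.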
Thank you so much for reporting. Would you like to submit a PR updating this environment variable? It could be done in the Dockerfile or in the entrypoint, depending on the complexity.
@javsalgar Sure, I will create a PR. I think it would be better to add py4j directly in the Dockerfile where PYTHONPATH is updated, as it is a crucial library for enabling Python applications to communicate with Spark.
@javsalgar I have created the PR. Initially, I intended to add the changes to the Dockerfile, but it became complex. Therefore, I moved the changes to entrypoint.sh as you suggested.
Name and Version
docker.io/bitnami/spark:3.5.1-debian-12-r12