-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Arch] Shrink runtime image size #3051
Conversation
# THIS blows up the image by 13 GB!!! | ||
# Install (or update) the dependencies | ||
# dockerfile_content += ( | ||
# 'RUN cd /opendevin/code && \\\n' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used result from mamba related package?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I actually made a mistake, it's not only 13 GB, but more like 22 GB on top.
The resulting sandbox was ~ 26GB. And most of that happens when mamba is installing and running poetry.
Idk what is happening there, but it ain't good. 😁
Some snippet that may help in dockerfile_content += (
'ENV DEBIAN_FRONTEND=noninteractive '
'POETRY_NO_INTERACTION=1 '
'POETRY_VIRTUALENVS_IN_PROJECT=1 '
'POETRY_VIRTUALENVS_CREATE=1 '
'POETRY_CACHE_DIR=/tmp/poetry_cache\n'
) The above are used in other dockerfiles and they e.g. assure that |
@tobitege I think i finally got it working (at least locally). I did not do anything extra, but just remove that line of |
I'll send Sonnet the bill then lolol |
I've experimented a little bit more and updated my PR's sources. You might take another peek at:
# Check if the base image already has the 'od_runtime' prefix
if repo.startswith('od_runtime'):
return f'{repo}:{tag}' The way the above "nodejs" method is coded is that it doesn't require all the mamba stuff, so commands could just be run via I'll just wanted to share this as "food for thought", even though experimental code. 😀 |
Btw, I think there's one big thing missing in the source copying: the |
@tobitege this itself should be ready for review if it is ok with you! I'm experimenting with a slightly more general/cleaner way of building OpenDevin runtime images (e.g., using Jinja templates), which hopefully can provide a cleaner way to deal with these docker files in Python, and make it easier to extend these dockerfiles to different languages (e.g., nodejs).
The only previous reason for including I guess if we get the Jinja template working nicely and cleanly, we could just start building every image with |
Yep, because |
Hmm... are you sure? 😂 I don't see how the agentskills, ipython calls and browser stuff will work solely outside of the sandbox and only send commands... but it's getting late and I shouldn't get myself confused in the evening. 😁 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is not yet quite ready.
@@ -81,14 +81,15 @@ async def ainit(self, env_vars: dict[str, str] | None = None): | |||
# NOTE: You can need set DEBUG=true to update the source code | |||
# inside the container. This is useful when you want to test/debug the | |||
# latest code in the runtime docker container. | |||
update_source_code=config.debug, | |||
update_source_code=self.sandbox_config.update_source_code, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be helpful! 👍
'/opendevin/miniforge3/bin/mamba run -n base poetry install\n' | ||
# for browser (update if needed) | ||
'RUN apt-get update && cd /opendevin/code && /opendevin/miniforge3/bin/mamba run -n base poetry run playwright install --with-deps chromium\n' | ||
'/opendevin/miniforge3/bin/mamba run -n base poetry install --no-interaction --no-root\n' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should add --only main
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no pyproject.toml file in the sandbox with this version of the method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm... how did those get there, the "tar" doesn't copy them over, does it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the .tar
does copy that over as part of the python source build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And i added this to include the .lock
as well:
https://github.com/OpenDevin/OpenDevin/blob/f3c23e80395bbe95fb8935ff1f8b2b6e563e3a5f/pyproject.toml#L9
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the
.tar
does copy that over as part of the python source build
Ok, that I must have missed. Can you point me to where exactly that happens or do you mean "make build" here? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep - it is here:
https://github.com/OpenDevin/OpenDevin/blob/f3c23e80395bbe95fb8935ff1f8b2b6e563e3a5f/opendevin/runtime/utils/runtime_build.py#L22
We basically create a Python source distribution: https://docs.python.org/3.10/distutils/sourcedist.html
LIBGL_MESA = 'libgl1' | ||
else: | ||
LIBGL_MESA = 'libgl1-mesa-glx' | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would add this at the top:
dockerfile_content = (
'ENV DEBIAN_FRONTEND=noninteractive '
'POETRY_NO_INTERACTION=1 '
'POETRY_VIRTUALENVS_IN_PROJECT=1 '
'POETRY_VIRTUALENVS_CREATE=1 '
'POETRY_CACHE_DIR=/tmp/poetry_cache\n'
)
(and "+=" in the below line then)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any particular reason why we should add this when not adding it still works? I'm trying to see if we can make things as simple as possible..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They're also in the app dockerfile, thought they'd be needed:
https://github.com/OpenDevin/OpenDevin/blob/f3c23e80395bbe95fb8935ff1f8b2b6e563e3a5f/containers/app/Dockerfile#L18-L21
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 https://python-poetry.org/docs/configuration/#virtualenvsin-project
If not set explicitly, poetry by default will create a virtual environment under {cache-dir}/virtualenvs or use the {project-dir}/.venv directory if one already exists.
Now i see why the previous runs fails, because after rm -rf /root/.cache/pypoetry
the virtual env will be removed.
I don't think we need this for the sandbox, because we will rm -rf /opendevin/code
and copy a new version of the codebase there, if we enable POETRY_VIRTUALENVS_IN_PROJECT
, it will delete the virtual env with rm -rf /opendevin/code
, and we will have to re-install everything from scratch whenever we update the source code..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And POETRY_VIRTUALENVS_CREATE
also default to true, so we probably don't need to set it either. https://python-poetry.org/docs/configuration/#virtualenvscreate
Thanks @tobitege ! |
Thanks to you as well, happy to help! 🤗 |
What is the problem that this fixes or functionality that this introduces? Does it fix any open issues?
Previously, the runtime image was too large (~20+G) and caused a lot of headaches for deployment. @tobitege did a great job shrink the size in #3040.
Give a summary of what the PR does, explaining any non-trivial design decisions
This PR moves these sandbox image shrinking commits out of #3040 and tries to fix some existing issues. This will better prepare for an automated CI to build the Runtime image.
Other references