Tidy app entrypoint #7668

Merged 8 commits into main from ryan/tidy-entry on Feb 28, 2025

Conversation

@RyanJDick (Contributor) commented Feb 21, 2025

Summary

Prior to this PR, most of the app setup was done in api_app.py at import time. This PR cleans that up by:

  • Splitting app setup into more modular functions
  • Narrowing the responsibility of api_app.py so that it just initializes the FastAPI app

The main motivation for this change is to make it easier to support an upcoming torch configuration feature that requires more careful ordering of app initialization steps.
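To illustrate the shape of this refactor, here is a minimal sketch (not the actual InvokeAI code; `create_app` and `register_routes` are illustrative names) of app setup split into explicit functions, with the entrypoint module doing nothing but building the app:

```python
from fastapi import FastAPI


def register_routes(app: FastAPI) -> None:
    # Hypothetical stand-in for mounting the real API routers.
    @app.get("/health")
    def health() -> dict[str, str]:
        return {"status": "ok"}


def create_app() -> FastAPI:
    # Each setup step is an explicit call, so initialization order is
    # controlled here rather than by import-time side effects.
    app = FastAPI()
    register_routes(app)
    return app


app = create_app()
```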

Related Issues / Discussions

N/A

QA Instructions

  • Launch the app via invokeai-web.py and smoke test it.
  • Launch the app via the installer and smoke test it.
  • Test that generate_openapi_schema.py produces the same result before and after the change (a comparison sketch follows this list).
  • No regressions in unit tests that directly interact with the app (test_images.py).
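For the schema comparison step, one hedged way to check it: run generate_openapi_schema.py on each branch, dump the output to a file, and compare the parsed JSON. The file names and helper below are hypothetical:

```python
import json


def schemas_match(path_a: str, path_b: str) -> bool:
    # Compare parsed JSON rather than raw text so key ordering and
    # whitespace differences don't cause false mismatches.
    with open(path_a) as fa, open(path_b) as fb:
        return json.load(fa) == json.load(fb)


print(schemas_match("openapi_before.json", "openapi_after.json"))
```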

Merge Plan

  • Check to see if there are any commercial implications to modifying the app entrypoint.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@github-actions github-actions bot added the python PRs that change python files label Feb 21, 2025
@psychedelicious psychedelicious self-requested a review February 27, 2025 01:17
@psychedelicious (Collaborator) left a comment

I've improved custom node loading in #7698, moving that code to a function instead of running it implicitly as Python loads modules.

Maybe it should be re-organised per your other changes.
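As a rough sketch of that pattern (illustrative only, not the code from #7698; the package name and helper are hypothetical), node discovery becomes an explicit call instead of an import side effect:

```python
import importlib
import pkgutil


def load_custom_nodes(package_name: str) -> list[str]:
    # Import each submodule of the given package on demand and report
    # what was loaded, so callers control when (and whether) this runs.
    package = importlib.import_module(package_name)
    loaded = []
    for module_info in pkgutil.iter_modules(package.__path__):
        importlib.import_module(f"{package_name}.{module_info.name}")
        loaded.append(module_info.name)
    return loaded
```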

@RyanJDick RyanJDick merged commit 26730ca into main Feb 28, 2025
15 checks passed
@RyanJDick RyanJDick deleted the ryan/tidy-entry branch February 28, 2025 21:07
RyanJDick added a commit that referenced this pull request Feb 28, 2025
…7673)

## Summary

This PR adds a `pytorch_cuda_alloc_conf` config flag to control the
torch memory allocator behavior.

- `pytorch_cuda_alloc_conf` defaults to `None`, preserving the current
behavior.
- The configuration options are explained here:
https://pytorch.org/docs/stable/notes/cuda.html#optimizing-memory-usage-with-pytorch-cuda-alloc-conf.
Tuning this configuration can reduce peak reserved VRAM and improve
performance.
- Setting `pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"` in
`invokeai.yaml` is expected to work well on many systems. This is a good
first step for those looking to tune this config. (We may make this the
default in the future.)
- The optimal configuration seems to depend on a number of factors such as
device version, VRAM, CUDA kernel version, etc. For now, users will have to
experiment with this config to see if it hurts or helps on their systems;
in most cases, I expect it to help. (A sketch of how the setting takes
effect follows this list.)
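A minimal sketch of how such a flag can take effect, assuming the config value maps onto PyTorch's `PYTORCH_CUDA_ALLOC_CONF` environment variable, which must be set before CUDA is initialized (hence the need for careful ordering of startup steps):

```python
import os

# Must happen before torch initializes CUDA; the value here is the one
# suggested above as a good first step for tuning.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "backend:cudaMallocAsync"

import torch  # imported only after the allocator config is set

if torch.cuda.is_available():
    x = torch.empty(1024, device="cuda")  # forces CUDA initialization
    print(f"reserved: {torch.cuda.memory_reserved()} bytes")
```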

### Memory Tests

```
VAE decode memory usage comparison:

- SDXL, fp16, 1024x1024:
  - `cudaMallocAsync`: allocated=2593 MB, reserved=3200 MB
  - `native`:          allocated=2595 MB, reserved=4418 MB

- SDXL, fp32, 1024x1024:
  - `cudaMallocAsync`: allocated=3982 MB, reserved=5536 MB
  - `native`:          allocated=3982 MB, reserved=7276 MB

- SDXL, fp32, 1536x1536:
  - `cudaMallocAsync`: allocated=8643 MB, reserved=12032 MB
  - `native`:          allocated=8643 MB, reserved=15900 MB
```

## Related Issues / Discussions

N/A

## QA Instructions

- [x] Performance tests with `pytorch_cuda_alloc_conf` unset.
- [x] Performance tests with `pytorch_cuda_alloc_conf: "backend:cudaMallocAsync"`.

## Merge Plan

- [x] Merge #7668 first and change target branch to `main`

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_