[bnb] Minor modifications #18631
Merged: younesbelkada merged 20 commits into `huggingface:main` from `younesbelkada:bnb-fix-small-details` on Aug 16, 2022.
Commits (20, all by younesbelkada):
- `13c266f` bnb minor modifications
- `ab9f9d8` Apply suggestions from code review
- `fdabb5d` Apply suggestions from code review
- `5994e83` Apply suggestions from code review
- `ea119e3` Apply suggestions from code review
- `0b17a89` put in one block
- `632a5a7` update readme
- `8e70e5f` change text a bit
- `2e8f26b` Apply suggestions from code review
- `c8d6281` apply suggestions
- `8e7eb7e` add link to paper
- `a4d0d1b` Apply suggestions from code review
- `27792f2` Update tests/mixed_int8/README.md
- `0016acf` Apply suggestions from code review
- `65ec377` refactor a bit
- `e950226` add instructions Turing & Amperer
- `82a9c8e` add A6000
- `8e703b0` clarify a bit
- `e16a56a` remove small part
- `2fef415` Update tests/mixed_int8/README.md
Diff for `tests/mixed_int8/README.md`:
# Testing mixed int8 quantization

![HFxbitsandbytes.png](https://camo.githubusercontent.com/HFxbitsandbytes.png)

The following is a recipe for effectively debugging the `bitsandbytes` integration in Hugging Face `transformers`.

## Library requirements

+ `transformers>=4.22.0`
+ `accelerate>=0.12.0`
+ `bitsandbytes>=0.31.5`

## Hardware requirements

The following instructions are tested with 2 NVIDIA-Tesla T4 GPUs. To run `bitsandbytes` successfully, you need a GPU that supports 8-bit tensor cores. Note that Turing, Ampere or newer architectures - e.g. T4, RTX20s, RTX30s, A40-A100, A6000 - should be supported.

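As a quick sanity check, the architecture requirement can be expressed as a compute-capability threshold: 8-bit tensor cores arrived with Turing (compute capability 7.5), while Ampere cards report 8.0/8.6. The helper below is our own sketch, not part of this PR; with `torch` installed, the capability of a device comes from `torch.cuda.get_device_capability`.

```python
# Sketch (assumption: helper name and threshold are ours, not from this PR).
# 8-bit tensor cores are available from the Turing architecture onward,
# i.e. CUDA compute capability 7.5 (T4, RTX 20xx) and above (Ampere: 8.0/8.6).

def supports_int8_tensor_cores(major: int, minor: int) -> bool:
    """Return True if a GPU with this compute capability should work for int8."""
    return (major, minor) >= (7, 5)

# With torch installed, query GPU 0 like:
#   import torch
#   major, minor = torch.cuda.get_device_capability(0)
print(supports_int8_tensor_cores(7, 5))  # T4 (Turing) -> True
print(supports_int8_tensor_cores(7, 0))  # V100 (Volta) -> False
```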
## Virtual envs

```bash
conda create --name int8-testing python==3.8
pip install "bitsandbytes>=0.31.5"
pip install "accelerate>=0.12.0"
pip install "transformers>=4.23.0"
```
If `transformers>=4.23.0` is not released yet, then use:
```bash
pip install git+https://github.com/huggingface/transformers.git
```

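To confirm the installed versions meet the minimums above, a small script like the following can help. This is our own sketch, not part of the PR; the naive version parser ignores pre-release suffixes such as `rc` or `dev`.

```python
from importlib.metadata import PackageNotFoundError, version

def as_tuple(v: str) -> tuple:
    # Naive parse: "0.31.5" -> (0, 31, 5); ignores rc/dev suffixes.
    return tuple(int(part) for part in v.split(".") if part.isdigit())

def meets_minimum(installed: str, minimum: str) -> bool:
    return as_tuple(installed) >= as_tuple(minimum)

# Minimum versions taken from the requirements above.
REQUIREMENTS = {"transformers": "4.22.0", "accelerate": "0.12.0", "bitsandbytes": "0.31.5"}

for package, minimum in REQUIREMENTS.items():
    try:
        ok = meets_minimum(version(package), minimum)
    except PackageNotFoundError:
        ok = False
    print(f"{package}: {'OK' if ok else 'missing or too old'}")
```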
## Troubleshooting

A list of common errors:

### Torch does not correctly do the operations on GPU

First check that:

```py
import torch

vec = torch.randn(1, 2, 3).to(0)
```

Works without any error. If not, install torch using `conda` like:

```bash
conda create --name int8-testing python==3.8
conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge
pip install "bitsandbytes>=0.31.5"
pip install "accelerate>=0.12.0"
pip install "transformers>=4.23.0"
```
For the latest PyTorch instructions please see [this page](https://pytorch.org/get-started/locally/).

and the snippet above should work.

### `bitsandbytes operations are not supported under CPU!`

This happens when some `Linear` weights are set to the CPU when using `accelerate`. Please check `model.hf_device_map` carefully and make sure that no `Linear` module is assigned to the CPU. It is fine to have the last module (usually the `lm_head`) set on the CPU.

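The `hf_device_map` inspection can be sketched as a plain dictionary scan. The helper name and the example map below are ours, for illustration only; `accelerate` produces a module-name-to-device mapping in insertion order, and only the last entry is allowed on CPU:

```python
def find_bad_cpu_modules(device_map: dict) -> list:
    """Return module names assigned to CPU, excluding the last module
    (the last module, usually the lm_head, is allowed on CPU)."""
    names = list(device_map)
    return [name for name in names[:-1] if device_map[name] == "cpu"]

# Hypothetical device map for illustration:
device_map = {
    "transformer.wte": 0,
    "transformer.h.0": 0,
    "transformer.h.1": "cpu",   # problem: a middle block on CPU
    "lm_head": "cpu",           # fine: the last module may be on CPU
}
print(find_bad_cpu_modules(device_map))  # -> ['transformer.h.1']
```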
### `To use the type as a Parameter, please correct the detach() semantics defined by __torch_dispatch__() implementation.`

Use the latest version of `accelerate` with a command such as `pip install -U accelerate` and the problem should be solved.

### `Parameter has no attribue .CB`

Same solution as above.

### `RuntimeError: CUDA error: an illegal memory access was encountered ... consider passing CUDA_LAUNCH_BLOCKING=1`

Run your script prepending `CUDA_LAUNCH_BLOCKING=1` and you should observe an error as described in the next section.

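The variable can also be set from inside the script, provided this happens before the first CUDA call (a minimal sketch of ours):

```python
import os

# Must be set before torch initializes CUDA, so that kernel launches become
# synchronous and the failing line is reported accurately.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # import torch / run the model only after setting the variable
print(os.environ["CUDA_LAUNCH_BLOCKING"])  # -> 1
```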
### `CUDA illegal memory error: an illegal memory access at line...`

Check the CUDA versions with:
```bash
nvcc --version
```
and confirm it is the same version as the one detected by `bitsandbytes`. If not, run:
```bash
ls -l $CONDA_PREFIX/lib/libcudart.so
```
or
```bash
ls -l $LD_LIBRARY_PATH
```
Check if `libcudart.so` has a correct symlink set. Sometimes `nvcc` detects the correct CUDA version but `bitsandbytes` doesn't. You have to make sure that the symlink set for the file `libcudart.so` points to the correct CUDA library.

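This check can be scripted: resolve the symlink with `os.path.realpath` and read the CUDA version out of the target's file name. The helpers and the regex below are our sketch, assuming the usual naming scheme such as `libcudart.so.11.0`:

```python
import os
import re

def resolve_cudart(path: str) -> str:
    """Follow the chain of symlinks to the real libcudart file."""
    return os.path.realpath(path)

def cuda_version_from_filename(real_path: str):
    """Extract e.g. '11.3.109' from '.../libcudart.so.11.3.109'; None if absent."""
    match = re.search(r"libcudart\.so\.(\d+(?:\.\d+)*)", real_path)
    return match.group(1) if match else None

print(cuda_version_from_filename("/usr/local/cuda-11.3/lib64/libcudart.so.11.3.109"))  # -> 11.3.109
print(cuda_version_from_filename("/opt/cuda/libcudart.so.10.2"))  # -> 10.2
```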
Here is an example of a badly configured CUDA installation:

`nvcc --version` gives:

![Screenshot 2022-08-15 at 15.12.23.png](https://camo.githubusercontent.com/Screenshot-2022-08-15-at-15.12.23.png)

which means that the detected CUDA version is 11.3, but `bitsandbytes` outputs:

![image.png](https://camo.githubusercontent.com/image.png)

First check:

```bash
echo $LD_LIBRARY_PATH
```

If this contains multiple paths separated by `:`, then you have to make sure that the correct CUDA version is set, by running:

```bash
ls -l $path/libcudart.so
```

on each path (`$path`) in the list. If `LD_LIBRARY_PATH` contains a single path, simply run:

```bash
ls -l $LD_LIBRARY_PATH/libcudart.so
```

and you will see something like:

 | ||
|
||
Sometimes you have to run a "dummy" inference pass when dealing with a multi-GPU setup. Checkout the ```test_multi_gpu_loading``` and the ```test_pipeline``` functions. | ||
If you see that the file is linked to the wrong CUDA version (here 10.2), find the correct location for `libcudart.so` (`find / -name libcudart.so`) and replace the environment variable `LD_LIBRARY_PATH` with the one containing the correct `libcudart.so` file.
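The manual inspection above can be automated. This is our own sketch (the helper name is hypothetical): split `LD_LIBRARY_PATH` on `:`, look for `libcudart.so` in each directory, and report where each symlink really points:

```python
import os

def libcudart_candidates(ld_library_path: str) -> list:
    """Return (candidate, resolved target) for every libcudart.so found
    in the ':'-separated directory list."""
    results = []
    for directory in filter(None, ld_library_path.split(":")):
        candidate = os.path.join(directory, "libcudart.so")
        if os.path.exists(candidate):
            results.append((candidate, os.path.realpath(candidate)))
    return results

# Scan the current environment; empty result if no CUDA runtime is on the path.
print(libcudart_candidates(os.environ.get("LD_LIBRARY_PATH", "")))
```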