Replies: 3 comments
-
I'm trying to replicate this on a small example but I'm failing to reproduce it. Can you share more about the environment you're working in? Python version, all library versions, OS, etc.? Just as a quick side note, you could try:
This is what should happen under the hood, and there is zero issue even if the memory mapping exceeds (by far) the available RAM.
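For illustration, a minimal sketch of a memory-mapped safetensors load (`safe_open` and its arguments are the real safetensors API; the `model.safetensors` path is a hypothetical placeholder):

```python
from safetensors import safe_open

# Sketch: the file is memory-mapped, so the mapping can be far larger than
# physical RAM; pages are only faulted in as individual tensors are accessed.
with safe_open("model.safetensors", framework="pt", device="cpu") as f:
    for name in f.keys():
        tensor = f.get_tensor(name)  # touches only this tensor's pages
```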
-
Wow, I rechecked again because I had 2 reports; it seems it's
This forces torch to actually allocate the entire file.
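For context, a sketch of the general difference between a lazy memory map and an eager read that forces the whole file to be resident in RAM (illustrative only, not necessarily the exact code path referred to above; the file path is hypothetical):

```python
import mmap

# Lazy: map the file into the address space; pages are only faulted in when
# they are actually touched, so the mapping can exceed physical RAM.
with open("model.safetensors", "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    header_len = int.from_bytes(mm[:8], "little")  # only these 8 bytes are read

# Eager: read() pulls every byte into an in-memory buffer, so a file larger
# than system RAM fails (or swaps heavily) before anything reaches the GPU.
with open("model.safetensors", "rb") as f:
    data = f.read()  # forces allocation of the entire file in RAM
```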
-
My VM has 32GB RAM and 2 x Nvidia Tesla V100 32GB.
I'm not able to load a safetensors file larger than my 32GB of system RAM even though I have 64GB of VRAM available. It seems like the memory is first allocated in system RAM and the model is only afterwards moved to my GPUs, which works if I use a smaller model.
Is there a way to allocate the memory directly on the GPUs? I want to load safetensors files >32GB and <64GB.
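Not a confirmed answer from this thread, but one possible sketch: `safe_open` accepts a `device` argument, so tensors can be pulled straight onto each GPU without the whole file having to fit in system RAM at once (the file path and the naive tensor-count split below are assumptions):

```python
from safetensors import safe_open

# Sketch: open the (hypothetical) file once per target GPU and pull each tensor
# directly onto that device, splitting the tensors across the two cards.
state_dict = {}
with safe_open("model.safetensors", framework="pt", device="cuda:0") as f0, \
     safe_open("model.safetensors", framework="pt", device="cuda:1") as f1:
    names = list(f0.keys())
    half = len(names) // 2
    for name in names[:half]:
        state_dict[name] = f0.get_tensor(name)  # lands on cuda:0
    for name in names[half:]:
        state_dict[name] = f1.get_tensor(name)  # lands on cuda:1
```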