Description
Hello,
I would appreciate any advice/tips on this.
A common use case these days is to take speech/vision embeddings from model A and combine them with another model B. When the models and datasets are large, pre-computing the embeddings and writing them to disk is very slow. Currently I do this in shards: I pre-compute the embeddings offline, save them in a dataset, and later use them during model training.
An alternative is a streaming approach: generate the embeddings during preprocessing or in the data collator. However, I keep getting an error that the input_ids and the weights are on cpu and cuda:0, and when I move the inputs to the device I get the following error instead:
Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
So how can one do GPU computation during preprocessing? I need the embeddings from both model A and model B to prepare my final input_embeds batch.
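For what it's worth, the error message itself points at one workaround: use the "spawn" start method so that worker processes start with a fresh interpreter instead of inheriting the parent's CUDA-initialized state via fork. A minimal stdlib sketch of that idea (the doubling worker is a stand-in for the real embedding computation):

```python
import multiprocessing as mp

def worker(x):
    # In the real pipeline this is where the CUDA model would run.
    # Under "spawn" the child process gets a fresh interpreter, so CUDA
    # can be initialized safely there (unlike under "fork").
    return x * 2

if __name__ == "__main__":
    # "spawn" start method: children do not inherit the parent's
    # (possibly CUDA-initialized) state, which avoids the
    # "Cannot re-initialize CUDA in forked subprocess" error.
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(worker, [1, 2, 3]))  # [2, 4, 6]
```

With a PyTorch DataLoader, the equivalent is passing `multiprocessing_context="spawn"` to the DataLoader (or calling `torch.multiprocessing.set_start_method("spawn")` early in the program). Note that spawn has a higher per-worker startup cost, and each worker holding its own copy of the embedding models on the GPU can be expensive memory-wise.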
The only other approach I can see is to do this inside the model code itself, but I would appreciate any other tips/advice.
Thank you!!