CUDA out of memory

#19
by karambos - opened

CUDA out of memory. Tried to allocate 12.78 GiB (GPU 0; 15.73 GiB total capacity; 11.21 GiB already allocated; 2.47 GiB free; 12.19 GiB reserved in total by PyTorch)

I have a cluster of 4 GPUs with 16 GB each.
GPU memory usage after loading the model (MiB used / total):
0: 9246/16300
1: 9246/16300
2: 9246/16300
3: 8038/16300

import torch
from transformers import AutoProcessor, AutoModelForVision2Seq

processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b", cache_dir=".../.cache/huggingface/hub")
model = AutoModelForVision2Seq.from_pretrained("HuggingFaceM4/idefics2-8b", cache_dir=".../.cache/huggingface/hub", device_map="auto", torch_dtype=torch.float16)
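For reference, the sharding chosen by accelerate can be inspected directly (the hf_device_map attribute is set whenever device_map="auto" is used):

# Sketch: print which GPU each module was placed on
print(model.hf_device_map)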

I changed this code so the weights are distributed across the GPUs, but when I run:

generated_ids = model.generate(**inputs, max_new_tokens=60)

GPU memory usage while running this line (MiB used / total):
0: 12700/16300
1: 9246/16300
2: 9246/16300
3: 8038/16300

I am getting an error here: CUDA out of memory. Tried to allocate 12.78 GiB (GPU 0; 15.73 GiB total capacity; 11.21 GiB already allocated; 2.47 GiB free; 12.19 GiB reserved in total by PyTorch)
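One possibly relevant knob here is the max_memory argument of from_pretrained, which caps how much of the model accelerate places on each GPU, so GPU 0 keeps headroom for the activations created during generate. A minimal sketch; the per-GPU limits below are illustrative assumptions, not tuned values:

import torch
from transformers import AutoModelForVision2Seq

# Sketch: leave headroom on GPU 0 for generation-time activations.
# The per-device limits are illustrative, not tuned values.
model = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceM4/idefics2-8b",
    cache_dir=".../.cache/huggingface/hub",
    device_map="auto",
    torch_dtype=torch.float16,
    max_memory={0: "8GiB", 1: "14GiB", 2: "14GiB", 3: "14GiB"},
)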

Thank you:)

I tried AutoProcessor.from_pretrained with do_image_splitting=False, but I am still getting the same error.

karambos changed discussion status to closed
karambos changed discussion status to open

I am trying to implement a chat model, but I am getting errors at this line:

inputs = {k: v.to("cuda") for k, v in inputs.items()}

How can I distribute the data across the different GPUs, or run the model in a lower-bit precision? The inputs are built with:

inputs = processor(text=prompt, images=[image1, image2], return_tensors="pt")
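To make the lower-bit part of the question concrete, here is a minimal sketch of loading the model 4-bit quantized with bitsandbytes (this assumes bitsandbytes is installed; the settings are the standard BitsAndBytesConfig options, not values verified on this setup), and of sending the inputs to the device of the first shard instead of a blanket .to("cuda"):

import torch
from transformers import AutoModelForVision2Seq, BitsAndBytesConfig

# Sketch: 4-bit NF4 quantization to shrink the per-GPU weight footprint.
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceM4/idefics2-8b",
    cache_dir=".../.cache/huggingface/hub",
    device_map="auto",
    quantization_config=quantization_config,
)

# With device_map="auto", accelerate's hooks move activations between GPUs,
# so the inputs only need to be on the device of the first shard.
inputs = {k: v.to(model.device) for k, v in inputs.items()}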

Hey @VictorSanh, any suggestions please?

Error solved!
I had forgotten the do_image_splitting=False parameter; after adding it, everything works well.
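For anyone hitting the same issue, the fix amounts to loading the processor with that flag (sketch, path elided as above):

processor = AutoProcessor.from_pretrained(
    "HuggingFaceM4/idefics2-8b",
    cache_dir=".../.cache/huggingface/hub",
    do_image_splitting=False,  # skip splitting each image into sub-images, which inflates the image token count
)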

karambos changed discussion status to closed
