flash_attn package makes it non-portable

#1
by bghira - opened

The flash_attn package only runs on NVIDIA systems, not Apple Silicon or AMD.

Try to avoid hard-coding flash_attention_2, for example:

import torch
from transformers import AutoModelForCausalLM

# EMU_HUB is the model repo id used elsewhere in the model card snippets.
model = AutoModelForCausalLM.from_pretrained(
    EMU_HUB,
    device_map="cuda:0",
    torch_dtype=torch.bfloat16,
    # attn_implementation="flash_attention_2",  # NVIDIA-only; leave disabled for portability
    trust_remote_code=True,
)
