Cannot use HF transformers for inference?
#11 by haili-tian - opened
In your model card, the recommended inference engines are vLLM and mistral-inference; HF transformers is not included.
Does this mean that HF transformers cannot be used to run inference with this model, or at least cannot fully support its features?
E.g., no implementation of interleaved sliding-window attention can be found in the latest transformers release.