microsoft/Phi-3-small-8k-instruct
Why is inference so slow compared with Qwen at the same 7B parameter count?
#26, opened by lucasjin on Jul 4
In my experience it is about 30% slower when chatting on the same A100 GPU.
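To turn that impression into a number, a rough sketch like the one below could be used. It is only an illustration: `tokens_per_second` and `relative_slowdown` are hypothetical helper names, and `generate` is assumed to be any zero-argument callable that wraps a model's generation call on a fixed prompt and returns how many new tokens it produced.

```python
import time
from typing import Callable


def tokens_per_second(generate: Callable[[], int], warmup: int = 1, runs: int = 3) -> float:
    """Average throughput of `generate`, which must return the number of new tokens."""
    for _ in range(warmup):
        generate()  # untimed warm-up calls (first-call setup would skew the result)
    total_tokens = 0
    start = time.perf_counter()
    for _ in range(runs):
        total_tokens += generate()
    elapsed = time.perf_counter() - start
    return total_tokens / elapsed


def relative_slowdown(baseline_tps: float, candidate_tps: float) -> float:
    """Fraction by which the candidate is slower than the baseline (0.3 means 30% slower)."""
    return 1.0 - candidate_tps / baseline_tps
```

Running both models with the same prompt, the same maximum number of new tokens, and the same dtype on the same GPU, then passing the two throughputs to `relative_slowdown`, would show whether the gap is really around 30%.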