How to use transformer
#100 opened 10 months ago
by
sethdwumah
SFT is so BAD
#99 opened 10 months ago
by
GokhanAI
8bit quantization error
1
#98 opened 10 months ago
by
lovelyfrog
KeyError: Mixtral
8
#96 opened 10 months ago
by
jdjayakaran
Train the Model on Confluence
1
#95 opened 10 months ago
by
icemaro
Run Mistral model on Remote server
6
#94 opened 10 months ago
by
icemaro
Cuda Error
1
#93 opened 10 months ago
by
HuggySSO
Not supported with TGI
3
#92 opened 10 months ago
by
abhishek3jangid
DeepSpeed hangs or OOMs when loading Mixtral-8x7B
1
#91 opened 10 months ago
by
guowl
Add MOE (mixture of experts) tag
#90 opened 10 months ago
by
davanstrien
Update README.md
#89 opened 10 months ago
by
schuyler12
Failure in loading the model on AWS
8
#88 opened 10 months ago
by
bweinstein123
Hardware Requirements
6
#86 opened 10 months ago
by
ShivanshMathur007
Response content was truncated
19
#84 opened 10 months ago
by
ludomare
Best parameter setting for Mixtral model on the text-generation task
#83 opened 10 months ago
by
kmukeshreddy
Any hints on prompt to reduce / stop hallucinations
1
#82 opened 10 months ago
by
dnovak232
Still the best Mixtral based instruct model. We should change that
#81 opened 10 months ago
by
rombodawg
Could not convert to integer: 3221225477 error
#80 opened 10 months ago
by
KharabinDev42
Serving the model as API on vLLM and 2 x A6000
2
#78 opened 10 months ago
by
dnovak232
How much memory do I need for this model (on Windows)?
3
#77 opened 10 months ago
by
roboboot
Inconsistent prompt format. Which is correct: the model card or the tokenizer_config.json?
6
#75 opened 10 months ago
by
lemonflourorange
Cannot run SFT full fine-tuning
9
#74 opened 10 months ago
by
hegang126
[Chinese Version] Mixtral-8x7B model | Chinese Mixtral-8x7B model
#73 opened 11 months ago
by
wangrongsheng
Update the deprecated Flash Attention call parameter in from_pretrained() method
#72 opened 11 months ago
by
DeathReaper0965
Can't load the model
2
#71 opened 11 months ago
by
JayZhang1
What is the best way to run inference with LoRA in the PEFT approach?
8
#70 opened 11 months ago
by
Pradeep1995
How to use system prompt?
1
#69 opened 11 months ago
by
mznw
Is there any simple way to solve the problem of redundant output?
3
#68 opened 11 months ago
by
jjplane
What is the correct way to store the adapters after PEFT fine-tuning?
4
#67 opened 11 months ago
by
Pradeep1995
Failed to import transformers.models.mixtral.modeling_mixtral because of the following error (look up to see its traceback): libcudart.so.12: cannot open shared object file: No such file or directory
1
#66 opened 11 months ago
by
MukeshSharma
Model not loading, even with 4-bit quantization
1
#65 opened 11 months ago
by
soumodeep-semut
Did Mixtral start from Mistral or from scratch?
1
#64 opened 11 months ago
by
DaehanKim
How many GPUs do we need to run this out of box?
3
#63 opened 11 months ago
by
kz919
Can this model choose experts for every token, or does it just choose two experts per input?
#62 opened 11 months ago
by
PandaMaster
AutoTokenizer.from_pretrained shows OSError
1
#61 opened 11 months ago
by
sean29
Are the .safetensors files necessary for continued SFT training?
#60 opened 11 months ago
by
hegang126
Incomplete Answers
7
#59 opened 11 months ago
by
samparksoftwares
How can we enable continuous learning with the LLM?
#58 opened 11 months ago
by
Tapendra
Inference generation extremely slow
6
#57 opened 11 months ago
by
aledane
Optimizing Mixtral-8x7B-Instruct-v0.1 for Hugging Face Chat
1
#54 opened 11 months ago
by
Husain
SageMaker Deployment Error
11
#53 opened 11 months ago
by
seabasshn
Killed while loading checkpoint shards
1
#52 opened 11 months ago
by
asmatveev
Playground?
1
#51 opened 11 months ago
by
pbourmeau
vectorstore
3
#50 opened 11 months ago
by
philgrey
Enable inference API
2
#49 opened 11 months ago
by
mrfakename
How to use consolidated.xx.pt?
1
#47 opened 11 months ago
by
Wan62
Model not loading and not printing any error message
2
#45 opened 11 months ago
by
robotrage
Open weights?
2
#43 opened 11 months ago
by
alanchan808
Prompt Template for RAG
1
#42 opened 11 months ago
by
mox
There is no sliding_window in params.json
1
#41 opened 11 months ago
by
Moses25