Pardon my question – you were maybe one day too early?

#2
by MarcGrumpyOlejak - opened

Hi, I really like the idea of mixing different models and after some short tests with german texts the answers "feel" a bit more natural than pure DiscoLM – but I think you might have been one day too early as jphme from DiscoResearch bugfixed the EOS token (today is 30.01.2024) exactly 7 days ago – https://huggingface.co/DiscoResearch/DiscoLM_German_7b_v1/commit/560f972f9f735fc9289584b3aa8d75d0e539c44e

Your model (and the GGUF-version I tested) is 8 days old – and my tests show, that the original DiscoLM is a bit like it dropped in some LSD on the dancefloor as it won't stop to produce many slightly different but correct answers to one and the same question or just repeating that it is only a model and one should ask a human being for further information (and many other things) – so the EOS-token seems to be the wrong one :D

Owner

I'm late, sorry. Thanks to pointing me on that, but because I didn't used DiscoLM as base model for slerp, or better said the tokenizer from Kunoichi, it is all fine with this and the other merge so far. So if there is an issue with the model it is the merge itself and doesn't came from this EOS mistake from DiscoLM. At least from what I have understood as I asked for that problem myself.

Sign up or log in to comment