vladfaust/Phi-3-mini-4k-instruct-Q4_K_M-GGUF · Model only produces nonsense

Hi, I tried this model with llama-server:

llama-server --hf-repo vladfaust/Phi-3-mini-4k-instruct-Q4_K_M-GGUF --hf-file phi-3-mini-4k-instruct-q4_k_m.gguf -c 2048

Using the llama.cpp web UI, this is what I got:

user: Hi, who are you?

assistant: annteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannteannte

(It kept going until I hit Stop.)

I'm using this model as a replacement for the original one from Microsoft, which, since a recent patch to GGUF, fails to load with:

llama_model_load: error loading model: error loading model hyperparameters: key not found in model: phi3.attention.sliding_window

Do you know what could be wrong?