Does this model support the massive 32k context length (a bit rusty on my 2^k for k > 13) that the Qwen2 GitHub page advertises? The config file of the non-GGUF version says a 4096-token sliding window.
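For what it's worth, 32k = 2^15 = 32768, and in Qwen2-style configs the sliding window is a separate setting from the full context length: `max_position_embeddings` holds the context length, while `sliding_window` only applies when `use_sliding_window` is enabled. A minimal sketch of checking this from a `config.json` (the JSON below is an illustrative excerpt, not the actual checkpoint's config):

```python
import json

# Illustrative excerpt of a Qwen2-style config.json (example values,
# not copied from this checkpoint).
config_text = """
{
  "max_position_embeddings": 32768,
  "sliding_window": 4096,
  "use_sliding_window": false
}
"""

cfg = json.loads(config_text)

# Full context length: 32768 tokens = 2**15 = "32k".
print("context length:", cfg["max_position_embeddings"])

# The 4096 here is only the attention window size, and it is
# ignored entirely when use_sliding_window is false.
print("sliding window:", cfg["sliding_window"],
      "(active)" if cfg["use_sliding_window"] else "(inactive)")
```

So a 4096 `sliding_window` entry in the config does not by itself cap the usable context at 4096.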