Is it possible to make this awesome model with 128K context?

#11
by shoxik - opened

Could you please also provide a 128K-context version of this model? If I may ask, could you try it for the 1.5B model as well? These models are hiding their true potential.

Thank you for the great work!
