Added a note to the README: LlamaTokenizerFast is not included in this build, so the Inference API will not work. To use this model at this time, please load the LlamaTokenizerFast from Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct.
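A minimal loading sketch, assuming the Hugging Face transformers library; the model identifier below is a placeholder for this repo's id or a local checkout, which the note does not name:

```python
from transformers import AutoModelForCausalLM, LlamaTokenizerFast

# The tokenizer is not bundled with this build, so load LlamaTokenizerFast
# from Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct instead.
tokenizer = LlamaTokenizerFast.from_pretrained("Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct")

# "path/to/this-model" is a placeholder for this repo's model id or a local copy.
model = AutoModelForCausalLM.from_pretrained("path/to/this-model")

inputs = tokenizer("Hello, world!", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```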
Initial release of the merged TinyLlama model with 32k context, combining: TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T + Doctor-Shotgun/TinyLlama-1.1B-32k-Instruct + Doctor-Shotgun/TinyLlama-1.1B-32k + Tensoic/TinyLlama-1.1B-3T-openhermes + Josephgflowers/TinyLlama-3T-Cinder-v1.3