Model details #3
opened by deathcrush
Hi @philschmid,
I experimented with the https://huggingface.co/spaces/osanseviero/i-like-flan/tree/main demo and found that I could use FLAN-T5-XXL for some very interesting research. I'd like to understand a bit more about which model I was actually running and how it relates to the Google checkpoint on the Hub. I can see you used Artifact-AI/flan-t5-xxl-sharded-fp16. What exactly is this model? Did the authors just cast the model weights to fp16 and then save the model to reduce inference latency and cost?
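For context, my guess is that the conversion was something like the sketch below, i.e. loading the original checkpoint in half precision and re-saving it in smaller shards. The base model id, output path, and shard size here are just my assumptions, not taken from the actual repo:

```python
# Hypothetical sketch of the conversion I'm imagining (not the authors' actual script):
# load the original Google checkpoint in fp16 and save it back in sharded form.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_id = "google/flan-t5-xxl"  # assumed base checkpoint

# Load the weights directly in half precision
model = T5ForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.float16)
tokenizer = T5Tokenizer.from_pretrained(model_id)

# Save with smaller shards so each weight file stays manageable to download/load
model.save_pretrained("flan-t5-xxl-sharded-fp16", max_shard_size="2GB")
tokenizer.save_pretrained("flan-t5-xxl-sharded-fp16")
```

Is that roughly what was done here, or is there more to it than that?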