This repository contains a quantized T5-large model. The quantized model is about 5x smaller than the original T5-large, and its inference time on CPU is about 3x lower, while still giving good results.
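The repository does not say which quantization method was used, so the following is only a minimal sketch of the general idea, assuming symmetric per-tensor int8 quantization of float32 weights. The helper names `quantize_int8` and `dequantize_int8` and the random weights are hypothetical and not part of this model. It shows why storing weights as 8-bit integers shrinks them and how small the rounding error stays.

```python
import random
from array import array


def quantize_int8(weights):
    """Map float weights to int8 codes plus a single float scale (hypothetical helper)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round each weight to the nearest multiple of `scale`, clamped to [-127, 127].
    codes = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float32 weights from the int8 codes."""
    return array("f", (c * scale for c in codes))


# Stand-in for one weight tensor; real model weights would be loaded from a checkpoint.
random.seed(0)
weights = array("f", (random.gauss(0.0, 0.02) for _ in range(10_000)))

codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# float32 uses 4 bytes per value, int8 uses 1 byte per value.
size_ratio = (weights.itemsize * len(weights)) / (codes.itemsize * len(codes))
# Rounding to the nearest code bounds the error by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"size ratio: {size_ratio:.1f}x, max abs error: {max_error:.6f}, scale: {scale:.6f}")
```

Going from float32 to int8 alone gives a 4x reduction in weight storage, so this sketch does not by itself account for the 5x figure quoted above; the actual export pipeline used for this model may differ.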
