Quantization to INT4 for training on a Colab A100 GPU with 40 GB of VRAM.
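A minimal sketch of the 4-bit loading step, using the Hugging Face `transformers` + `bitsandbytes` integration. The model name (`bert-base-uncased`) and the number of labels are placeholders, since the notes do not name the model or the task:

```python
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig

# 4-bit NF4 quantization config; keeps the base weights small enough
# to train comfortably within 40 GB of A100 VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

# "bert-base-uncased" and num_labels=2 are assumptions for illustration.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    quantization_config=bnb_config,
    num_labels=2,
)
```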
LoRA for parameter-efficient fine-tuning, which allowed attaching an adapter customized for a specific task.
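A sketch of attaching a LoRA adapter with the `peft` library. The base model name, rank, and target modules are assumptions, not values taken from the notes:

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

# Placeholder base model; the notes do not specify which one was used.
base_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                 # adapter rank (assumed)
    lora_alpha=16,       # scaling factor
    lora_dropout=0.05,
    target_modules=["query", "value"],  # assumed attention projections
)

# Only the adapter weights are trainable; the base model stays frozen.
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()
```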
Observations:
The initial model does not have enough predictive power to distinguish the entries passed during inference.
Adapters indeed adapt the model to a specific task: this was evident when the model shifted its predictions toward the majority class instead of predicting at random during inference.
The requirement is simple: adapt the model to the passed data so that it gains some predictive power.