---
license: apache-2.0
language:
  - en
  - es
  - fr
tags:
  - merge
---

palmer-003 logo

Creative writing has never been so accessible; palmer goes beyond what was previously thought possible for small language models. This model is a "MErging of Experts" (MEoE) that uses the internal model palmer-003-2401 as its base, biased toward assistant behavior via the DPO technique and without any prompt format. As a result of these efforts, palmer is better than most 1B language models on most benchmarks, despite sometimes being 40% smaller than its counterparts.

| Model     | MMLU   | ARC-C  | OBQA   | HellaSwag | PIQA   | Winogrande | Average | Params |
|-----------|--------|--------|--------|-----------|--------|------------|---------|--------|
| tinyllama | 0.2577 | 0.3029 | 0.3600 | 0.5935    | 0.7329 | 0.5959     | 0.4738  | 1.1B   |
| zyte      | 0.2397 | 0.3353 | 0.3700 | 0.6086    | 0.7541 | 0.5998     | 0.4845  | 1.1B   |
| palmer    | 0.2523 | 0.3439 | 0.3740 | 0.6208    | 0.7524 | 0.6590     | 0.5004  | 1.1B   |
| qwen      | 0.4536 | 0.3490 | 0.3320 | 0.5876    | 0.7307 | 0.5896     | 0.5070  | 1.8B   |

Given its compactness, this work constitutes an advancement towards small language models (SLMs), easily empowering edge devices such as mobile phones, Raspberry Pis, and automated software/robots. Additionally, palmer-003 follows the same philosophy as palmer-002.5: becoming a more powerful model with more data instead of less.

```
prompt: Never give up.
output: Keep pushing forward. Remember, you are not alone.
You have the strength and the support of your family and friends.
As you continue on your journey, remember to take care of yourself.
Eat well, exercise regularly, and get enough sleep.
Stay positive and focused on your goals.
Remember, you are capable of achieving anything you set your mind to.
```
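
To reproduce an exchange like the one above, a minimal sketch with Hugging Face transformers might look like this (the repo id `appvoid/palmer-003` and the sampling settings are illustrative assumptions, not the exact settings used to produce the sample):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id; adjust to wherever the weights are hosted.
repo_id = "appvoid/palmer-003"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# palmer was tuned without a prompt format, so plain text works as input.
inputs = tokenizer("Never give up.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```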

You can support me through Ko-fi.

#### Important

Keep in mind that quantizing the fp16 GGUF model down to, say, q8 or q6 causes an extremely large performance drop. So, if you want the most out of this model, use it as it is, in fp16.
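
For instance, with llama-cpp-python (one common way to run GGUF files; the file name below is a placeholder for your local fp16 file), that means loading the fp16 GGUF directly:

```python
from llama_cpp import Llama

# Placeholder path: point this at your local fp16 GGUF file.
llm = Llama(model_path="palmer-003-fp16.gguf")

result = llm("Never give up.", max_tokens=128)
print(result["choices"][0]["text"])
```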