---
library_name: transformers
license: gemma
language:
- en
base_model:
- google/gemma-2-9b-it
---

Yippie!! It's my birthday in 2 days! So I'm gonna drop this model that I made at 2 AM.

It's quite a bit different from what I've been doing recently, but it was pretty fun to work on. :D

Used a different merging technique that I quickly designed & crafted, one that focuses on math and logical reasoning.

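The exact merge recipe isn't written up anywhere, so treat the snippet below as a minimal sketch of the general idea only: a plain weighted (linear) merge between the base model and a reasoning-focused checkpoint. The second model ID and the `alpha` weight are made-up placeholders, not the real ingredients.

```python
import torch
from transformers import AutoModelForCausalLM

# Load the instruction-tuned base and a math-focused checkpoint to merge.
# "some-org/math-tuned-gemma-2-9b" is a placeholder, not a real model ID.
base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it", torch_dtype=torch.bfloat16
)
math_model = AutoModelForCausalLM.from_pretrained(
    "some-org/math-tuned-gemma-2-9b", torch_dtype=torch.bfloat16
)

alpha = 0.6  # assumed weight toward the math-focused checkpoint
math_state = math_model.state_dict()

# Linear interpolation of every parameter tensor.
merged = {
    name: (1 - alpha) * param + alpha * math_state[name]
    for name, param in base.state_dict().items()
}

base.load_state_dict(merged)
base.save_pretrained("merged-gemma-2-9b")
```
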
Also fine-tuned this using self-play RL algorithms.

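"Self-play RL" here is most likely something in the SPIN family, where the model plays against its own previous iteration: its generations become the "rejected" side of preference pairs, reference answers become the "chosen" side, and a DPO-style loss pushes the new model away from its old self. The sketch below assumes that reading; it is not this model's actual training code.

```python
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    # Standard DPO objective over summed per-sequence log-probs.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    return -F.logsigmoid(beta * margin).mean()

def make_self_play_pairs(model, tokenizer, prompts, references):
    # Self-play round: the current model's own outputs are "rejected",
    # reference answers are "chosen". Re-run each round with the freshly
    # updated model so it keeps playing against itself.
    pairs = []
    for prompt, reference in zip(prompts, references):
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        out = model.generate(**inputs, max_new_tokens=256, do_sample=True)
        rejected = tokenizer.decode(
            out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
        )
        pairs.append({"prompt": prompt, "chosen": reference, "rejected": rejected})
    return pairs
```
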
Not the best LLM out there, but it was pretty fun making this and coming up with something different.

Borrowed some ideas from my other "distillation" technique, which significantly reduces the number of layers while aiming to retain output quality.

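That layer-reducing trick isn't written up either, so here is only the obvious skeleton: drop a subset of decoder layers, then train the shallower model to match the full one. The every-other-layer choice and the output path are assumptions for illustration.

```python
import torch
from transformers import AutoModelForCausalLM

full = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it", torch_dtype=torch.bfloat16
)

# Keep every other decoder layer (assumed heuristic; the real selection, and
# how Gemma 2's alternating local/global attention layers are handled, may
# well differ).
keep = range(0, full.config.num_hidden_layers, 2)
full.model.layers = torch.nn.ModuleList(full.model.layers[i] for i in keep)
full.config.num_hidden_layers = len(full.model.layers)

# To recover quality, the pruned model would then be distilled against the
# original, e.g. with a KL loss between the two models' logits.
full.save_pretrained("gemma-2-9b-pruned")
```
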
Thanks and have fun!!

Also happy birthday to myself :)

Usage (taken from the official Gemma 2 9B repo):

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="minchyeom/birthday-llm",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device="cuda",  # replace with "mps" to run on a Mac device
)

# Chat-style input: the pipeline applies the model's chat template for you.
messages = [
    {"role": "user", "content": "How many r's are there in the word strawberry?"},
]

outputs = pipe(messages, max_new_tokens=256)
# The last message in the returned conversation is the assistant's reply.
assistant_response = outputs[0]["generated_text"][-1]["content"].strip()
print(assistant_response)
```