---
library_name: transformers
tags:
- hqq
---

*There is currently an issue with the **model generating random reserved special tokens (like "<|reserved_special_token_49|>") at the end** of its output. Please decode with `skip_special_tokens=True`. We will update this card once we have found the reason for this behaviour. If you find a solution, please let us know!*

# Llama 3 DiscoLM German 8b v0.1 Experimental

<p align="center"><img src="disco_llama.webp" width="400"></p>

# Introduction

**Llama 3 DiscoLM German 8b v0.1 Experimental** is an experimental Llama 3 based version of [DiscoLM German](https://huggingface.co/DiscoResearch/DiscoLM_German_7b_v1).

This is an experimental release and is not intended for production use. The model is still in development and will be updated with new features and improvements in the future.

An online demo is available [here](https://364b61f772fa7baacb.gradio.live/) (we may take it offline for updates).

# Prompt Format

DiscoLM German uses ChatML as its prompt format, which enables OpenAI endpoint compatibility and is supported by most inference libraries and frontends.

System prompts allow steerability and interesting new ways to interact with an LLM, guiding the rules, roles, and stylistic choices of the model.

```
<|im_start|>system
Du bist ein hilfreicher Assistent.<|im_end|>
<|im_start|>user
Wer bist du?<|im_end|>
<|im_start|>assistant
Ich bin ein Sprachmodell namens DiscoLM German und ich wurde von DiscoResearch trainiert.<|im_end|>
```

This prompt is available as a [chat template](https://huggingface.co/docs/transformers/main/chat_templating), which means you can format messages using the
`tokenizer.apply_chat_template()` method:

```python
# Assumes `tokenizer` and `model` are already loaded.
messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Wer bist du?"}
]
# return_dict=True yields input_ids and attention_mask, which generate() accepts as keyword arguments.
gen_input = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
)
model.generate(**gen_input)
```

When tokenizing messages for generation, set `add_generation_prompt=True` when calling `apply_chat_template()`. This appends `<|im_start|>assistant\n` to your prompt to ensure
that the model continues with an assistant response.

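
To make the ChatML layout and the effect of `add_generation_prompt` concrete, the following minimal sketch assembles the prompt string by hand. The helper name `to_chatml` is hypothetical and not part of any library. The tokenizer's built-in chat template remains the authoritative formatter and may differ in details such as a leading BOS token.

```python
def to_chatml(messages, add_generation_prompt=False):
    """Render a list of {"role", "content"} dicts as a ChatML string.

    Hypothetical helper for illustration only; it mirrors the layout shown
    in this section and does not add any BOS token.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped as <|im_start|>{role}\n{content}<|im_end|>.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its reply.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Wer bist du?"},
]

without_prompt = to_chatml(messages)
with_prompt = to_chatml(messages, add_generation_prompt=True)
print(with_prompt)
```
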
# Example Code for Inference

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "Du bist ein hilfreicher Assistent."},
    {"role": "user", "content": "Wer bist du?"},
]

# Build the ChatML prompt and open the assistant turn.
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Stop at the tokenizer's EOS token or at Llama 3's <|eot_id|> token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
# Keep only the newly generated tokens and drop special tokens when decoding.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
```

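
Because of the known issue noted at the top of this card, decoding with `skip_special_tokens=True` is the suggested workaround. If you only have text that was already decoded without that flag, a regex-based cleanup along these lines may help. This is a minimal sketch: the helper name `strip_reserved_tokens` is hypothetical, and the pattern assumes the stray tokens always follow the `<|reserved_special_token_N|>` form quoted in the notice.

```python
import re

# Matches tokens such as <|reserved_special_token_49|> (form assumed from the notice).
RESERVED_TOKEN_PATTERN = re.compile(r"<\|reserved_special_token_\d+\|>")


def strip_reserved_tokens(text):
    """Remove stray reserved special tokens, then trim trailing whitespace."""
    return RESERVED_TOKEN_PATTERN.sub("", text).rstrip()


raw = "Ich bin DiscoLM German.<|reserved_special_token_49|><|reserved_special_token_7|>"
cleaned = strip_reserved_tokens(raw)
print(cleaned)
```
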
# Limitations & Biases

This model can produce factually incorrect and offensive output and should not be relied on for factually accurate information.
This model was trained on various public datasets. Although great effort went into cleaning the pretraining data, the model may still generate biased or otherwise offensive output. It is the user's responsibility to implement a safety/moderation layer. Please use with caution.

# License

This model is distributed under the META LLAMA 3 COMMUNITY LICENSE, see [LICENSE](LICENSE) for more information.

# Acknowledgements

Built with Meta Llama 3.

DiscoLM German is a [DiscoResearch](https://huggingface.co/DiscoResearch) project, a collective effort by [JP Harries](https://huggingface.co/jphme), [Björn Plüster](https://huggingface.co/bjoernp) and [Daniel Auras](https://huggingface.co/rasdani).

Development of Llama 3 DiscoLM German 8b was sponsored by [ellamind](https://ellamind.com).
Compute was sponsored generously by [sysGen GmbH](https://www.sysgen.de/).

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

# About DiscoResearch

DiscoResearch is an aspiring open research community for AI enthusiasts and LLM hackers. Come join our [Discord](https://discord.gg/ttNdas89f3), share your opinions and ideas, and advance open LLM research with us!

# Disclaimer

The license on this model does not constitute legal advice. We are not responsible for the actions of third parties who use this model. This model should only be deployed with additional safety measures in place.