---
language: de
license: mit
tags:
- pytorch
- causal-lm
datasets:
- c4
---
# Cedille AI
Cedille is a project to bring large language models to non-English languages.
## de-anna
Anna is a 6B parameter autoregressive language model based on the GPT-J architecture and trained using the [mesh-transformer-jax](https://github.com/kingoflolz/mesh-transformer-jax) codebase.
Anna was trained on German text with a methodology similar to that of [Boris](https://huggingface.co/Cedille/fr-boris), our French model. We started training from GPT-J, which has been trained on [The Pile](https://pile.eleuther.ai/); as a consequence, the model still performs well in English. Anna uses the unmodified GPT-2 tokenizer.
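Since Anna keeps the English-centric GPT-2 BPE vocabulary, German words are typically split into more subword pieces than comparable English text. As a quick sketch, you can inspect the tokenization directly:
```
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Cedille/de-anna")

# The GPT-2 BPE was built on English text, so German words
# often break into several subword tokens.
print(tokenizer.tokenize("Wo hast du unsere Sprache gelernt?"))
```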
# How to run
## Loading the model
### Base (requires 48+ GB of RAM)
```
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Cedille/de-anna")
model = AutoModelForCausalLM.from_pretrained("Cedille/de-anna")
```
### Lower memory usage
Loading a model with the Hugging Face `transformers` library keeps two copies of the weights in memory by default, which means 48+ GB of RAM for [GPT-J models](https://huggingface.co/docs/transformers/v4.15.0/model_doc/gptj) in float32 precision.
The first trick is to pass `low_cpu_mem_usage=True` as below, so that only one copy of the weights is loaded.
```
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("Cedille/de-anna")
model = AutoModelForCausalLM.from_pretrained("Cedille/de-anna", low_cup_mem_usage=True)
```
We are planning to add an fp16 branch soon. Combined with the low-memory loading above, the model could then be loaded in about 12.1 GB of RAM.
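Until that branch lands, the snippet below is only a sketch of what half-precision loading could look like. It assumes the fp16 weights will be published under a `float16` revision, following the convention of the original GPT-J checkpoint; the actual branch name may differ.
```
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Cedille/de-anna")

# revision="float16" is an assumption borrowed from GPT-J's fp16 branch;
# check the model repo for the actual branch name once it is published.
model = AutoModelForCausalLM.from_pretrained(
    "Cedille/de-anna",
    revision="float16",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)
```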
## Generation example
```
model.eval()

input_sentence = "Wo hast du unsere Sprache gelernt?"
input_ids = tokenizer.encode(input_sentence, return_tensors='pt')

# Sample one continuation with top-k / top-p (nucleus) sampling
sample_outputs = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    top_k=50,
    top_p=0.95,
    num_return_sequences=1,
)
print(tokenizer.decode(sample_outputs[0], skip_special_tokens=True))
```
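Generation on CPU will be slow for a 6B-parameter model. If a GPU with enough memory is available (roughly 24 GB for the float32 weights), the same example can run there; a minimal sketch, reusing the model and tokenizer loaded above:
```
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

input_ids = input_ids.to(device)
with torch.no_grad():
    sample_outputs = model.generate(
        input_ids,
        max_length=100,
        do_sample=True,
        top_k=50,
        top_p=0.95,
    )
print(tokenizer.decode(sample_outputs[0], skip_special_tokens=True))
```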
## Contact us
For any custom development please contact us at [email protected].
## Links
* [Official website](https://en.cedille.ai/)
* [Blog](https://en.cedille.ai/blog)
* [GitHub](https://github.com/coteries/cedille-ai)
* [Twitter](https://twitter.com/CedilleAI)