HachiML committed
Commit 9e18fc7
1 Parent(s): fc8edc9

Update README.md

Files changed (1)
  1. README.md +44 -4
README.md CHANGED
@@ -11,13 +11,53 @@ should probably proofread and complete it, then remove this comment. -->

  # myBit-Llama2-jp-127M-4

- This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
- It achieves the following results on the evaluation set:
  - Loss: 2.9790

  ## Model description

- More information needed

  ## Intended uses & limitations

@@ -25,7 +65,7 @@ More information needed

  ## Training and evaluation data

- More information needed

  ## Training procedure

 

  # myBit-Llama2-jp-127M-4

+ This model has 127M parameters.
+ It is a Bit-Llama2 model pre-trained for only 1 epoch on a Japanese dataset.
+ The dataset used is [range3/wiki40b-ja](https://huggingface.co/datasets/range3/wiki40b-ja).
  - Loss: 2.9790

  ## Model description

+ GitHub: [BitNet-b158](https://github.com/Hajime-Y/BitNet-b158)
+ More information about this model can be found on the following page (in Japanese):
+ [Implementation of BitNet & BitNet b158](https://note.com/hatti8/n/nc6890e79a19a)
+
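
As background, BitNet b1.58 (which the linked article implements) quantizes linear-layer weights to the ternary values {-1, 0, +1} using an absmean scale. The sketch below illustrates that idea in plain PyTorch; it is an assumption-based illustration of the technique, not the actual code used by mybitnet or this repository.

```
import torch

def absmean_ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
    # Illustrative BitNet b1.58-style quantization (not this repo's exact code):
    # scale weights by their mean absolute value, then round and clip to {-1, 0, +1}.
    scale = w.abs().mean().clamp(min=eps)
    w_ternary = (w / scale).round().clamp_(-1, 1)
    return w_ternary, scale

# Example: quantize a random weight matrix and inspect the resulting values.
w = torch.randn(4, 8)
w_q, s = absmean_ternary_quantize(w)
print(w_q.unique(), s)
```
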
+ ## How to use
+
+ 1. Install the libraries:
+ ```
+ !pip install mybitnet
+ !pip install -U accelerate transformers==4.38.2
+ !pip install torch
+ ```
+
+ 2. Load the model:
+ ```
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ model_name = "HachiML/myBit-Llama2-jp-127M-4"
+
+ # trust_remote_code=True is needed to load the model's custom (BitNet) code from the Hub.
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True)
+ print(model)
+ ```
+
+ 3. Run inference:
+ ```
+ # The prompt means "Once upon a time, in a certain place, ..." (a standard Japanese story opening).
+ prompt = "昔々あるところに、"
+ input_ids = tokenizer.encode(
+     prompt,
+     return_tensors="pt"
+ )
+ tokens = model.generate(
+     input_ids.to(device=model.device),
+     max_new_tokens=128,
+ )
+
+ out = tokenizer.decode(tokens[0], skip_special_tokens=True)
+ print(out)
+ ```
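
Optionally, sampling can be used instead of the default greedy decoding for more varied continuations. The snippet below is a sketch using standard `generate` sampling arguments; the specific values are illustrative, not tuned for this model.

```
# Variation on step 3: nucleus sampling instead of greedy decoding.
tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=128,
    do_sample=True,     # sample from the model's distribution
    temperature=0.8,    # illustrative value, not tuned
    top_p=0.9,          # nucleus sampling cutoff
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```
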

  ## Intended uses & limitations


  ## Training and evaluation data

+ - [range3/wiki40b-ja](https://huggingface.co/datasets/range3/wiki40b-ja)
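
To inspect the training corpus, the dataset is hosted on the Hugging Face Hub and should be loadable with the `datasets` library. A minimal sketch, assuming the default configuration and a `train` split:

```
from datasets import load_dataset

# Load the Japanese wiki40b corpus used for pre-training (split name assumed).
wiki40b_ja = load_dataset("range3/wiki40b-ja")
print(wiki40b_ja)
print(wiki40b_ja["train"][0])
```
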

  ## Training procedure