Teja-Gollapudi committed on
Commit 149b168
1 Parent(s): 998adce

Update README.md

Files changed (1)
  1. README.md +56 -1
README.md CHANGED
@@ -6,4 +6,59 @@ language:
  - en
  library_name: transformers
  pipeline_tag: conversational
- ---
+ ---
+
+ # VMware/open-llama-0.3T-7B-open-instruct-v1.1
+
+ Fully Open Source, <b>Commercially viable.</b>
+ The instruction dataset, [VMware/open-instruct-v1.1-oasst-dolly-hhrlhf](https://huggingface.co/datasets/VMware/open-instruct-v1.1-oasst-dolly-hhrlhf), is under the cc-by-sa-3.0 license, and the language model ([openlm-research/open_llama_7b_preview_300bt](https://huggingface.co/openlm-research/open_llama_7b_preview_300bt/tree/main/open_llama_7b_preview_300bt_transformers_weights)) is under the apache-2.0 license.
+
+ ## Usage
+
+ Please load the tokenizer with the `add_bos_token=True` parameter, as the underlying OpenLLaMA model and this model were trained with a BOS token.
+
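+ As a quick check that a BOS token is actually being prepended, the minimal sketch below only asserts that the first token id of an encoded string equals `tokenizer.bos_token_id`; the example string is arbitrary.
+
+ ```
+ from transformers import AutoTokenizer
+
+ # Load the tokenizer with add_bos_token=True and confirm a BOS token is prepended.
+ tokenizer = AutoTokenizer.from_pretrained('VMware/open-llama-0.3T-7B-open-instruct-v1.1', add_bos_token=True)
+ ids = tokenizer('Hello, world!').input_ids
+ assert ids[0] == tokenizer.bos_token_id
+ ```
+
+ The full usage example below loads the model in float16 and generates a response to a formatted instruction:
+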
+ ```
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_name = 'VMware/open-llama-0.3T-7B-open-instruct-v1.1'
+
+ # Load the tokenizer with add_bos_token=True, as the model was trained with a BOS token.
+ tokenizer = AutoTokenizer.from_pretrained(model_name, add_bos_token=True)
+
+ # Load the model in float16 and place it on the available GPU(s).
+ model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16, device_map='sequential')
+
+ prompt_template = "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{instruction}\n\n### Response:"
+
+ prompt = 'Explain in simple terms how the attention mechanism of a transformer model works'
+
+ # Format the instruction into the prompt template and tokenize it.
+ input_text = prompt_template.format(instruction=prompt)
+ input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
+
+ # Generate, then strip the prompt tokens so only the response is decoded.
+ output_ids = model.generate(input_ids, max_length=512)
+ input_length = input_ids.shape[1]
+ output_ids = output_ids[:, input_length:]
+ output = tokenizer.decode(output_ids[0])
+
+ print(output)
+
+ '''
+ The attention mechanism of a transformer model is designed to help the model understand the relationship between different parts of a sentence.
+ The model uses a weighted attention score to determine how much each input token contributes to the output.
+ The attention score is calculated by looking at the similarity between each input token and the output token, and assigning a weight to each input token based on this similarity.
+ This way, the model can better understand the relationship between different parts of a sentence and generate more accurate predictions.
+
+ '''
+ ```
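+
+ If you prefer the higher-level `pipeline` API, the sketch below reuses the model and tokenizer loaded above; the sampling parameters are illustrative assumptions rather than tuned recommendations.
+
+ ```
+ from transformers import pipeline
+
+ # Reuse the model and tokenizer from the example above (tokenizer loaded with add_bos_token=True).
+ generator = pipeline('text-generation', model=model, tokenizer=tokenizer)
+
+ result = generator(
+     prompt_template.format(instruction=prompt),
+     max_new_tokens=256,      # illustrative value
+     do_sample=True,          # illustrative sampling settings
+     temperature=0.7,
+     return_full_text=False,  # return only the generated response, not the prompt
+ )
+ print(result[0]['generated_text'])
+ ```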
+
+ ## Drawbacks
+
+ <ul>
+ <li>The model was trained on a partially trained Open-LLaMA checkpoint (300B tokens).</li>
+ <li>The model is inconsistent in outputting '\n' tokens, as the majority of the dataset is obtained from [mosaicml/dolly_hhrlhf](https://huggingface.co/datasets/mosaicml/dolly_hhrlhf), which removed newline characters from responses.</li>
+ </ul>
+
+ ## Evaluation
+
+ <b>TODO</b>