shanearora committed
Commit: 16fd943
Parent: 7d31d70

Update README.md

Files changed (1)
  1. README.md (+13, -26)
README.md CHANGED
@@ -9,9 +9,9 @@ language:
 
 <img src="https://allenai.org/olmo/olmo-7b-animation.gif" alt="OLMo Logo" width="800" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
 
- # Model Card for OLMo 1.7-7B-hf
+ # Model Card for OLMo 7B April 2024
 
- OLMo 1.7 7B is the latest version of the original [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) model rocking a 24 point increase in MMLU, among other evaluations improvements, from an improved version of the Dolma dataset and staged training.
+ OLMo 7B April 2024 is an updated version of the original [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) model, with a 24 point increase in MMLU among other evaluation improvements, thanks to an improved version of the Dolma dataset and staged training.
 **This version is for direct use with HuggingFace Transformers** from v4.40 on.
 
 OLMo is a series of **O**pen **L**anguage **Mo**dels designed to enable the science of language models.
@@ -26,27 +26,22 @@ The core models released in this batch are the following:
 | [OLMo 1B](https://huggingface.co/allenai/OLMo-1B) | 3 Trillion |16 | 2048 | 16 | 2048 |
 | [OLMo 7B](https://huggingface.co/allenai/OLMo-7B) | 2.5 Trillion | 32 | 4096 | 32 | 2048 |
 | [OLMo 7B Twin 2T](https://huggingface.co/allenai/OLMo-7B-Twin-2T) | 2 Trillion | 32 | 4096 | 32 | 2048 |
- | [OLMo 1.7-7B](https://huggingface.co/allenai/OLMo-1.7-7B) | 2.05 Trillion | 32 | 4096 | 32 | 4096 |
+ | [OLMo 7B April 2024](https://huggingface.co/allenai/OLMo-7B-0424-hf) | 2.05 Trillion | 32 | 4096 | 32 | 4096 |
 
- *Note: OLMo 1.7-7B also includes QKV clipping.*
-
-
- [Coming soon] We are releasing many checkpoints for these models, for every 1000 training steps.
- The naming convention is `step1000-tokens4B`.
+ *Note: OLMo 7B April 2024 also includes QKV clipping.*
 
 To load a specific model revision with HuggingFace, simply add the argument `revision`:
 ```bash
- olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B-hf", revision="step1000-tokens4B")
+ olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0424-hf", revision="step1000-tokens4B")
 ```
 
 All revisions/branches are listed in the file `revisions.txt`.
 Or, you can access all the revisions for the models via the following code snippet:
 ```python
 from huggingface_hub import list_repo_refs
- out = list_repo_refs("allenai/OLMo-1.7-7B-hf")
+ out = list_repo_refs("allenai/OLMo-7B-0424-hf")
 branches = [b.name for b in out.branches]
 ```
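For reference, the two snippets above can be combined to enumerate the available checkpoint branches and then load one of them. A minimal sketch, assuming Hub access and Transformers v4.40+ (the `step1000-tokens4B` branch name is the example used above):

```python
# Minimal sketch (assumes Hub access and transformers >= 4.40): list the
# checkpoint branches of the repo, then load one of them by name.
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM

out = list_repo_refs("allenai/OLMo-7B-0424-hf")
branches = [b.name for b in out.branches]
print(len(branches), branches[:5])  # e.g. intermediate training checkpoints plus `main`

# Pass a branch name as `revision` to load that intermediate checkpoint.
olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-7B-0424-hf", revision="step1000-tokens4B"
)
```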
- A few revisions were lost due to an error, but the vast majority are present.
 
 ### Model Description
 
@@ -75,13 +70,11 @@ A few revisions were lost due to an error, but the vast majority are present.
 
 ### Inference
 
- Install Transformers [from source](https://huggingface.co/docs/transformers/en/installation#install-from-source), or update to the next version when this [PR](https://github.com/huggingface/transformers/pull/29890) is integrated.
-
- Now, proceed as usual with HuggingFace:
+ Proceed as usual with HuggingFace:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
- olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B-hf")
- tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-1.7-7B-hf")
+ olmo = AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0424-hf")
+ tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B-0424-hf")
 message = ["Language modeling is "]
 inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
 # optional verifying cuda
@@ -94,20 +87,14 @@ print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
 Alternatively, with the pipeline abstraction:
 ```python
 from transformers import pipeline
- olmo_pipe = pipeline("text-generation", model="allenai/OLMo-1.7-7B-hf")
+ olmo_pipe = pipeline("text-generation", model="allenai/OLMo-7B-0424-hf")
 print(olmo_pipe("Language modeling is "))
 >> 'Language modeling is a branch of natural language processing that aims to...'
 ```
 
- Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-1.7-7B-hf", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
+ Or, you can make this slightly faster by quantizing the model, e.g. `AutoModelForCausalLM.from_pretrained("allenai/OLMo-7B-0424-hf", torch_dtype=torch.float16, load_in_8bit=True)` (requires `bitsandbytes`).
 The quantized model is more sensitive to typing / cuda, so it is recommended to pass the inputs as `inputs.input_ids.to('cuda')` to avoid potential issues.
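As a reference for the quantization note above, a minimal sketch of 8-bit loading and of passing only the input ids to the GPU, assuming `bitsandbytes` is installed and a CUDA device is available (generation settings are illustrative):

```python
# Minimal sketch (assumes `bitsandbytes` and a CUDA GPU are available).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

olmo = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-7B-0424-hf", torch_dtype=torch.float16, load_in_8bit=True
)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-7B-0424-hf")

inputs = tokenizer(["Language modeling is "], return_tensors="pt", return_token_type_ids=False)
# As recommended above, pass only the input ids, moved to the GPU.
response = olmo.generate(
    inputs.input_ids.to("cuda"), max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95
)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])
```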
 
- Note, you may see the following error if `ai2-olmo` is not installed correctly, which is caused by internal Python check naming. We'll update the code soon to make this error clearer.
- ```bash
- raise ImportError(
- ImportError: This modeling file requires the following packages that were not found in your environment: hf_olmo. Run `pip install hf_olmo`
- ```
-
 ### Fine-tuning
 Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or many intermediate checkpoints. Two recipes for tuning are available.
 1. Fine-tune with the OLMo repository:
@@ -225,7 +212,7 @@ Optimizer settings comparison with peer models.
 
 
 
- ## Environmental Impact
+ <!-- ## Environmental Impact
 
 OLMo 7B variants were either trained on MI250X GPUs at the LUMI supercomputer, or A100-40GB GPUs provided by MosaicML.
 A summary of the environmental impact. Further details are available in the paper.
@@ -233,7 +220,7 @@ A summary of the environmental impact. Further details are available in the paper.
 | | GPU Type | Power Consumption From GPUs | Carbon Intensity (kg CO₂e/KWh) | Carbon Emissions (tCO₂eq) |
 |-----------|------------|-----------------------------|--------------------------------|---------------------------|
 | OLMo 7B Twin | MI250X ([LUMI supercomputer](https://www.lumi-supercomputer.eu)) | 135 MWh | 0* | 0* |
- | OLMo 7B | A100-40GB ([MosaicML](https://www.mosaicml.com)) | 104 MWh | 0.656 | 75.05 |
+ | OLMo 7B | A100-40GB ([MosaicML](https://www.mosaicml.com)) | 104 MWh | 0.656 | 75.05 | -->
 
 ## Bias, Risks, and Limitations