osanseviero committed on
Commit
800d65a
1 Parent(s): 6d01879

Update README.md

Files changed (1)
  1. README.md +8 -6
README.md CHANGED
@@ -18,12 +18,14 @@ Code Llama is a collection of pretrained and fine-tuned generative text models r
 
  ## Model Use
 
- To use this model, please make sure to install transformers from `main` until the next version is released:
+ Install `transformers`
 
  ```bash
- pip install git+https://github.com/huggingface/transformers.git@main accelerate
+ pip install transformers accelerate
  ```
 
+ **Warning:** The 70B Instruct model has a different prompt template than the smaller versions. We'll update this repo soon.
+
  Model capabilities:
 
  - [x] Code completion.
@@ -36,7 +38,7 @@ Model capabilities:
 
  **Model Developers** Meta
 
- **Variations** Code Llama comes in three model sizes, and three variants:
+ **Variations** Code Llama comes in four model sizes, and three variants:
 
  * Code Llama: base models designed for general code synthesis and understanding
  * Code Llama - Python: designed specifically for Python
@@ -50,9 +52,9 @@ All variants are available in sizes of 7B, 13B, 34B, and 70B parameters.
 
  **Output** Models generate text only.
 
- **Model Architecture** Code Llama is an auto-regressive language model that uses an optimized transformer architecture.
+ **Model Architecture** Code Llama is an auto-regressive language model that uses an optimized transformer architecture. It was fine-tuned with up to 16k tokens. This variant **does not** support long context of up to 100k tokens.
 
- **Model Dates** Code Llama and its variants have been trained between January 2023 and July 2023.
+ **Model Dates** Code Llama and its variants have been trained between January 2023 and January 2024.
 
  **Status** This is a static model trained on an offline dataset. Future versions of Code Llama - Instruct will be released as we improve model safety with community feedback.
 
@@ -66,7 +68,7 @@ All variants are available in sizes of 7B, 13B, 34B, and 70B parameters.
  **Out-of-Scope Uses** Use in any manner that violates applicable laws or regulations (including trade compliance laws). Use in languages other than English. Use in any other way that is prohibited by the Acceptable Use Policy and Licensing Agreement for Code Llama and its variants.
 
  ## Hardware and Software
- **Training Factors** We used custom training libraries. The training and fine-tuning of the released models have been performed Meta’s Research Super Cluster.
+ **Training Factors** We used custom training libraries. The training and fine-tuning of the released models have been performed Meta’s Research Super Cluster.\\**Carbon Footprint** In aggregate, training all 12 Code Llama models required 1400K GPU hours of computation on hardware of type A100-80GB (TDP of 350-400W). Estimated total emissions were 228.55 tCO2eq, 100% of which were offset by Meta’s sustainability program.
 
  ## Evaluation Results
 
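
Note on the figures this commit adds: the model count and the Carbon Footprint numbers can be cross-checked with simple arithmetic. The Python sketch below is illustrative only. The function names are hypothetical, the third variant name is taken from the **Status** line, and the energy estimate assumes every GPU drew its full TDP for every reported hour, which the README does not claim.

```python
# Figures quoted from the README text added in this commit.
GPU_HOURS = 1_400_000        # "1400K GPU hours"
TDP_WATTS = (350, 400)       # "TDP of 350-400W"
EMISSIONS_TCO2EQ = 228.55    # "228.55 tCO2eq"

MODEL_SIZES = ("7B", "13B", "34B", "70B")
# Third variant name is an assumption based on the Status line of the README.
VARIANTS = ("Code Llama", "Code Llama - Python", "Code Llama - Instruct")


def model_count(sizes=MODEL_SIZES, variants=VARIANTS):
    """Number of released models if every variant ships in every size."""
    return len(sizes) * len(variants)


def energy_mwh(gpu_hours, watts):
    """Energy in MWh, assuming each GPU draws `watts` for all `gpu_hours`."""
    return gpu_hours * watts / 1_000_000


def kg_per_gpu_hour(emissions_tco2eq, gpu_hours):
    """Reported emissions spread evenly over the reported GPU hours."""
    return emissions_tco2eq * 1000 / gpu_hours


if __name__ == "__main__":
    print("models:", model_count())
    low, high = (energy_mwh(GPU_HOURS, w) for w in TDP_WATTS)
    print(f"energy bound: {low:.0f} to {high:.0f} MWh")
    print(f"emissions: {kg_per_gpu_hour(EMISSIONS_TCO2EQ, GPU_HOURS):.3f} kg CO2eq per GPU hour")
```

Under those assumptions, four sizes times three variants gives the 12 models mentioned in the new text, the energy bound is 490 to 560 MWh, and the reported emissions come to roughly 0.163 kg CO2eq per GPU hour.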