pszemraj committed on
Commit
7ae61f1
1 Parent(s): 7bb2054

Create README.md

Files changed (1)
  1. README.md +63 -0
README.md ADDED
@@ -0,0 +1,63 @@
+ ---
+ license: apache-2.0
+ language:
+ - en
+ inference:
+   parameters:
+     max_new_tokens: 64
+     do_sample: true
+     repetition_penalty: 1.2
+     no_repeat_ngram_size: 5
+     eta_cutoff: 0.0006
+     renormalize_logits: true
+ widget:
+ - text: My name is El Microondas the Wise and
+   example_title: El Microondas
+ - text: Kennesaw State University is a public
+   example_title: Kennesaw State University
+ - text: >-
+     Bungie Studios is an American video game developer. They are most famous for
+     developing the award winning Halo series of video games. They also made
+     Destiny. The studio was founded
+   example_title: Bungie
+ - text: The Mona Lisa is a world-renowned painting created by
+   example_title: Mona Lisa
+ - text: >-
+     The Harry Potter series, written by J.K. Rowling, begins with the book
+     titled
+   example_title: Harry Potter Series
+ - text: >-
+     Question: I have cities, but no houses. I have mountains, but no trees. I
+     have water, but no fish. What am I?
+
+     Answer:
+   example_title: Riddle
+ - text: The process of photosynthesis involves the conversion of
+   example_title: Photosynthesis
+ - text: >-
+     Jane went to the store to buy some groceries. She picked up apples, oranges,
+     and a loaf of bread. When she got home, she realized she forgot
+   example_title: Story Continuation
+ - text: >-
+     Problem 2: If a train leaves Station A at 9:00 AM and travels at 60 mph, and
+     another train leaves Station B at 10:00 AM and travels at 80 mph, when will
+     they meet if the distance between the stations is 300 miles?
+
+     To determine
+   example_title: Math Problem
+ - text: In the context of computer programming, an algorithm is
+   example_title: Algorithm Definition
+ pipeline_tag: text-generation
+ tags:
+ - smol_llama
+ - llama2
+ ---
+
+
+ # smol_llama-101M-GQA
+
+ A small decoder model with 101M parameters in total. This is the first version of the model.
+
+ - hidden size 768, 6 layers
+ - GQA (24 attention heads, 8 key-value heads), context length 1024
+ - trained from scratch
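
The architecture bullets in the added README can be turned into a little arithmetic to see what grouped-query attention (GQA) buys. The sketch below uses only the figures stated in the README (hidden size 768, 6 layers, 24 heads, 8 key-value heads, context length 1024). It assumes the hidden size is split evenly across attention heads, as in the usual Llama layout, and that the comparison baseline is plain multi-head attention with 24 key-value heads; neither assumption is stated in the commit.

```python
# Figures taken from the README bullets.
HIDDEN_SIZE = 768
NUM_LAYERS = 6
NUM_ATTENTION_HEADS = 24
NUM_KV_HEADS = 8
CONTEXT_LENGTH = 1024

# Assumption: hidden size is divided evenly across the attention heads.
head_dim = HIDDEN_SIZE // NUM_ATTENTION_HEADS

# With GQA, several query heads share one key-value head.
queries_per_kv_head = NUM_ATTENTION_HEADS // NUM_KV_HEADS


def kv_cache_elements(num_kv_heads: int) -> int:
    """Count cached key and value elements for one full-length sequence."""
    # Factor 2 covers keys and values; one cache per layer.
    return 2 * NUM_LAYERS * num_kv_heads * head_dim * CONTEXT_LENGTH


gqa_cache = kv_cache_elements(NUM_KV_HEADS)
# Hypothetical baseline: every query head has its own key-value head.
mha_cache = kv_cache_elements(NUM_ATTENTION_HEADS)

print(f"head_dim={head_dim}, query heads per KV head={queries_per_kv_head}")
print(f"KV cache elements: GQA={gqa_cache:,} vs MHA baseline={mha_cache:,}")
```

Under these assumptions each key-value head serves three query heads, so the key-value cache for a full 1024-token sequence is one third the size of the multi-head baseline.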
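
The `inference.parameters` block in the front matter configures the hosted widget, and the same settings can be reused for local generation. The following is a minimal sketch, assuming the `transformers` library is installed in a version recent enough to support `eta_cutoff`. The function name `generate` and the `model_id` argument are illustrative; the commit does not state the repository id, so it is left as a parameter rather than guessed.

```python
# Sampling settings copied from the `inference.parameters` block of the front matter.
GENERATION_KWARGS = {
    "max_new_tokens": 64,
    "do_sample": True,
    "repetition_penalty": 1.2,
    "no_repeat_ngram_size": 5,
    "eta_cutoff": 0.0006,
    "renormalize_logits": True,
}


def generate(prompt: str, model_id: str) -> str:
    """Continue `prompt` with the given checkpoint.

    `model_id` is a placeholder supplied by the caller; the diff does not
    name the repository that hosts the model.
    """
    # Imported lazily so the settings above can be inspected without transformers.
    from transformers import pipeline

    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, **GENERATION_KWARGS)
    # The text-generation pipeline returns a list of dicts, one per sequence.
    return outputs[0]["generated_text"]
```

A call would pass one of the widget prompts, for example `generate("The Mona Lisa is a world-renowned painting created by", model_id=...)`, with the repository id filled in by the reader. Because `do_sample` is enabled, the output differs from run to run.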