File size: 11,539 Bytes
94c1ea9
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
---
widget:
- text: Once upon a time,
  example_title: English
- text: भारत की राजधानी
  example_title: Hindi
- text: ভারত বৈচিত্র্যের দিকে যাচ্ছিল
  example_title: Bangla
- text: ભારત વિવિધતા તરફ જઈ રહ્યું હતું
  example_title: Gujrati
pipeline_tag: text2text-generation
inference:
  parameters:
    max_new_tokens: 200
license: apache-2.0
datasets:
- soketlabs/bhasha-wiki
- soketlabs/bhasha-wiki-indic
- cerebras/SlimPajama-627B
- ai4bharat/sangraha
language:
- hi
- bn
- gu
- en
tags:
- indic
---
# Pragna-1b

<!-- Provide a quick summary of what the model is/does. -->

![pragna-1b on huggingface](pragna_hf.png)


## Architecture Overview

Pragna-1B is a decoder-only transformer model inspired by TinyLlama, featuring the following specifications:

- Layers: 22
- Attention Heads: 32
- Context Length: 2048
- Hidden Dimension: 2048
- Expansion Dimension: 5632
- Vocabulary Size: 69632

This model incorporates Rotary Positional Encoding to infuse positional information into the embeddings, utilising a base of 10,000. It employs RSNorm with an epsilon value of 1e-5 and the Sigmoid Activation Unit (SiLU) as the activation function. Additionally, Pragna-1B adopts Grouped Query Attention, an alternative to Multi-Head Attention, which enhances training and inference speed while reducing memory bandwidth. This also supports the use of lower-compute devices for inference tasks.

Pragna-1B is trained on our proprietary platform, GenAI Studio, a modular AI Developer Platform designed to support any GenAI model architecture. It is capable of scaling across thousands of GPUs or accelerators and is built to be fault-tolerant. The development of this model leveraged Triton, an open-source language from OpenAI, for crafting high-performance custom fused CUDA Kernels for various operations. Furthermore, the model uses Fully Sharded Data Parallel (FSDP) for distributed and parallel training and incorporates the state-of-the-art FlashAttention2 to accelerate training and inference.

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** [Soket AI Labs](http://soket.ai)
- **Language(s) (NLP):** Hindi, Bangla, Gujarati and English
- **License:** Apache 2.0

## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

[More Information Needed]

## How to Get Started with the Model

Use the code below to get started with the model.
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("soketlabs/pragna-1b")
model = AutoModelForCausalLM.from_pretrained(
    "soketlabs/pragna-1b", torch_dtype=torch.bfloat16
)
```

## Training Details

### Training Data

1. [Bhasha-wiki](https://soket.ai/blogs/bhasha_wiki_dataset)
2. [SlimPajama](https://huggingface.co/datasets/cerebras/SlimPajama-627B)
3. [Sangraha-Verified](https://huggingface.co/datasets/ai4bharat/sangraha)

### Training Procedure

[To be added]


#### Training Hyperparameters

- **Precision:** BFloat16
- **Batch Size:** 2k - 2.5k
- **Context Length:** 2,048
- **Learning Rate:** 3e-5
- **Optimizer:** AdamW
- **LR Scheduler:** Cosine
- **Mixed Precision Training**

## Evaluation

### Hindi
|  | Arc-Easy | Arc-Challenge | Hellaswag | Average |
|--------------|----------|---------------|-----------|---------|
| pragna-1b | 0.33 | 0.22 | 0.35 | 0.30 |
| sarvamai/OpenHathi-7B-Hi-v0.1-Base | 0.3582 | 0.2645 | 0.4315 | 0.35 |
| meta-llama/Llama-2-7b-hf | 0.295 | 0.2406 | 0.3789 | 0.30 |
| google/gemma-7b | <b>0.5926</b> | <b>0.4258</b> | <b>0.6341</b> | <b>0.55</b> |
| meta-llama/Meta-Llama-3-8B | 0.5354 | 0.3541 | 0.6072 | 0.50 |

### Gujarati
|  | Arc-Easy | Arc-Challenge | Hellaswag | Average |
|--------------|----------|---------------|-----------|---------|
| pragna-1b | 0.32 | 0.22 | 0.37 | 0.30 |
| google/gemma-7b | <b>0.4954</b> | <b>0.3208</b> | <b>0.5673</b> | <b>0.46</b> |


### English

|  | Arc | Hellaswag | MMLU | TruthfulQA | Winogrande | GSM8K | Average |
|----------------|-----|-----------|------|------------|------------|-------|---------|
| pragna-1b | 0.3 | 0.51 | 0.27 | 0.38 | 0.56 | 0 | 0.34 |
| TinyLlama/TinyLlama-1.1B-Chat-v1.0 | 0.36 | 0.61 | 0.25 | 0.37 | 0.61 | 0.02 | 0.37 |
| meta-llama/Meta-Llama-3-8B-Instruct | <b>0.6</b> | <b>0.82</b> | <b>0.67</b> | <b>0.51</b> | <b>0.77</b> | <b>0.68</b> | <b>0.66</b> |
| meta-llama/Llama-2-7b-hf | 0.53 | 0.78 | 0.46 | 0.39 | 0.74 | 0.14 | 0.51 |


### Results

| Context | Tokens Predicted |
|---------|------------------|
| India is one of the most | India is one of the most exciting places to travel to in Asia. The country hosts a wide variety of cultures and offers a plethora of attractions across the country. There's a lot you can do here and the tourists are always on the lookout for the new things to do and see. If you're looking to travel to India for your next adventure, you'll be pleased to know that you can find a ton of things to do in the country. From the temples to the beaches, there's a lot you can see and do here. One of the best things for visitors to do in India is to visit Jaipur. The capital city of Rajputana, Jaipur is a beautiful city that's known as the "City of Lives". |
| नरेंद्र मोदी का जन्म | नरेंद्र मोदी का जन्मदिन आज मनाया जा रहा है। इस मौके पर प्रधानमंत्री नरेंद्र मोदी ने ट्वीट कर लिखा कि 'आज देश को एक नया संसद भवन मिला है, जिसे आप लोगों ने बधाई दी है। भारत और भारतवासियों को यह संसद बहुत ही महत्त्वपूर्ण है और इसका उपयोग देश के विकास में बहुत अधिक योगदान देगा। आज हमारे देश में एक संसद बनी हुई है जो हमारे देश की लोकतांत्रिक प्रक्रियाओं को मजबूत बनाती है। |
| Prime minister Modi | Prime minister Modi's visit to China: PM Modi is scheduled to arrive in Beijing on Thursday. PM Narendra Modi will be the first Indian prime minister to visit China since former President Mao Zedong, who visited India in 1959 and was assassinated by Chinese soldiers a year later. The two leaders are expected to discuss regional security issues such as terrorism, trade and energy cooperation during their talks. |
| महात्मा गांधी | महात्मा गांधी एक स्वतंत्र व्यक्ति, एक आत्मनिर्भर और एक अत्यंत लोकप्रिय नेता थे। उन्होंने अपने जीवनकाल में देश को एक नई दिशा देने के लिए बहुत कुछ किया। समाज के विभिन्न क्षेत्रों में उनके योगदान के लिए उन्हें सम्मानित किया गया था। |
| ब्रिटिश राजशाही | ब्रिटिश राजशाही के विघटन के बाद, इंग्लैंड और वेल्स में ब्रिटेन की संसद का गठन हुआ। संसद ने 1807 में अपने पहले अधिनियम, "द क्राउन की विधेयक" (एक्ट ऑफ द डेमोक्रैटिक) को मंजूरी दी, जिसमें प्रत्येक राज्य से एक निर्वाचित सदस्य चुना गया। |
| মহাত্মা গান্ধী | মহাত্মা গান্ধী ছিলেন আন্তর্জাতিক সমাজতান্ত্রিক আন্দোলনের অন্যতম প্রধান নেতা। তিনি ভারতের স্বাধীনতা সংগ্রামে তাঁর ভূমিকা অপরিসীম। ১৯৪৭ সালের ১৫ আগস্ট ভারতে বিপ্লব ঘটে। ১৯৪৮ সালে মহাত্মা গান্ধীর মৃত্যু হয়। এরপর থেকে মহাত্মা আর দেখা যায় না। |
| নরেন্দ্র মোদীর জন্ম | নরেন্দ্র মোদীর জন্মদিনের শুভেচ্ছা জানিয়েছেন অমিতাভ বচ্চন। প্রসঙ্গত, গোটা দেশ জুড়ে চলছে নববর্ষ উৎসব। আর এই উৎসবে নিজেকে সামিল করতে চান না কেউই। এমনটাই জানালেন বলিউড অভিনেত্রী সারা আলি খান। |
| એક વખત | ક વખતે, તેમણે તેમના મિત્ર અને સહાયક વિલિયમ હેનરી મેકડોનાલ્ડ સાથે મળીને "ધ ઓરિજિનલ મેન ઓફ ધ ડેડ" નામના પુસ્તકનું નિર્માણ કર્યું હતું; તે 1961માં પ્રકાશિત થયું હતું. 1958માં, તેમણે ધ ન્યૂ યોર્કર માટે લેખક અને સંપાદક પિતા તરીકે કામ કર્યું હતું. |
| એક સમયે એક રાજા જંગલમાં રહેતો હતો. તેમણે | એક સમયે એક રાજા જંગલમાં રહેતો હતો. તેમણે, એક દિવસ રાજાને કહ્યું, "હું તમને મારી જાતને એક સ્ત્રી તરીકે આપું છું." રાજા ખુશ થયો. તેમણે તેને તેના ઘરમાં લઈ ગયા. તે દિવસે રાણી પોતાના પતિ સાથે ઘરે આવી અને કહ્યું કે તે તેની પાસે છે. રાજાએ તેને પૂછ્યું કે શું તેણે તેની સાથે લગ્ન કર્યા છે ? રાજા અને રાણીએ પૂછ્યું - હા. ત્યારે તેમણે કહ્યું કે, "તમે મને તમારી સાથે લગ્ન કરવાનું વચન આપ્યું છે. હવે મારે તમારા પર વિશ્વાસ રાખવો જોઈએ નહીં". |


## Citation

<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->

```
@ONLINE{bhasha-wiki,
    author = "Soket Labs Technology and Research Private Limited",
    title  = "pragna-1b",
    url    = "https://soket.ai"
}
```

## Model Card Contact

[email protected]