---
language:
- bn
license: apache-2.0
datasets:
- uonlp/CulturaX
- wikipedia
pipeline_tag: text-generation
---
|
|
|
# TituLM-1B-BN-V1

TituLM-1B-BN-V1 is a language model trained specifically for generating and understanding Bangla text. It uses a decoder-style transformer architecture and was trained on a dataset of 4.51 billion Bangla tokens. This model is part of Hishab's iterative train-and-release series of Bangla LLMs.
|
|
|
## Training

Training was run with MosaicML's [llm-foundry](https://github.com/mosaicml/llm-foundry) framework. titulm-1b-bn-v1 was trained for a total of 59 iterations.

Notable training configs:

- n_heads: 16
- n_layers: 24
- max_sequence_length: 2048
- vocab_size: 72000
- attn_impl: flash
|
|
|
__Training evaluation status__

- Evaluation CrossEntropy Loss

Final loss: 3.11

<img src="https://cdn-uploads.huggingface.co/production/uploads/5f40b34279c1ba4c353d0c7a/Mr0yAg9AfXTm15GATgSTN.png" alt="Evaluation cross-entropy loss plot" width="620" height="620">

- Language Perplexity

Final Perplexity: 22.562

<img src="https://cdn-uploads.huggingface.co/production/uploads/5f40b34279c1ba4c353d0c7a/B-ZC1LfFZdCTO25Twcyth.png" alt="Language perplexity plot" width="620" height="620">
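
The two metrics above are closely related. As a minimal sketch, assuming the perplexity is the exponential of the mean token-level cross-entropy measured in nats (the usual convention, not confirmed for this run), the conversion looks like this. The function names are illustrative only.

```python
import math


def perplexity_from_loss(cross_entropy_nats: float) -> float:
    """Perplexity is exp(mean cross-entropy) when the loss is in nats."""
    return math.exp(cross_entropy_nats)


def loss_from_perplexity(perplexity: float) -> float:
    """Inverse mapping: mean cross-entropy in nats from a perplexity value."""
    return math.log(perplexity)


# Values reported in this card.
print(round(perplexity_from_loss(3.11), 2))
print(round(loss_from_perplexity(22.562), 3))
```

Under that assumption a loss of 3.11 maps to a perplexity of about 22.4, and the reported perplexity of 22.562 maps back to a loss of about 3.116, so the two figures agree up to rounding.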
|
|
|
## Datasets

We collected Bangla text from several sources, including:

- CulturaX
- Books
- Bangla Wikipedia
- Banglapedia
- News articles

After deduplication, the corpus is 58 GB in size and contains 4.51 billion tokens, as tokenized by our SentencePiece model.
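
As a rough sanity check on these figures, the sketch below computes the average number of bytes per token. It assumes that 58 GB means decimal gigabytes and that the size refers to UTF-8 text, neither of which is stated above. The constant and function names are illustrative.

```python
TOTAL_BYTES = 58 * 10**9       # 58 GB, assuming decimal gigabytes of UTF-8 text
TOTAL_TOKENS = 4.51 * 10**9    # token count reported in this card


def bytes_per_token(total_bytes: float, total_tokens: float) -> float:
    """Average number of bytes of raw text covered by one token."""
    return total_bytes / total_tokens


print(round(bytes_per_token(TOTAL_BYTES, TOTAL_TOKENS), 2))
```

This comes to roughly 12.9 bytes per token. Bangla characters take 3 bytes each in UTF-8 while spaces, digits, and ASCII punctuation take 1, so this suggests each token covers several characters on average.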
|
|
|
|
|
## How to Use

Generating text with this model is straightforward.

Install the following libraries before running the code:

- pip install transformers
- pip install einops
- pip install accelerate

```py
# Code will be added soon.
```
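
Until the official snippet is published, here is a minimal sketch of how a checkpoint like this is typically loaded with `transformers`. Several things here are assumptions and not confirmed by this card: the repository id in `MODEL_ID`, the need for `trust_remote_code=True` (common for llm-foundry checkpoints that ship custom modeling code), and the sampling settings, which are illustrative and untuned.

```python
MODEL_ID = "hishab/titulm-1b-bn-v1"  # assumed repo id; replace with the actual Hub path


def generation_kwargs(max_new_tokens: int = 64) -> dict:
    """Sampling settings for illustration only; not tuned for this model."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "top_p": 0.95,
        "temperature": 0.7,
    }


def generate(prompt: str, model_id: str = MODEL_ID) -> str:
    """Load the tokenizer and model, then return one sampled continuation."""
    # Imported lazily so the helper above works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True is an assumption (see the note above the block).
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **generation_kwargs(),
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("আমি বাংলায়")` would download the checkpoint on first use and return the prompt followed by a sampled continuation. Keep the prompt plus `max_new_tokens` within the 2048-token sequence length listed in the training configs.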