sagorsarker committed • Commit 60c73ee • Parent(s): 0c58c9f • Update README.md

README.md (updated section):
TituLM-1B-BN-V1 is a large language model trained specifically for generating and understanding Bangla text. Built on a decoder-style transformer architecture, it has been trained on a dataset comprising 4.51 billion Bangla tokens. The model is part of Hishab's iterative effort to train and release Bangla LLMs.

## Training

The training process was managed using MosaicML's [llm-foundry](https://github.com/mosaicml/llm-foundry) framework. Throughout the training phase, titulm-1b-bn-v1 underwent a total of 59 iterations, allowing for iterative refinement and optimization.

Notable training configurations:

- n_heads: 16
- n_layers: 24
- max_sequence_length: 2048
- vocab_size: 72000
- attn_impl: flash
## Datasets
We added Bangla text datasets from several sources, including:

- CulturaX
- Books
- Bangla Wikipedia
- Banglapedia
- News articles

Our total data size is 58 GB of deduplicated text, which amounts to 4.51 billion tokens when tokenized with our SentencePiece model.
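As a rough illustration of tokenization, the sketch below loads the tokenizer through 🤗 Transformers and counts tokens for a Bangla sentence. The hub repository id `hishab/titulm-1b-bn-v1` is an assumption and may differ from the actual path.

```py
from transformers import AutoTokenizer

# Assumed hub repository id; replace with the actual model path if it differs.
tokenizer = AutoTokenizer.from_pretrained("hishab/titulm-1b-bn-v1")

# Example Bangla sentence: "Bangla is an Indo-Aryan language of South Asia."
text = "বাংলা ভাষা দক্ষিণ এশিয়ার একটি ইন্দো-আর্য ভাষা।"
token_ids = tokenizer.encode(text)
print(f"{len(token_ids)} tokens")
```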
## How to Use
Generating text with this model is simple. Follow the code below to get started.

Install the following libraries before running the code:

- pip install transformers
- pip install einops
- pip install accelerate
```py
# Code will be added soon.
```
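Until the official snippet is added above, here is a minimal sketch of what loading and generation could look like with 🤗 Transformers. The repository id `hishab/titulm-1b-bn-v1` and the use of `trust_remote_code=True` (commonly required for llm-foundry/MPT-style checkpoints) are assumptions, not confirmed details from this card.

```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repository id; adjust if the actual path differs.
model_id = "hishab/titulm-1b-bn-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,   # llm-foundry (MPT-style) models usually ship custom modeling code
    torch_dtype=torch.bfloat16,
    device_map="auto",        # uses accelerate to place weights automatically
)

prompt = "বাংলাদেশের রাজধানী"  # Bangla prompt: "The capital of Bangladesh"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,
        top_k=50,
        top_p=0.95,
        temperature=0.8,
    )

print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

The sampling parameters (top_k, top_p, temperature) are illustrative defaults; set do_sample=False for deterministic greedy decoding.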