sagorsarker committed on
Commit
44d2a6f
1 Parent(s): e83a1c6

Update README.md

Files changed (1)
  1. README.md +3 -0
README.md CHANGED
@@ -25,6 +25,9 @@ Notable training configs:
  ## Datasets
+ Datasets comprise Bangla, English, and code data. We mixed the Bangla data with English RedPajama data (C4, GitHub, StackExchange, Book, ArXiv, Wikipedia).
+
+ A token-wise distribution will be added below soon.