Update README.md
README.md CHANGED
@@ -1,7 +1,7 @@
 ---
 license: cc-by-nc-sa-4.0
 ---
-# StellarX: A Base Model by Dampish
+# StellarX: A Base Model by Dampish and Arkane
 
 StellarX is a powerful autoregressive language model designed for various natural language processing tasks. It has been trained on a massive dataset of 810 billion tokens from "RedPajama" and is built upon the popular GPT-NeoX architecture. With approximately 4 billion parameters, StellarX offers exceptional performance and versatility.
 
@@ -11,7 +11,7 @@ StellarX is a powerful autoregressive language model designed for various natura
 - **Model Architecture:** StellarX is built upon the GPT-NeoX architecture, which is inspired by GPT-3 and shares similarities with GPT-J-6B. The architecture incorporates key advancements in transformer-based language models, ensuring high-quality predictions and contextual understanding.
 - **Model Size:** StellarX consists of approximately 4 billion parameters, making it a highly capable language model for a wide range of natural language processing tasks.
 - **Carbon-Friendly and Resource-Efficient:** StellarX has been optimized for carbon efficiency and can be comfortably run on local devices. When loaded in 8-bit, the model requires only about 5 GB of storage, making it more accessible and convenient for various applications (see the loading sketch under "How to Use" below).
-- **V0** Meaning what version it is on, currently version 0,
+- **V0:** Indicates the model version, currently version 0. Version 0 has been trained on only 300B of the planned 810B tokens; the next version aims for substantially higher accuracy.
 
 ## How to Use
 
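A minimal loading sketch in Python, assuming the weights are published on the Hugging Face Hub under a repo id like `Dampish/StellarX-4B-V0` (hypothetical) and that `bitsandbytes` and `accelerate` are installed for the 8-bit mode mentioned above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Dampish/StellarX-4B-V0"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load in 8-bit to hit the ~5 GB footprint described in the card;
# device_map="auto" places the weights on GPU if one is available.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_8bit=True,
)

prompt = "The stars are"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Since StellarX is a base model rather than an instruction-tuned one, prompts work best as text to be continued rather than as questions or commands.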