---
license: mit
language:
- en
base_model:
- meta-llama/Llama-3.2-1B
---

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

Fine-tune of the Llama 3.2 1B model that incorporates flash normalization (https://arxiv.org/abs/2407.09577).

- **Developed by:** OpenMachine Labs
- **License:** MIT
- **Fine-tuned from model:** Meta Llama 3.2 1B
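
For readers unfamiliar with flash normalization, the following is a minimal sketch of the weight-folding identity that, as we understand the linked paper, the technique relies on: the RMSNorm scale weights can be merged into the following linear layer ahead of time, and the division by the RMS can be applied after the matrix product. The function names, toy dimensions, and values are illustrative assumptions and are not taken from this model's conversion code.

```python
import math

def rms(x, eps=1e-6):
    """Root mean square of a vector, with a small epsilon for stability."""
    return math.sqrt(sum(v * v for v in x) / len(x) + eps)

def matvec(x, W):
    """Row vector x (length n) times matrix W (n rows, m columns)."""
    n_out = len(W[0])
    return [sum(x[i] * W[i][j] for i in range(len(x))) for j in range(n_out)]

def rmsnorm_then_linear(x, g, W):
    """Baseline: normalize x, scale by the norm weights g, then apply W."""
    r = rms(x)
    normed = [xi / r * gi for xi, gi in zip(x, g)]
    return matvec(normed, W)

def fold_norm_weights(g, W):
    """One-time preprocessing: scale row i of W by g[i], i.e. diag(g) @ W."""
    return [[gi * w for w in row] for gi, row in zip(g, W)]

def folded_linear_then_scale(x, W_folded):
    """Folded variant: apply the merged weights first, divide by the RMS afterwards."""
    r = rms(x)
    return [y / r for y in matvec(x, W_folded)]

# Toy inputs (hypothetical values, chosen only for illustration).
x = [0.5, -1.0, 2.0]
g = [1.5, 0.5, 2.0]
W = [[0.1, 0.2], [0.3, -0.4], [-0.5, 0.6]]

baseline = rmsnorm_then_linear(x, g, W)
folded = folded_linear_then_scale(x, fold_norm_weights(g, W))
max_diff = max(abs(a - b) for a, b in zip(baseline, folded))
print(max_diff)  # differs from zero only by floating-point rounding
```

Both paths compute the same output because the per-channel scale and the scalar RMS factor commute with the matrix product. In the folded form, the norm weights no longer need to be stored or applied at inference time.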

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository (base model):** https://github.com/meta-llama/llama-models/tree/main/models/llama3_2
- **Blog post (base model):** https://ai.meta.com/blog/llama-3-2-connect-2024-vision-edge-mobile-devices/

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

## How to Get Started with the Model

Use the code below to get started with the model.
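
A minimal sketch using the Hugging Face `transformers` library, assuming the checkpoint loads with the standard `AutoTokenizer` and `AutoModelForCausalLM` classes (this card does not confirm that). The helper name `generate_text` is hypothetical, and the repository id is not stated in this card, so it is left as a parameter to fill in.

```python
def generate_text(model_id, prompt, max_new_tokens=32):
    """Load a causal LM from the Hugging Face Hub and generate a continuation.

    model_id is the Hub repository id of this model; it is not given in this
    card, so the caller must supply it. Requires `transformers` and `torch`.
    """
    # Imported lazily so that defining the helper does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and generate new tokens.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) sequence back into text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Call it as `generate_text("<this-model-repo-id>", "Once upon a time")`, replacing the placeholder with the actual repository id.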

#### Speeds, Sizes, Times

<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->

[More Information Needed]

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

[More Information Needed]

#### Metrics

<!-- These are the evaluation metrics being used, ideally with a description of why. -->

[More Information Needed]

### Results

[More Information Needed]

#### Summary

## Model Examination

<!-- Relevant interpretability work for the model goes here -->

## Model Card Authors

Nils Graef

Drew Wasielewski