Text Generation
Transformers
PyTorch
llama
text-generation-inference
Inference Endpoints
metadata
language:
  - en
  - zh
  - ko
  - ja
  - fr
license: apache-2.0
datasets:
  - CaterinaLac/sharegpt-deduplicated
  - mhardalov/exams
  - Open-Orca/OpenOrca

This model is a Llama2-7B model fine-tuned on the union of ShareGPT, the exams dataset, and a subset of the Orca dataset. Fine-tuning was performed with the DeepSpeed Chat toolkit (step 1, supervised fine-tuning). The model was trained for three epochs, at which point it reached a plateau on the validation set. We used a cosine learning-rate scheduler with an initial LR of 2e-5.
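To illustrate the cosine schedule with an initial LR of 2e-5, here is a minimal sketch in plain Python. It is not the DeepSpeed Chat implementation. The function name `cosine_lr`, the optional warmup, the minimum LR of 0, and the 1000 steps per epoch are all assumptions made for illustration; only the initial LR and the three epochs come from the description above.

```python
import math


def cosine_lr(step, total_steps, base_lr=2e-5, warmup_steps=0, min_lr=0.0):
    """Learning rate at `step`: optional linear warmup, then cosine decay.

    warmup_steps and min_lr are illustrative assumptions; the model card
    only states a cosine scheduler with an initial LR of 2e-5.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from base_lr / warmup_steps up to base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Half-cosine from base_lr (progress = 0) down to min_lr (progress = 1).
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical run length: 3 epochs of 1000 optimizer steps each.
steps_per_epoch = 1000
total_steps = 3 * steps_per_epoch
schedule = [cosine_lr(s, total_steps) for s in range(total_steps + 1)]

# Print the LR at the start of each epoch and at the end of training.
for epoch in range(4):
    print(f"epoch {epoch}: lr = {schedule[epoch * steps_per_epoch]:.3e}")
```

With no warmup, the LR starts at 2e-5, passes through half that value at the midpoint of training, and reaches the minimum at the final step.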