Uploaded model

Developed by: umarigan
License: apache-2.0
Finetuned from model: Trendyol/Trendyol-LLM-7b-chat-v1.0

This Mistral model was trained 2x faster with Unsloth and Hugging Face's TRL library.

I trained it on 10k pairs from a Turkish RLHF dataset.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 16
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.1
num_epochs: 1
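
As a rough check of how these values fit together, the sketch below recomputes the total train batch size and estimates the linear schedule with warmup. It assumes exactly 10,000 training examples and a single device, neither of which is stated precisely in this card, and `linear_lr` is a hypothetical helper written for illustration that mirrors the usual linear warmup then linear decay shape, not the training code itself.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 5e-05
train_batch_size = 2
gradient_accumulation_steps = 8
warmup_ratio = 0.1
num_epochs = 1

# ASSUMPTIONS (not stated in the card): exactly 10,000 examples, one device.
num_examples = 10_000
num_devices = 1

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Optimizer steps for the whole run, and how many of them are warmup.
steps_per_epoch = num_examples // total_train_batch_size
total_steps = steps_per_epoch * num_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Hypothetical helper: learning rate at a given optimizer step."""
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return learning_rate * (step / max(1, warmup_steps))
    # Linear decay from the peak down to 0 at the final step.
    remaining = max(0, total_steps - step)
    return learning_rate * (remaining / max(1, total_steps - warmup_steps))


print(total_train_batch_size, total_steps, warmup_steps)
```

Under those assumptions the effective batch size matches the listed total_train_batch_size of 16, and the learning rate peaks at 5e-05 once warmup ends before decaying to zero.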

Training results

Framework versions

PEFT 0.9.0
Transformers 4.38.2
Pytorch 2.2.1+cu121
Datasets 2.18.0
Tokenizers 0.15.2

Model size: 7.34B params (Safetensors)
Tensor type: BF16

Model tree for umarigan/Trendyol-LLM-7b-chat-v1.0-RLHF

Base model: mistralai/Mistral-7B-v0.1
Finetuned: Trendyol/Trendyol-LLM-7b-base-v1.0
Finetuned: Trendyol/Trendyol-LLM-7b-chat-v1.0
Adapter (4): this model
