---
library_name: peft
base_model: WeOpenML/PandaLM-7B-v1
---


Model Details

    Original Model: WeOpenML/PandaLM-7B-v1
    Fine-Tuned For: Azerbaijani language understanding and generation
    Dataset Used: Azerbaijani translation of the Stanford Alpaca dataset
    Fine-Tuning Method: Self-instruct method


This model is part of the [project/Barbarossa](https://github.com/Alas-Development-Center/project-barbarossa) initiative, which aims to improve natural language processing for the Azerbaijani language. It was fine-tuned on the Azerbaijani translation of the Stanford Alpaca dataset using the self-instruct method, with the goal of improving the understanding and generation of Azerbaijani text.

__Our primary objective with this model is to offer insights into the feasibility and outcomes of fine-tuning large language models (LLMs) for the Azerbaijani language. The fine-tuning process was undertaken with limited resources, providing valuable learnings rather than creating a model ready for production use. Therefore, we recommend treating this model as a reference or a guide to understanding the potential and challenges involved in fine-tuning LLMs for specific languages. It serves as a foundational step towards further research and development rather than a direct solution for production environments.__


This project is a proud product of the [Alas Development Center (ADC)](https://az.linkedin.com/company/alas-development-center?trk=ppro_cprof). We are thrilled to offer these fine-tuned large language models to the public, free of charge.


How to use?

```
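from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_path = "alasdevcenter/az-pandalm"

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# max_length counts the prompt tokens as well as the generated ones.
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)

# Instruction: "Protection of nature"
instruction = "Təbiətin qorunması"
# Alpaca-style prompt: "Below is an instruction that provides more context.
# Write a response that adequately completes the request."
formatted_prompt = f"""Aşağıda daha çox kontekst təmin edən təlimat var. Sorğunu adekvat şəkildə tamamlayan cavab yazın.
### Təlimat:
{instruction}
### Cavab:
"""

# The output contains the prompt followed by the model's completion.
result = pipe(formatted_prompt)
print(result[0]["generated_text"])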
```
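
By default, the text-generation pipeline returns the prompt together with the completion, so it can be handy to build the prompt and separate the answer in small helpers. The following is a minimal sketch, not part of the model repository: the names `PROMPT_TEMPLATE`, `RESPONSE_MARKER`, `build_prompt`, and `extract_response` are hypothetical, and the template layout (one section per line, no indentation) is an assumption based on the prompt shown above.

```python
# Hypothetical helpers for illustration; they are not shipped with the model.
# The template text is copied from the usage example; its exact line layout
# is an assumption.
PROMPT_TEMPLATE = (
    "Aşağıda daha çox kontekst təmin edən təlimat var. "
    "Sorğunu adekvat şəkildə tamamlayan cavab yazın.\n"
    "### Təlimat:\n"
    "{instruction}\n"
    "### Cavab:\n"
)
RESPONSE_MARKER = "### Cavab:"


def build_prompt(instruction: str) -> str:
    """Insert a whitespace-trimmed instruction into the prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction.strip())


def extract_response(generated_text: str) -> str:
    """Return the text after the first response marker.

    If the marker is missing, the whole text is returned trimmed.
    """
    _, found, tail = generated_text.partition(RESPONSE_MARKER)
    return tail.strip() if found else generated_text.strip()
```

With the `pipe` object from the snippet above, this could be used as `extract_response(pipe(build_prompt("Təbiətin qorunması"))[0]["generated_text"])`. Passing `return_full_text=False` to the pipeline call is an alternative way to obtain only the completion.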