Ellight commited on
Commit
9361b08
1 Parent(s): 6969a91

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +29 -1
README.md CHANGED
@@ -23,4 +23,32 @@ This gemma model was trained 2x faster with [Unsloth](https://github.com/unsloth
23
 
24
  # Hindi-Gemma-2B-instruct (Instruction-tuned)
25
 
26
- Hindi-Gemma-2B-instruct is an instruction-tuned Hindi large language model (LLM) with 2 billion parameters, and it is based on Gemma 2B.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
23
 
24
  # Hindi-Gemma-2B-instruct (Instruction-tuned)
25
 
26
+ Hindi-Gemma-2B-instruct is an instruction-tuned Hindi large language model (LLM) with 2 billion parameters, and it is based on Gemma 2B.
27
+
28
+ # TO do inference using the LORA adapters
29
+
30
+ from unsloth import FastLanguageModel
31
+ model, tokenizer = FastLanguageModel.from_pretrained(
32
+ model_name = "Ellight/gemma-2b-bnb-4bit", # YOUR MODEL YOU USED FOR TRAINING
33
+ max_seq_length = max_seq_length,
34
+ dtype = dtype,
35
+ load_in_4bit = load_in_4bit,
36
+ )
37
+ FastLanguageModel.for_inference(model) # Enable native 2x faster inference
38
+
39
+ alpaca_prompt = """
40
+ ### Instruction:
41
+ {}
42
+
43
+ ### Response:
44
+ {}"""
45
+ inputs = tokenizer(
46
+ [
47
+ alpaca_prompt.format(
48
+ "शतरंज बोर्ड पर कितने वर्ग होते हैं?", # instruction
49
+ "", # output - leave this blank for generation!
50
+ )
51
+ ], return_tensors = "pt").to("cuda")
52
+
53
+ outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
54
+ tokenizer.batch_decode(outputs)