---
license: apache-2.0
model-index:
- name: tigerbot-7b-sft
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 41.64
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-7b-sft
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 60.56
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-7b-sft
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 29.89
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-7b-sft
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 58.18
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-7b-sft
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 63.54
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-7b-sft
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 6.29
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=TigerResearch/tigerbot-7b-sft
      name: Open LLM Leaderboard
---
A cutting-edge foundation for your very own LLM.
🌐 TigerBot • 🤗 Hugging Face
[GitHub](https://github.com/TigerResearch/TigerBot)
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from accelerate import infer_auto_device_map, dispatch_model
from accelerate.utils import get_balanced_memory

tokenizer = AutoTokenizer.from_pretrained("TigerResearch/tigerbot-7b-sft-v1")
model = AutoModelForCausalLM.from_pretrained("TigerResearch/tigerbot-7b-sft-v1")

# Spread the model over the available devices; the model is Bloom-based,
# so each BloomBlock must stay on a single device.
max_memory = get_balanced_memory(model)
device_map = infer_auto_device_map(model, max_memory=max_memory, no_split_module_classes=["BloomBlock"])
model = dispatch_model(model, device_map=device_map, offload_buffers=True)
device = torch.cuda.current_device()

# Prompt template used during supervised fine-tuning.
tok_ins = "\n\n### Instruction:\n"
tok_res = "\n\n### Response:\n"
prompt_input = tok_ins + "{instruction}" + tok_res

input_text = "What is the next number after this list: [1, 2, 3, 5, 8, 13, 21]"
input_text = prompt_input.format_map({'instruction': input_text})

max_input_length = 512
max_generate_length = 1024
generation_kwargs = {
    "do_sample": True,  # needed so that top_p / temperature actually apply
    "top_p": 0.95,
    "temperature": 0.8,
    "max_length": max_generate_length,
    "eos_token_id": tokenizer.eos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
    "early_stopping": True,
    "no_repeat_ngram_size": 4,
}

inputs = tokenizer(input_text, return_tensors='pt', truncation=True, max_length=max_input_length)
inputs = {k: v.to(device) for k, v in inputs.items()}
output = model.generate(**inputs, **generation_kwargs)

# Decode only the newly generated tokens, stopping at the end-of-sequence token.
answer = ''
for tok_id in output[0][inputs['input_ids'].shape[1]:]:
    if tok_id != tokenizer.eos_token_id:
        answer += tokenizer.decode(tok_id)
print(answer)
```
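If you do not need fine-grained control over device placement, the same prompt template can also be used with the high-level `pipeline` API. This is a minimal sketch, not the card's official recipe; it assumes a recent `transformers` release with `device_map="auto"` support and enough GPU memory to hold the 7B weights in fp16:

```python
import torch
from transformers import pipeline

# Let transformers/accelerate place the weights automatically (fp16 assumed to fit).
generator = pipeline(
    "text-generation",
    model="TigerResearch/tigerbot-7b-sft-v1",
    torch_dtype=torch.float16,
    device_map="auto",
)

# Same instruction/response template as above.
tok_ins = "\n\n### Instruction:\n"
tok_res = "\n\n### Response:\n"
prompt = tok_ins + "Write a haiku about tigers." + tok_res

out = generator(prompt, max_new_tokens=256, do_sample=True, top_p=0.95, temperature=0.8)
# The pipeline returns the prompt plus the completion; strip the prompt to show only the answer.
print(out[0]["generated_text"][len(prompt):])
```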
## Open LLM Leaderboard Evaluation Results

Detailed results can be found here.
| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 43.35 |
| AI2 Reasoning Challenge (25-Shot) | 41.64 |
| HellaSwag (10-Shot)               | 60.56 |
| MMLU (5-Shot)                     | 29.89 |
| TruthfulQA (0-shot)               | 58.18 |
| Winogrande (5-shot)               | 63.54 |
| GSM8k (5-shot)                    |  6.29 |
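These are the six Open LLM Leaderboard tasks declared in the model-index metadata above. As a rough local sanity check, a single task can be re-run with EleutherAI's lm-evaluation-harness; the sketch below is an assumption about how one might do that (harness v0.4+ `simple_evaluate` API), and it will not exactly match the leaderboard's pinned harness version or settings.

```python
# Sketch: re-running the 25-shot ARC-Challenge evaluation locally.
# Assumes `pip install lm-eval` (EleutherAI lm-evaluation-harness v0.4+);
# exact leaderboard numbers also depend on the harness commit and batch size.
from lm_eval import simple_evaluate

results = simple_evaluate(
    model="hf",
    model_args="pretrained=TigerResearch/tigerbot-7b-sft,dtype=float16",
    tasks=["arc_challenge"],
    num_fewshot=25,
    batch_size=8,
)
print(results["results"]["arc_challenge"])  # reports acc and acc_norm for the task
```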