---
license: mit
library_name: transformers
model-index:
- name: caliburn-12b
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: HuggingFaceH4/ifeval
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 35.76
name: strict accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Xclbr7/caliburn-12b
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: BBH
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 35.64
name: normalized accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Xclbr7/caliburn-12b
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: hendrycks/competition_math
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 9.67
name: exact match
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Xclbr7/caliburn-12b
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 11.52
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Xclbr7/caliburn-12b
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 13.78
name: acc_norm
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Xclbr7/caliburn-12b
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 29.72
name: accuracy
source:
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Xclbr7/caliburn-12b
name: Open LLM Leaderboard
---
# caliburn-12b
caliburn-12b is a 12-billion-parameter language model created by merging multiple existing models with [mergekit](https://github.com/arcee-ai/mergekit). It is intended for general text-generation tasks.
## Model Details
### Model Description
This is a transformer-based causal language model with 12 billion parameters, produced by merging multiple pre-existing models with mergekit rather than by training from scratch. It is suited to general text-generation tasks; a hypothetical merge configuration is sketched after the summary below.
- **Developed by:** [Xclbr7](https://huggingface.co/Xclbr7)
- **Model type:** Transformer-based language model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** multiple source models, combined via mergekit
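The merge recipe for caliburn-12b is not published. As a purely illustrative sketch, a two-model SLERP merge in mergekit looks like the following; the model names and the `t` value are placeholders, not the actual sources:

```yaml
# Hypothetical mergekit config -- the actual source models and merge
# method behind caliburn-12b are not documented.
merge_method: slerp
base_model: org-a/model-a-12b   # placeholder name
models:
  - model: org-a/model-a-12b    # placeholder name
  - model: org-b/model-b-12b    # placeholder name
parameters:
  t: 0.5                        # interpolation weight between the two models
dtype: bfloat16
```

A config like this is run with `mergekit-yaml config.yml ./merged-model`, which writes the merged weights to the output directory.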
### Model Sources
- **Repository:** [Xclbr7/caliburn-12b](https://huggingface.co/Xclbr7/caliburn-12b)
- **Paper:** N/A
- **Demo:** N/A
## [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Xclbr7__caliburn-12b).
| Metric |Value|
|-------------------|----:|
|Avg. |22.68|
|IFEval (0-Shot) |35.76|
|BBH (3-Shot) |35.64|
|MATH Lvl 5 (4-Shot)| 9.67|
|GPQA (0-shot) |11.52|
|MuSR (0-shot) |13.78|
|MMLU-PRO (5-shot) |29.72|
## Uses
### Direct Use
This model can be used for various natural language processing tasks, including:
- Text generation
- Code completion
- Question answering
- Summarization
### Downstream Use
The model can be fine-tuned for specific tasks or domains to improve performance on targeted applications.
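If you fine-tune, one common route is parameter-efficient LoRA via the `peft` library. The sketch below is an assumption-laden starting point: the `target_modules` names (`q_proj`, `v_proj`) presuppose a Mistral/Llama-style attention layout, which this card does not confirm.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the merged model; 12B weights need roughly 24 GB in fp16/bf16.
model = AutoModelForCausalLM.from_pretrained("Xclbr7/caliburn-12b")

# LoRA adapter config. target_modules is an assumption: q_proj/v_proj
# are the attention projections in Mistral/Llama-style models.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter weights train
```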
### Out-of-Scope Use
This model should not be used for generating harmful, biased, or unethical content. It should not be relied upon for critical decision-making without human oversight.
## Bias, Risks, and Limitations
- The model may inherit biases present in its training data or source models.
- It may generate incorrect or nonsensical information.
- The model's outputs should be carefully reviewed and fact-checked.
### Recommendations
Users should be aware of the model's limitations and potential biases. It's recommended to use the model with appropriate content filtering and human oversight, especially for public-facing applications.
## How to Get Started with the Model
Use the following code to get started with the model:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Xclbr7/caliburn-12b")
model = AutoModelForCausalLM.from_pretrained(
    "Xclbr7/caliburn-12b", torch_dtype=torch.float16
).to("cuda")

prompt = "Your prompt here"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")

# Generate up to 100 new tokens and decode the result.
outputs = model.generate(**inputs, max_new_tokens=100)
result = tokenizer.batch_decode(outputs, skip_special_tokens=True)
print(result[0])
```
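Alternatively, the high-level `pipeline` API wraps the same load-and-generate flow. Here `device_map="auto"` assumes the `accelerate` package is installed and that the 12B weights fit across your available GPUs.

```python
from transformers import pipeline
import torch

# One-liner alternative to the manual load/generate flow above.
pipe = pipeline(
    "text-generation",
    model="Xclbr7/caliburn-12b",
    torch_dtype=torch.float16,
    device_map="auto",  # requires accelerate; assumes the weights fit on your GPUs
)
print(pipe("Your prompt here", max_new_tokens=100)[0]["generated_text"])
```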