|
--- |
|
base_model: unsloth/mistral-7b-v0.3-bnb-4bit |
|
language: |
|
- en |
|
license: apache-2.0 |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- mistral |
|
- gguf |
|
--- |
|
|
|
# Uploaded model |
|
|
|
- **Developed by:** Deeokay |
|
- **License:** apache-2.0 |
|
- **Finetuned from model:** unsloth/mistral-7b-v0.3-bnb-4bit
|
|
|
This Mistral model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |
|
|
|
|
|
# README |
|
|
|
This is a test model based on the following:
|
- a private dataset |
|
- slight customization of the Alpaca chat template

- Works with `ollama create` but requires customization of the Modelfile

- One reason for this was that I wanted to try Q2_K quantization and see if it was actually any good(?) -> Exceeds expectations!!

- My examples are based on the unsloth.Q2_K.gguf file; however, the other quantizations should work as well
|
|
|
# HOW TO USE |
|
|
|
The whole point of the conversion for me was that I wanted to be able to use the model through Ollama (or other local options).

For Ollama, the model needs to be a GGUF file. Once you have one, it is pretty straightforward.
|
|
|
If you want to try it first, the Q2_K version of this model is available on Ollama => deeokay/minimistral
|
|
|
```bash
ollama pull deeokay/minimistral
```
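
Once pulled, you can chat with it straight from the terminal:

```bash
ollama run deeokay/minimistral
```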
|
|
|
|
|
# Quick Start: |
|
- You must already have Ollama running on your machine

- Download the unsloth.Q2_K.gguf model from the Files tab

- In the same directory, create a file called "Modelfile"

- Inside the "Modelfile", type the following
|
|
|
```
FROM ./unsloth.Q2_K.gguf

PARAMETER stop "<|STOP|>"
PARAMETER stop "<|END_RESPONSE|>"
PARAMETER temperature 0.4

TEMPLATE """<|BEGIN_QUERY|>
{{.Prompt}}
<|END_QUERY|>
<|BEGIN_RESPONSE|>
"""

SYSTEM """You are an AI assistant. Respond to the user's query between the BEGIN_QUERY and END_QUERY tokens. Use the appropriate BEGIN_ and END_ tokens for different types of content in your response."""
```
|
- Save and go back to the folder (the folder where the model + Modelfile exist)

- Now, in a terminal, make sure you are in that same folder and type the following command
|
|
|
```bash
ollama create mycustomai -f Modelfile   # "mycustomai" <- you can name it anything you want
```
|
|
|
After that, you should be able to use this model to chat!
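
If you would rather call it from Python, the official `ollama` client (`pip install ollama`) works too. A minimal sketch, assuming the `mycustomai` name from the create step above:

```python
import ollama  # pip install ollama

# The Modelfile TEMPLATE wraps the prompt in the <|BEGIN_QUERY|>/<|END_QUERY|>
# tokens, so we only send the plain question here.
reply = ollama.generate(model="mycustomai", prompt="What is the capital of France?")
print(reply["response"])
```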
|
This GGUF is based on unsloth/mistral-7b-v0.3-bnb-4bit by Unsloth.
|
|
|
|
|
# NOTE: DISCLAIMER |
|
|
|
Please note this model is not intended for production use; it is the result of fine-tuning as a self-learning exercise.

This is my fine-tuning pass using a personalized, customized dataset.

Please feel free to customize the Modelfile, and if you get a better response than mine, please share!!
|
|
|
If you would like to know how I started creating my dataset, you can check this link:
|
[Crafting GPT2 for Personalized AI-Preparing Data the Long Way (Part1)](https://medium.com/@deeokay/the-soul-in-the-machine-crafting-gpt2-for-personalized-ai-9d38be3f635f) |
|
|
|
## The training data uses the following special tokens:
|
|
|
```python |
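# Register the custom structure tokens with the tokenizer;
# note that <|STOP|> deliberately serves as both BOS and EOS.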
|
special_tokens_dict = {
    'eos_token': '<|STOP|>',
    'bos_token': '<|STOP|>',
    'pad_token': '<|PAD|>',
    'additional_special_tokens': ['<|BEGIN_QUERY|>', '<|END_QUERY|>',
                                  '<|BEGIN_ANALYSIS|>', '<|END_ANALYSIS|>',
                                  '<|BEGIN_RESPONSE|>', '<|END_RESPONSE|>',
                                  '<|BEGIN_SENTIMENT|>', '<|END_SENTIMENT|>',
                                  '<|BEGIN_CLASSIFICATION|>', '<|END_CLASSIFICATION|>']
}

tokenizer.add_special_tokens(special_tokens_dict)
model.resize_token_embeddings(len(tokenizer))

tokenizer.eos_token_id = tokenizer.convert_tokens_to_ids('<|STOP|>')
tokenizer.bos_token_id = tokenizer.convert_tokens_to_ids('<|STOP|>')
tokenizer.pad_token_id = tokenizer.convert_tokens_to_ids('<|PAD|>')
``` |
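
As a quick sanity check (my own suggestion, not part of the original training script), you can verify that each tag was registered as a single dedicated token:

```python
# Every added tag should map to its own ID rather than the unknown token
for tok in ['<|STOP|>', '<|PAD|>', '<|BEGIN_QUERY|>', '<|END_QUERY|>']:
    tok_id = tokenizer.convert_tokens_to_ids(tok)
    assert tok_id != tokenizer.unk_token_id, f"{tok} was not registered"
    print(tok, '->', tok_id)
```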
|
|
|
## The data is in the following format: |
|
|
|
```python |
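# Assemble one training example in the exact tag order the model is trained to emit.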
|
def combine_text(user_prompt, analysis, sentiment, new_response, classification):
    user_q = f"<|STOP|><|BEGIN_QUERY|>{user_prompt}<|END_QUERY|>"
    analysis = f"<|BEGIN_ANALYSIS|>{analysis}<|END_ANALYSIS|>"
    new_response = f"<|BEGIN_RESPONSE|>{new_response}<|END_RESPONSE|>"
    classification = f"<|BEGIN_CLASSIFICATION|>{classification}<|END_CLASSIFICATION|>"
    sentiment = f"<|BEGIN_SENTIMENT|>Sentiment: {sentiment}<|END_SENTIMENT|><|STOP|>"
    return user_q + analysis + new_response + classification + sentiment
|
``` |
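
For illustration, here is what one combined training example looks like (hypothetical values):

```python
sample = combine_text(
    user_prompt="What is the capital of France?",
    analysis="A simple factual geography question.",
    sentiment="Neutral",
    new_response="The capital of France is Paris.",
    classification="question_answering",
)
print(sample)
# <|STOP|><|BEGIN_QUERY|>What is the capital of France?<|END_QUERY|><|BEGIN_ANALYSIS|>...<|STOP|>
```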