---
base_model: SicariusSicariiStuff/Impish_LLAMA_3B
datasets:
- KingNish/reasoning-base-20k
- piotr25691/thea-name-overrides
language:
- en
license: llama3.2
tags:
- text-generation-inference
- transformers
- llama
- trl
- sft
- reasoning
- llama-3
---

# Model Description

An uncensored roleplay reasoning Llama 3.2 3B model trained on reasoning data.

It was trained using improved training code and delivers improved performance.

Use the following inference code. It runs two generation passes: the first produces the reasoning, and the second produces the final answer conditioned on that reasoning.

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

MAX_REASONING_TOKENS = 1024
MAX_RESPONSE_TOKENS = 512

model_name = "piotr25691/thea-rp-3b-25r"

model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Which is greater 9.9 or 9.11 ??"
messages = [
    {"role": "user", "content": prompt}
]

# Generate reasoning
reasoning_template = tokenizer.apply_chat_template(messages, tokenize=False, add_reasoning_prompt=True)
reasoning_inputs = tokenizer(reasoning_template, return_tensors="pt").to(model.device)
reasoning_ids = model.generate(**reasoning_inputs, max_new_tokens=MAX_REASONING_TOKENS)
# Slice off the prompt tokens so only the newly generated text is decoded
reasoning_output = tokenizer.decode(reasoning_ids[0, reasoning_inputs.input_ids.shape[1]:], skip_special_tokens=True)

print("REASONING: " + reasoning_output)

# Generate answer
messages.append({"role": "reasoning", "content": reasoning_output})
response_template = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response_inputs = tokenizer(response_template, return_tensors="pt").to(model.device)
response_ids = model.generate(**response_inputs, max_new_tokens=MAX_RESPONSE_TOKENS)
# Slice off the prompt tokens so only the newly generated text is decoded
response_output = tokenizer.decode(response_ids[0, response_inputs.input_ids.shape[1]:], skip_special_tokens=True)

print("ANSWER: " + response_output)
```

- **Trained by:** [Piotr Zalewski](https://huggingface.co/piotr25691)
- **License:** llama3.2
- **Finetuned from model:** [SicariusSicariiStuff/Impish_LLAMA_3B](https://huggingface.co/SicariusSicariiStuff/Impish_LLAMA_3B)
- **Dataset used:** [KingNish/reasoning-base-20k](https://huggingface.co/datasets/KingNish/reasoning-base-20k)

This Llama model was trained faster than with [Unsloth](https://github.com/unslothai/unsloth), using [custom training code](https://www.kaggle.com/code/piotr25691/distributed-llama-training-with-2xt4).

Visit https://www.kaggle.com/code/piotr25691/distributed-llama-training-with-2xt4 to find out how you can finetune your models using BOTH of the Kaggle-provided GPUs.
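The inference code in the Model Description runs two passes: one for the reasoning, then one for the answer with the reasoning appended as a `reasoning` message. As a minimal sketch of that control flow only, the helper below takes the generation step as a callback. The names `two_stage_chat`, `generate`, and `fake_generate` are hypothetical and not part of this model or of `transformers`; the fake callback stands in for the model so the sketch runs without downloading weights.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def two_stage_chat(
    messages: List[Message],
    generate: Callable[[List[Message], bool], str],
) -> Dict[str, str]:
    """Run the reasoning pass, then the answer pass.

    `generate(messages, reasoning)` is a hypothetical callback: with
    reasoning=True it should return the reasoning text, otherwise the answer.
    """
    history = list(messages)  # copy so the caller's list is not mutated
    reasoning = generate(history, True)
    # Feed the reasoning back in under the "reasoning" role, as the model card does
    history.append({"role": "reasoning", "content": reasoning})
    answer = generate(history, False)
    return {"reasoning": reasoning, "answer": answer}


def fake_generate(messages: List[Message], reasoning: bool) -> str:
    # Stand-in for the real model call (assumption, for illustration only)
    if reasoning:
        return "Compare 9.90 with 9.11."
    return f"9.9 (after {len(messages)} messages)"


user_messages = [{"role": "user", "content": "Which is greater 9.9 or 9.11 ??"}]
result = two_stage_chat(user_messages, fake_generate)
print("REASONING: " + result["reasoning"])
print("ANSWER: " + result["answer"])
```

In real use, the callback would wrap the `apply_chat_template`, `generate`, and `decode` calls from the model card, passing `add_reasoning_prompt=True` for the first pass and `add_generation_prompt=True` for the second.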