33x Coding Model
33x-coder is a Llama-based model available on Hugging Face, designed to assist with coding tasks. It specializes in understanding and generating code, and is trained on a diverse range of programming languages and coding scenarios. Whether you're debugging, seeking coding advice, or generating entire scripts, 33x-coder aims to provide relevant, syntactically correct code snippets and programming guidance, helping to reduce development time and improve code quality.
# Import the tokenizer and model classes from transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

# Initialize the tokenizer and model (the model is moved to the GPU)
tokenizer = AutoTokenizer.from_pretrained("senseable/33x-coder")
model = AutoModelForCausalLM.from_pretrained("senseable/33x-coder").cuda()

# User's request for a prime-checking function in Python
messages = [
    {'role': 'user', 'content': "Write a Python function to check if a number is prime."}
]

# Encode the messages with the chat template, append the generation prompt,
# and move the resulting tensor to the same device as the model
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a response from the model
outputs = model.generate(
    inputs,
    max_new_tokens=512,      # Maximum number of new tokens to generate
    do_sample=False,         # Greedy decoding: always pick the most likely next token
    top_k=50,                # Top-k filtering; only takes effect when do_sample=True
    top_p=0.95,              # Nucleus sampling; only takes effect when do_sample=True
    num_return_sequences=1,  # Number of sequences returned for each input in the batch
    eos_token_id=32021,      # End-of-sequence token id
)

# Decode and print only the newly generated tokens (the prompt is sliced off)
start_index = len(inputs[0])
generated_output_tokens = outputs[0][start_index:]
decoded_output = tokenizer.decode(generated_output_tokens, skip_special_tokens=True)
print("Generated Code:\n", decoded_output)
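
Chat-tuned code models often wrap their answer in a markdown code fence with explanatory prose around it. The following is a minimal sketch, assuming that response format, of how the code could be pulled out of `decoded_output` before saving or running it. The helper `extract_code_blocks` and the `sample_output` string are hypothetical and are not part of the model card or of transformers.

```python
import re

# The fence marker is built programmatically so this example nests cleanly.
FENCE = "`" * 3


def extract_code_blocks(text: str) -> list:
    """Return the bodies of all fenced code blocks found in `text`.

    If no fence is present, the whole text is returned as a single block,
    on the assumption that the model answered with bare code.
    """
    pattern = re.compile(FENCE + r"[\w+-]*[ \t]*\n(.*?)" + FENCE, re.DOTALL)
    blocks = [body.strip() for body in pattern.findall(text)]
    return blocks if blocks else [text.strip()]


# Hypothetical response text, standing in for `decoded_output`.
sample_output = (
    "Here is a function:\n"
    + FENCE + "python\n"
    + "def is_prime(n):\n"
    + "    if n < 2:\n"
    + "        return False\n"
    + "    return all(n % d for d in range(2, int(n ** 0.5) + 1))\n"
    + FENCE + "\n"
    + "It runs in O(sqrt(n)) time."
)

code = extract_code_blocks(sample_output)[0]
print(code)
```

Generated code should be reviewed before it is executed; the extraction step only separates the code from the surrounding prose.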
license: apache-2.0
Open LLM Leaderboard Evaluation Results
Detailed results are available on the Open LLM Leaderboard.
Metric | Value |
---|---|
Avg. | 29.95 |
AI2 Reasoning Challenge (25-Shot) | 26.19 |
HellaSwag (10-Shot) | 26.44 |
MMLU (5-Shot) | 24.93 |
TruthfulQA (0-shot) | 51.14 |
Winogrande (5-shot) | 50.99 |
GSM8k (5-shot) | 0.00 |
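
The "Avg." row is consistent with an unweighted arithmetic mean of the six benchmark scores. The short check below reproduces it from the values in the table; treating the average as an unweighted mean is an assumption inferred from the numbers, not something the table states.

```python
# Scores copied from the table above (percentages).
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 26.19,
    "HellaSwag (10-Shot)": 26.44,
    "MMLU (5-Shot)": 24.93,
    "TruthfulQA (0-shot)": 51.14,
    "Winogrande (5-shot)": 50.99,
    "GSM8k (5-shot)": 0.00,
}

# Assumption: every benchmark carries equal weight in the average.
average = sum(scores.values()) / len(scores)
print(f"Avg. = {average:.2f}")  # prints "Avg. = 29.95"
```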
Evaluation results
Benchmark | Metric | Split | Source | Value |
---|---|---|---|---|
AI2 Reasoning Challenge (25-Shot) | normalized accuracy | test | Open LLM Leaderboard | 26.190 |
HellaSwag (10-Shot) | normalized accuracy | validation | Open LLM Leaderboard | 26.440 |
MMLU (5-Shot) | accuracy | test | Open LLM Leaderboard | 24.930 |
TruthfulQA (0-shot) | mc2 | validation | Open LLM Leaderboard | 51.140 |
Winogrande (5-shot) | accuracy | validation | Open LLM Leaderboard | 50.990 |
GSM8k (5-shot) | accuracy | test | Open LLM Leaderboard | 0.000 |