Model Card for Luxeai-anu-1-bit-70M
Model Name
Luxeai-anu-1-bit-70M
Model Description
Luxeai-anu-1-bit-70M is my first attempt at implementing a 1-bit Large Language Model (LLM), based on the paper "The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits". I used the pre-trained Mistral-7B-v0.3 and the abideen/Cosmopedia-100k-pretrain dataset. This initial model was trained on a Microsoft Azure Standard_NC6s_v3 instance (6 cores, 112 GB RAM, 736 GB storage, 1 x NVIDIA Tesla V100). I plan to train on a much larger dataset once I obtain sponsorship for an 8x DGX system. So far I have tested the model on a subset of the same dataset.
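For readers new to the cited paper, the core idea of its "1.58-bit" weights can be sketched as follows. This is a minimal pure-Python illustration of absmean ternary quantization as described in that paper, not the training code used for this model. The function name `absmean_ternarize`, the `eps` value, and the sample weights are assumptions made for the example.

```python
def absmean_ternarize(weights, eps=1e-5):
    """Map a 2-D list of float weights to values in {-1, 0, +1}.

    Each weight is divided by the mean absolute value of the whole
    matrix, rounded to the nearest integer, and clipped to [-1, 1].
    """
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat) + eps  # mean |w|, plus eps to avoid division by zero
    quantized = [
        [max(-1, min(1, round(w / scale))) for w in row]
        for row in weights
    ]
    return quantized, scale


# Hypothetical sample weights, chosen only to show all three output values.
weights = [[0.9, -0.05, -1.4], [0.2, 0.6, -0.7]]
quantized, scale = absmean_ternarize(weights)
print(quantized)  # prints [[1, 0, -1], [0, 1, -1]]
```

Because each weight takes one of three values, it carries log2(3), roughly 1.58 bits of information, which is where the figure in the paper's title comes from.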
Intended Use
- Task: text generation
How to Use
Run the code below in a Python Jupyter notebook to load and test the model:
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from transformers.models.llama.modeling_llama import *
# Load the tokenizer and model from the Hugging Face Hub
model_id = "arunb74/Luxeai-anu-1-bit-70M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# Create a text generation pipeline
pipe = pipeline(
"text-generation",
model=model,
tokenizer=tokenizer,
device_map="auto"
)
prompt = "The LISA Pathfinder scientific collaboration will meet in Trento"
sequences = pipe(
f"<s>[INST] {prompt} [/INST]",
do_sample=True,
max_new_tokens=100,
temperature=0.7,
top_k=50,
top_p=0.95,
num_return_sequences=1,
)
print(sequences[0]['generated_text'])
"""
Sample output (do_sample=True, so your results will vary):
<s>[INST] The LISA Pathfinder scientific collaboration will meet in Trento [/INST]
The LISA Pathfinder Biology, a leading provider of biochemistry and molecular biology, provides a comprehensive understanding of the mechanisms and mechanisms of the LISA pathways. The LISA Pathfinder Biology, a researcher specializing in molecular biology, is a clinical trial of the disease, and its pathophysiology, and a combination of the most commonly used and widely used treatments. It is a relatively simple procedure that involves two steps.
# I would welcome help from community members: feedback, suggestions for suitable datasets for further training, and help with testing and evaluation.
"""
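The sampling arguments in the call above (`temperature`, `top_k`, `top_p`) narrow the set of tokens the model may pick at each step. The sketch below shows roughly how such filtering works on a toy list of logits. It is a simplified pure-Python illustration, not the transformers implementation, and the function name `filter_probs` and the toy logits are assumptions made for the example.

```python
import math


def filter_probs(logits, temperature=0.7, top_k=50, top_p=0.95):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]  # subtract the max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most likely tokens, then renormalize.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in order)

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break

    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}


# Toy logits for a four-token vocabulary (hypothetical values).
result = filter_probs([2.0, 1.0, 0.0, -1.0], temperature=1.0, top_k=3, top_p=0.9)
print(result)
```

With these toy values only the two most likely tokens survive, and the next token would be drawn from the renormalized probabilities in `result`.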