OpenAssistant/stablelm-7b-sft-v7-epoch-3 · The model weights are not tied. Please use the `tie_weights` method before using the `infer_auto_device` function.

I have the same issue (not in a Jupyter notebook in my case). I have followed the advice I found in forums and YouTube videos, but with no success yet :(

I am sharing my test code, launched from the command prompt (Windows 11 Enterprise with 16GB RAM), along with the error details, in case anyone can help.

Code:

import langchain
from langchain import HuggingFacePipeline
from langchain import PromptTemplate
from langchain import LLMChain
from langchain.document_loaders import OnlinePDFLoader
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "OpenAssistant/stablelm-7b-sft-v7-epoch-3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", offload_folder="offload", torch_dtype=torch.float16
)
model.tie_weights()

llm = HuggingFacePipeline.from_model_id(
    model_id=model_name,
    task="text-generation",
    model_kwargs={"temperature": 0.0, "max_length": 2048, "device_map": "auto"},
)

loader = OnlinePDFLoader("https://arxiv.org/pdf/1911.01547.pdf")
document = loader.load()

template = """<|prompter|>{question}<|endoftext|><|assistant|>"""
prompt = PromptTemplate(template=template, input_variables=["question"])
llm_chain = LLMChain(prompt=prompt, llm=llm)
question = "What is the meaning of life?"
llm_chain.run(question)

Output:
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:50<00:00, 5.66s/it]
The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.

20 │
│ 21 │
│ 22 │
│ ❱ 23 llm = HuggingFacePipeline.from_model_id(model_id = model_name, │
│ 24 │ │ │ │ │ │ │ │ │ │ │ │ task = "text-generation", model_kwargs │
│ 25 │ │ │ │ │ │ │ │ │ │ │ │ "temperature" : 0.0, "max_length" : 204 │
│ 26 │
│ │
│ c:\Program Files\Python38\lib\site-packages\langchain\llms\huggingface_pipeline.py:92 in │
│ from_model_id │
│ │
│ 89 │ │ │
│ 90 │ │ try: │
│ 91 │ │ │ if task == "text-generation": │
│ ❱ 92 │ │ │ │ model = AutoModelForCausalLM.from_pretrained(model_id, **_model_kwargs) │
│ 93 │ │ │ elif task in ("text2text-generation", "summarization"): │
│ 94 │ │ │ │ model = AutoModelForSeq2SeqLM.from_pretrained(model_id, **_model_kwargs) │
│ 95 │ │ │ else: │
│ │
│ c:\Program Files\Python38\lib\site-packages\transformers\models\auto\auto_factory.py:467 in │
│ from_pretrained │
│ │
│ 464 │ │ │ ) │
│ 465 │ │ elif type(config) in cls._model_mapping.keys(): │
│ 466 │ │ │ model_class = _get_model_class(config, cls._model_mapping) │
│ ❱ 467 │ │ │ return model_class.from_pretrained( │
│ 468 │ │ │ │ pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, │
│ 469 │ │ │ ) │
│ 470 │ │ raise ValueError( │
│ │
│ c:\Program Files\Python38\lib\site-packages\transformers\modeling_utils.py:2777 in │
│ from_pretrained │
│ │
│ 2774 │ │ │ │ mismatched_keys, │
│ 2775 │ │ │ │ offload_index, │
│ 2776 │ │ │ │ error_msgs, │
│ ❱ 2777 │ │ │ ) = cls._load_pretrained_model( │
│ 2778 │ │ │ │ model, │
│ 2779 │ │ │ │ state_dict, │
│ 2780 │ │ │ │ loaded_state_dict_keys, # XXX: rename? │
│ │
│ c:\Program Files\Python38\lib\site-packages\transformers\modeling_utils.py:2871 in │
│ _load_pretrained_model │
│ │
│ 2868 │ │ │ ) │
│ 2869 │ │ │ is_safetensors = archive_file.endswith(".safetensors") │
│ 2870 │ │ │ if offload_folder is None and not is_safetensors: │
│ ❱ 2871 │ │ │ │ raise ValueError( │
│ 2872 │ │ │ │ │ "The current device_map had weights offloaded to the disk. Please │
│ 2873 │ │ │ │ │ " for them. Alternatively, make sure you have safetensors installe │
│ 2874 │ │ │ │ │ " offers the weights in this format." │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: The current device_map had weights offloaded to the disk. Please provide an offload_folder for them. Alternatively, make sure you have safetensors installed if the model you are using offers the weights in this format.
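
Reading the traceback again, the ValueError is raised inside HuggingFacePipeline.from_model_id, which loads the model a second time using only model_kwargs, so the offload_folder given to the first from_pretrained call never reaches it. Below is a minimal, untested sketch of two things that might be worth trying, assuming the langchain and transformers versions shown in the traceback. The helper names (build_model_kwargs, build_llm, wrap_loaded_model) are made up for illustration and are not part of either library.

```python
import os


def build_model_kwargs(offload_folder="offload"):
    """Same kwargs as in the script, plus the offload_folder the error asks for."""
    return {
        "temperature": 0.0,
        "max_length": 2048,
        "device_map": "auto",
        # Assumption: from_model_id forwards this to from_pretrained,
        # as the traceback (huggingface_pipeline.py:92) suggests.
        "offload_folder": offload_folder,
    }


def build_llm(model_name="OpenAssistant/stablelm-7b-sft-v7-epoch-3"):
    """Option 1: let langchain load the model, but with an offload folder."""
    # Imported lazily so the helper above works without the heavy dependencies.
    from langchain import HuggingFacePipeline

    kwargs = build_model_kwargs()
    os.makedirs(kwargs["offload_folder"], exist_ok=True)
    return HuggingFacePipeline.from_model_id(
        model_id=model_name, task="text-generation", model_kwargs=kwargs
    )


def wrap_loaded_model(model, tokenizer):
    """Option 2: reuse an already loaded model instead of loading it twice."""
    from langchain import HuggingFacePipeline
    from transformers import pipeline

    pipe = pipeline(
        "text-generation", model=model, tokenizer=tokenizer, max_length=2048
    )
    return HuggingFacePipeline(pipeline=pipe)
```

With option 2, the model and tokenizer objects created at the top of the script would be passed in directly, so the 9 checkpoint shards would only need to be loaded once. Whether 16GB of RAM is enough either way is a separate question.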

Thanks a lot in advance!!
