ValueError: the following `model_kwargs` are not used by the model
I am running the code with the transformers repo that was recommended in the llama model repos:
git clone https://github.com/huggingface/transformers.git
cd transformers
git checkout d04ec99bec8a0b432fc03ed60cea9a1a20ebaf3c
pip install .
However, I get an error when trying to run:
from transformers import AutoTokenizer, AutoModelForCausalLM, TextStreamer
tokenizer = AutoTokenizer.from_pretrained("falcon-40b-sft-mix-1226", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("falcon-40b-sft-mix-1226", device_map="sequential", offload_folder="offload", load_in_8bit=True, trust_remote_code=True)
streamer = TextStreamer(tokenizer, skip_prompt=True)
message = "<|prompter|>This is a demo of a text streamer. What's a cool fact about ducks?<|endoftext|><|assistant|>"
inputs = tokenizer(message, return_tensors="pt").to(model.device)
tokens = model.generate(**inputs, max_new_tokens=25, do_sample=True, temperature=0.9, streamer=streamer)
It returns this error:
dev_1/lib/python3.10/site-packages/transformers/generation/utils.py:1250: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
warnings.warn(
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ <stdin>:1 in <module>
│
│ dev_1/lib/python3.10/site-packages/torch/utils/_contextlib.py:115 in decorate_context
│
│   112 │   @functools.wraps(func)
│   113 │   def decorate_context(*args, **kwargs):
│   114 │   │   with ctx_factory():
│ ❱ 115 │   │   │   return func(*args, **kwargs)
│   116 │
│   117 │   return decorate_context
│   118
│
│ dev_1/lib/python3.10/site-packages/transformers/generation/utils.py:1262 in generate
│
│   1259 │   │   generation_config = copy.deepcopy(generation_config)
│   1260 │   │   model_kwargs = generation_config.update(**kwargs)  # All unused kw
│   1261 │   │   generation_config.validate()
│ ❱ 1262 │   │   self._validate_model_kwargs(model_kwargs.copy())
│   1263 │   │
│   1264 │   │   # 2. Set generation parameters if not already defined
│   1265 │   │   logits_processor = logits_processor if logits_processor is not Non
│
│ dev_1/lib/python3.10/site-packages/transformers/generation/utils.py:1135 in _validate_model_kwargs
│
│   1132 │   │   │   │   unused_model_args.append(key)
│   1133 │   │
│   1134 │   │   if unused_model_args:
│ ❱ 1135 │   │   │   raise ValueError(
│   1136 │   │   │   │   f"The following `model_kwargs` are not used by the model:
│   1137 │   │   │   │   " generate arguments will also show up in this list)"
│   1138 │   │   │   )
╰───────────────────────────────────────────────────────────────────────────────────────────────────╯
ValueError: The following `model_kwargs` are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)
Lol, ok, the quick fix:
open transformers/generation/utils.py
and comment out the if statement on lines 1134-1138, like so:
# if unused_model_args:
# raise ValueError(
# f"The following `model_kwargs` are not used by the model: {unused_model_args} (note: typos in the"
# " generate arguments will also show up in this list)"
# )
If it looks stupid but it works...
New output (truncated as expected for max_new_tokens=25):
dev_1/lib/python3.10/site-packages/transformers/generation/utils.py:1250: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
warnings.warn(
Setting `pad_token_id` to `eos_token_id`:11 for open-end generation.
Ducks have a waterproof coating on their feathers, which allows them to swim and preen themselves in water.
Their web
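If you'd rather not edit the installed package, you can get the same effect at call time by dropping the offending key from the tokenizer output before generate sees it. A minimal sketch, reusing the tokenizer, model, message, and streamer objects defined above (this is my own variant, not from the original report):

inputs = tokenizer(message, return_tensors="pt").to(model.device)
inputs.pop("token_type_ids", None)  # drop the key the model's forward() doesn't accept
tokens = model.generate(**inputs, max_new_tokens=25, do_sample=True, temperature=0.9, streamer=streamer)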
Update: Hugging Face indicated they will not fix this in the transformers repo, so this workaround is now a mandatory step.
Simpler solution: replace **inputs
with input_ids=inputs['input_ids'], attention_mask=inputs['attention_mask'].
The extra token_type_ids arg is returned by the tokenizer.
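Spelled out against the original snippet, the generate call then becomes (same variables as above):

inputs = tokenizer(message, return_tensors="pt").to(model.device)
tokens = model.generate(input_ids=inputs['input_ids'], attention_mask=inputs['attention_mask'], max_new_tokens=25, do_sample=True, temperature=0.9, streamer=streamer)

You should also be able to stop the tokenizer from emitting the extra key in the first place by passing return_token_type_ids=False to the tokenizer call, though I haven't verified that against this exact checkpoint.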
yooo, that's way easier, thanks :D