Support device_map="auto" when loading
Without _no_split_modules, loading fails when setting device_map="auto":
https://github.com/huggingface/transformers/blob/118e9810687dd713b6be07af79e80eeb1d916908/src/transformers/modeling_utils.py#L2684-L2685
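For reference, the substance of such a change is just declaring which module class must not be split across devices. A rough sketch follows; the MPT class name is my assumption about the remote code, and only the _no_split_modules attribute itself is what the linked check looks for:

from transformers import PreTrainedModel

class MPTPreTrainedModel(PreTrainedModel):
    # Without this attribute, from_pretrained(..., device_map="auto") raises
    # "MPTForCausalLM does not support `device_map='auto'` yet."
    _no_split_modules = ["MPTBlock"]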
Just to clarify, does the model support device_map="auto" and if so what's the syntax?
Thank you for the PR @shijie-wu .
Based on my understanding, this should work, however, when I run this, I get an error:
NameError Traceback (most recent call last)
Cell In[5], line 1
----> 1 model = transformers.AutoModelForCausalLM.from_pretrained("mosaicml/mpt-7b", trust_remote_code=True, revision="refs/pr/23", device_map='auto')
File /usr/lib/python3/dist-packages/transformers/models/auto/auto_factory.py:462, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
458 class_ref = config.auto_map[cls.__name__]
459 model_class = get_class_from_dynamic_module(
460 class_ref, pretrained_model_name_or_path, **hub_kwargs, **kwargs
461 )
--> 462 return model_class.from_pretrained(
463 pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
464 )
465 elif type(config) in cls._model_mapping.keys():
466 model_class = _get_model_class(config, cls._model_mapping)
File /usr/lib/python3/dist-packages/transformers/modeling_utils.py:2608, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
2606 init_contexts = [deepspeed.zero.Init(config_dict_or_path=deepspeed_config())] + init_contexts
2607 elif load_in_8bit or low_cpu_mem_usage:
-> 2608 init_contexts.append(init_empty_weights())
2610 with ContextManagers(init_contexts):
2611 model = cls(config, *model_args, **model_kwargs)
NameError: name 'init_empty_weights' is not defined
Did you test this?
@sam-mosaic You need to install accelerate (and reset your notebook)
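For context, init_empty_weights in that traceback lives in accelerate, and transformers only imports it when the package is installed, hence the NameError. A quick sanity check:

# pip install accelerate  (then restart the kernel so the import is picked up)
import accelerate
print(accelerate.__version__)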
Thank you @Forbu14
Does anyone know if passing refs/pr/23 as the revision works as one would expect? Because when doing that, I now get:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[5], line 1
----> 1 model = transformers.AutoModelForCausalLM.from_pretrained("mosaicml/mpt-7b", trust_remote_code=True, revision="refs/pr/23", device_map="auto")
File /usr/lib/python3/dist-packages/transformers/models/auto/auto_factory.py:462, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
458 class_ref = config.auto_map[cls.__name__]
459 model_class = get_class_from_dynamic_module(
460 class_ref, pretrained_model_name_or_path, **hub_kwargs, **kwargs
461 )
--> 462 return model_class.from_pretrained(
463 pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
464 )
465 elif type(config) in cls._model_mapping.keys():
466 model_class = _get_model_class(config, cls._model_mapping)
File /usr/lib/python3/dist-packages/transformers/modeling_utils.py:2685, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
2676 special_dtypes.update(
2677 {
2678 name: torch.float32
(...)
2681 }
2682 )
2684 if model._no_split_modules is None:
-> 2685 raise ValueError(f"{model.__class__.__name__} does not support `device_map='{device_map}'` yet.")
2686 no_split_modules = model._no_split_modules
2687 if device_map not in ["auto", "balanced", "balanced_low_0", "sequential"]:
ValueError: MPTForCausalLM does not support `device_map='auto'` yet.
Would love to support device_map="auto", but need some evidence that this PR accomplishes that.
I have the same error as you @sam-mosaic; it feels like the revision doesn't work (or we did something wrong). We could try passing the hash instead, d8a52ba8?
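Something like this, just swapping the revision for the commit hash:

import transformers

model = transformers.AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    trust_remote_code=True,
    revision="d8a52ba8",  # commit hash instead of refs/pr/23
    device_map="auto",
)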
@sam-mosaic Basically we download the right config.json file but the wrong *.py files :(
@Forbu14 confirmed that regardless of whether I pass revision="refs/pr/23" or revision="d8a52ba8", I do not get the version of modeling_mpt.py from this PR. That seems like a bug in transformers to me.
@sam-mosaic Clearly. I am trying to figure out what's wrong in the transformers code base right now.
I raised the issue on the transformers GitHub: https://github.com/huggingface/transformers/issues/23745
@sam-mosaic Apparently the "revision" param is currently supported only for weights, not for code.
Isn't the whole point of passing a revision to protect yourself from malicious code when using trust_remote_code=True? It even warns you to use a revision!
Looks like there is a version of mpt-7b that fixes this:
https://huggingface.co/cekal/mpt-7b-peft-compatible
Also to test a PR locally you can do this:
git clone https://huggingface.co/mosaicml/mpt-7b
pushd mpt-7b
git fetch origin refs/pr/23:pr/23
git checkout pr/23
popd
python your_script.py \
--model_name_or_path "./mpt-7b"
...
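Alternatively, if you just want to load it interactively, you can point from_pretrained at the local checkout. A sketch of what that script would do; whether device_map="auto" then works is exactly what this PR should enable:

import transformers

# Load both the remote-code .py files and the weights from the local clone
# checked out on the PR branch.
model = transformers.AutoModelForCausalLM.from_pretrained(
    "./mpt-7b",
    trust_remote_code=True,
    device_map="auto",
)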
Works when using load_checkpoint_and_dispatch. Doesn't work with from_pretrained.
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer, TextGenerationPipeline
import torch
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
model_dir = './mpt-7b-instruct'
max_memory_mapping = {0: "16GB", 1: "16GB"}
config = AutoConfig.from_pretrained(model_dir, trust_remote_code=True)

# Build the model skeleton without allocating real weights, then load and
# shard the checkpoint across the GPUs listed in max_memory_mapping.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)
model.tie_weights()

model = load_checkpoint_and_dispatch(
    model,
    model_dir,
    device_map="auto",
    no_split_module_classes=["MPTBlock"],
    max_memory=max_memory_mapping,
)
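And for completeness, generation with the dispatched model is just the usual generate call, assuming the tokenizer files are present in the same directory:

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)

# Inputs go to GPU 0, where the first layers live under device_map="auto".
inputs = tokenizer("What is the capital of France?", return_tensors="pt").to(0)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))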
Hi kdua, this is the error I am getting:
ValueError: checkpoint should be the path to a file containing a whole state dict, or the index of a sharded checkpoint, or a folder containing a sharded checkpoint, but got mosaicml/mpt-7b-instruct.
How do I solve this?
Hi @thechashi, you need to clone the model repository, check out the relevant code change, pull the model files, and then load the checkpoint from that directory:
git clone https://huggingface.co/mosaicml/mpt-7b-instruct
cd mpt-7b-instruct
git lfs pull
git fetch origin refs/pr/23:pr/23
git checkout pr/23
Now use this cloned directory as model_dir in the snippet above.
This should be supported now! We are doing some more tests to make sure multi-GPU inference works as well and should update soon.
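In other words, the straightforward call should now work directly from the Hub (max_memory shown only as an example for multi-GPU setups; adjust or drop it):

import transformers

# Now that the remote code declares _no_split_modules, device_map="auto"
# shards the model across the available GPUs (and CPU if needed).
model = transformers.AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    trust_remote_code=True,
    device_map="auto",
    max_memory={0: "16GB", 1: "16GB"},  # optional per-device caps
)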
Sorry, I was out of the loop, but I'm glad that it's fixed by https://huggingface.co/mosaicml/mpt-7b/discussions/45.
@abhi-mosaic thank you for the support! Have you already made progress on improving multi-GPU inference? Currently, the prompt provided in the model card ("What is the capital of France?") takes 20 minutes with device_map="auto", and generation continues past the 'end_of_text' token.
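For the runaway generation part, one thing I still plan to try is passing the EOS id explicitly to generate (not verified on MPT yet):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mosaicml/mpt-7b", trust_remote_code=True)

inputs = tokenizer("What is the capital of France?", return_tensors="pt").to(0)
# model is the device_map="auto" model loaded as above.
output_ids = model.generate(
    **inputs,
    max_new_tokens=64,
    eos_token_id=tokenizer.eos_token_id,  # stop at end-of-text
    pad_token_id=tokenizer.eos_token_id,  # silence the missing-pad warning
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))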