Support device_map="auto" when loading

#23
by shijie-wu - opened

Just to clarify, does the model support device_map="auto" and if so what's the syntax?
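If it is supported, I'd expect the usual accelerate-backed call, something like this (a sketch only; it assumes accelerate is installed):

import transformers

# Sketch of the expected call; requires `pip install accelerate`.
# device_map="auto" lets accelerate place the weights across the
# available GPUs (spilling to CPU if needed).
model = transformers.AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    trust_remote_code=True,
    device_map="auto",
)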

Thank you for the PR, @shijie-wu.

Based on my understanding, this should work, however, when I run this, I get an error:

NameError                                 Traceback (most recent call last)
Cell In[5], line 1
----> 1 model = transformers.AutoModelForCausalLM.from_pretrained("mosaicml/mpt-7b", trust_remote_code=True, revision="refs/pr/23", device_map='auto')

File /usr/lib/python3/dist-packages/transformers/models/auto/auto_factory.py:462, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    458     class_ref = config.auto_map[cls.__name__]
    459     model_class = get_class_from_dynamic_module(
    460         class_ref, pretrained_model_name_or_path, **hub_kwargs, **kwargs
    461     )
--> 462     return model_class.from_pretrained(
    463         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    464     )
    465 elif type(config) in cls._model_mapping.keys():
    466     model_class = _get_model_class(config, cls._model_mapping)

File /usr/lib/python3/dist-packages/transformers/modeling_utils.py:2608, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2606     init_contexts = [deepspeed.zero.Init(config_dict_or_path=deepspeed_config())] + init_contexts
   2607 elif load_in_8bit or low_cpu_mem_usage:
-> 2608     init_contexts.append(init_empty_weights())
   2610 with ContextManagers(init_contexts):
   2611     model = cls(config, *model_args, **model_kwargs)

NameError: name 'init_empty_weights' is not defined

Did you test this?

@sam-mosaic You need to install accelerate (and reset your notebook)
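A quick way to confirm the environment is ready (a sketch, assuming a notebook):

# pip install accelerate
# Then restart the notebook kernel so the newly installed package is picked up.
import accelerate
print(accelerate.__version__)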

Thank you @Forbu14

Does anyone know if passing refs/pr/23 as the revision works as one would expect? Because when doing that, I now get:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[5], line 1
----> 1 model = transformers.AutoModelForCausalLM.from_pretrained("mosaicml/mpt-7b", trust_remote_code=True, revision="refs/pr/23", device_map="auto")

File /usr/lib/python3/dist-packages/transformers/models/auto/auto_factory.py:462, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    458     class_ref = config.auto_map[cls.__name__]
    459     model_class = get_class_from_dynamic_module(
    460         class_ref, pretrained_model_name_or_path, **hub_kwargs, **kwargs
    461     )
--> 462     return model_class.from_pretrained(
    463         pretrained_model_name_or_path, *model_args, config=config, **hub_kwargs, **kwargs
    464     )
    465 elif type(config) in cls._model_mapping.keys():
    466     model_class = _get_model_class(config, cls._model_mapping)

File /usr/lib/python3/dist-packages/transformers/modeling_utils.py:2685, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2676 special_dtypes.update(
   2677     {
   2678         name: torch.float32
   (...)
   2681     }
   2682 )
   2684 if model._no_split_modules is None:
-> 2685     raise ValueError(f"{model.__class__.__name__} does not support `device_map='{device_map}'` yet.")
   2686 no_split_modules = model._no_split_modules
   2687 if device_map not in ["auto", "balanced", "balanced_low_0", "sequential"]:

ValueError: MPTForCausalLM does not support `device_map='auto'` yet.

Would love to support device_map="auto", but I need some evidence that this PR accomplishes that.
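For context, the ValueError above is raised because `_no_split_modules` is left as None on the model class, so a PR adding device_map support would presumably need to declare it, roughly along these lines (a hypothetical fragment, not the actual diff; the base class name is assumed):

# In modeling_mpt.py (hypothetical fragment, not the actual diff):
class MPTForCausalLM(MPTPreTrainedModel):  # base class name assumed
    # Tell accelerate to keep each MPTBlock on a single device
    # when sharding with device_map="auto".
    _no_split_modules = ["MPTBlock"]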

I have the same error as you @sam-mosaic; it feels like the revision doesn't work (or we did something wrong).
We could try passing the commit hash d8a52ba8 instead?
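That would look like this (sketch):

import transformers

# Pin the download to a specific commit instead of the PR ref.
model = transformers.AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    trust_remote_code=True,
    revision="d8a52ba8",
    device_map="auto",
)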

OK, I did some checking and it seems that we just download the wrong version of the files here:

[Screenshots of the downloaded files]

@sam-mosaic Basically, we download the right config.json file but the wrong *.py files :(

@Forbu14 Confirmed: regardless of whether I pass revision="refs/pr/23" or revision="d8a52ba8", I do not get the version of modeling_mpt.py from this PR. That seems like a bug in transformers to me.

@sam-mosaic Clearly. I am trying to figure out what's wrong in the transformers code base right now.

I raised the issue on the transformers GitHub: https://github.com/huggingface/transformers/issues/23745

@sam-mosaic Apparently the "revision" param is currently only applied to the weights, not to the remote code.
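One way to check which version of the code is on the PR branch, independently of what transformers caches, is to pull the file with huggingface_hub directly (a sketch):

from huggingface_hub import hf_hub_download

# Download modeling_mpt.py as it exists on the PR branch and print
# its local path so it can be diffed against the cached copy.
path = hf_hub_download(
    repo_id="mosaicml/mpt-7b",
    filename="modeling_mpt.py",
    revision="refs/pr/23",
)
print(path)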

Isn't the whole point of passing a revision to protect yourself from malicious code when using trust_remote_code=True? It even warns you to use a revision!!

Looks like there is a version of mpt-7b that fixes this:

https://huggingface.co/cekal/mpt-7b-peft-compatible

Also to test a PR locally you can do this:

git clone https://huggingface.co/mosaicml/mpt-7b
pushd mpt-7b
git fetch origin refs/pr/23:pr/23
git checkout pr/23
popd

python your_script.py \
    --model_name_or_path "./mpt-7b"
...
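Or, when loading directly in Python rather than through a script, point from_pretrained at the local checkout (sketch):

import transformers

# Loading from the local clone uses the PR's code files, sidestepping
# the revision handling for remote code discussed above.
model = transformers.AutoModelForCausalLM.from_pretrained(
    "./mpt-7b",
    trust_remote_code=True,
    device_map="auto",
)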

Works when using load_checkpoint_and_dispatch. Doesn't work with from_pretrained.

from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer, TextGenerationPipeline
import torch
from accelerate import init_empty_weights, load_checkpoint_and_dispatch

model_dir = './mpt-7b-instruct'

# Cap how much of each GPU the dispatcher may use.
max_memory_mapping = {0: "16GB", 1: "16GB"}

config = AutoConfig.from_pretrained(
    model_dir,
    trust_remote_code=True,
)

# Build the model skeleton without allocating real weights.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)

model.tie_weights()

# Load the checkpoint shards and place them across the GPUs,
# keeping each MPTBlock on a single device.
model = load_checkpoint_and_dispatch(
    model, model_dir, device_map="auto",
    no_split_module_classes=["MPTBlock"], max_memory=max_memory_mapping
)
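A quick generation check on the dispatched model could then look like this (a sketch; the prompt, device placement, and generation settings are my assumptions):

from transformers import AutoTokenizer
import torch

# Load the tokenizer shipped with the checked-out model directory.
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)

# Inputs go to GPU 0, where the dispatcher places the first layers.
inputs = tokenizer("What is the capital of France?", return_tensors="pt").to(0)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))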

Hey kdua,

This is the error I am getting:

ValueError: checkpoint should be the path to a file containing a whole state dict, or the index of a sharded
checkpoint, or a folder containing a sharded checkpoint, but got mosaicml/mpt-7b-instruct.

How do I solve this?

Hi @thechashi
You need to clone the model repository, check out the relevant code change, pull the model files, and then load the checkpoint from that directory:

git clone https://huggingface.co/mosaicml/mpt-7b-instruct
cd mpt-7b-instruct
git lfs pull
git fetch origin refs/pr/23:pr/23
git checkout pr/23

Now use this cloned directory in model_dir

This should be supported now! We are doing some more tests to make sure multi-GPU inference works as well and should update soon.

abhi-mosaic changed pull request status to closed

Sorry, I was out of the loop, but I'm glad that it's fixed by https://huggingface.co/mosaicml/mpt-7b/discussions/45.

@abhi-mosaic Thank you for the support! Have you already made progress on improving multi-GPU inference? Currently, the prompt provided in the model card ("What is the capital of France?") takes 20 minutes with device_map="auto" and continues to generate tokens after 'end_of_text'.
