mradermacher/model_requests · Llama 3.1 models

joaquinito2073

Jul 23

Llama 3.1 models:
https://huggingface.co/meta-llama/Meta-Llama-3.1-405B-Instruct
https://huggingface.co/meta-llama/Meta-Llama-3.1-405B
https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct
https://huggingface.co/meta-llama/Meta-Llama-3.1-70B
https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct
https://huggingface.co/meta-llama/Meta-Llama-3.1-8B

mradermacher

Owner Jul 23

Unfortunately, they are gated, and I cannot access them (they require a facebook account last I tried). If somebody makes an accessible copy, I am, of course, all game.

mradermacher

Owner Jul 23

@nicoboss surely you have a facebook account and are willing to clone/backup them? :)

nicoboss

Jul 23

@Guilherme34 was already granted access and is currently downloading them all into an LXC container running on the same machine as your LXC container. He will start with 8B, then 70B, and finally 405B. As soon as the first models are downloaded, I will mount the folder containing the models read-only into your LXC container.

Green-Sky

Jul 23

Last time I checked, you don't actually need an FB account; you "just" have to fill in a form with your personal info. It's even integrated into HF.

nicoboss

Jul 23

•

edited Jul 23

@mradermacher The 8B models and the 70B base model are already downloaded and mounted to /Guilherme34/root/.cache/huggingface/hub inside your LXC container. For models as important as these, please ignore any daytime restrictions and quantize them as soon as possible.

The following models are already downloaded:

Meta-Llama-3.1-8B
Meta-Llama-3.1-8B-Instruct
Meta-Llama-3.1-70B

Others are still in progress.

nicoboss

Jul 23

The download of Meta-Llama-3.1-70B-Instruct is now completed as well.
@mradermacher Sorry, we initially forgot to download the tokenizer, but it has now been added to all the 8B and 70B models.

In case you are confused about the Hugging Face cache structure, the normal safetensors models are located here:

Meta-Llama-3.1-8B: /Guilherme34/root/.cache/huggingface/hub/models--meta-llama--Meta-Llama-3.1-8B/snapshots/13f04ed6f85ef2aa2fd11b960a275c3e31a8069e
Meta-Llama-3.1-8B-Instruct: /Guilherme34/root/.cache/huggingface/hub/models--meta-llama--Meta-Llama-3.1-8B-Instruct/snapshots/c7c9648767719216fd9a80097da3a57b72748028
Meta-Llama-3.1-70B: /Guilherme34/root/.cache/huggingface/hub/models--meta-llama--Meta-Llama-3.1-70B/snapshots/6113f060e6c497c714cc6463c8bcbd78aefac089
Meta-Llama-3.1-70B-Instruct: /Guilherme34/root/.cache/huggingface/hub/models--meta-llama--Meta-Llama-3.1-70B-Instruct/snapshots/25acb1b514688b222a02a89c6976a8d7ad0e017
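
The paths above follow the usual Hugging Face hub cache layout: a folder named models--<org>--<name>, then snapshots, then a revision hash. A minimal sketch of that mapping, assuming this layout holds for the mounted folder (the helper name snapshot_dir is made up for illustration):

```python
from pathlib import PurePosixPath


def snapshot_dir(hub_cache: str, repo_id: str, revision: str) -> PurePosixPath:
    """Build the snapshot directory of a model repo inside a hub cache folder."""
    # "meta-llama/Meta-Llama-3.1-8B" becomes "models--meta-llama--Meta-Llama-3.1-8B"
    folder = "models--" + repo_id.replace("/", "--")
    return PurePosixPath(hub_cache) / folder / "snapshots" / revision


# Values taken from the list above.
path = snapshot_dir(
    "/Guilherme34/root/.cache/huggingface/hub",
    "meta-llama/Meta-Llama-3.1-8B",
    "13f04ed6f85ef2aa2fd11b960a275c3e31a8069e",
)
print(path)
```

The snapshot folder is what conversion tools normally want as the model directory, since it contains the config, tokenizer and weight files (usually as symlinks into the cache's blobs folder).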

mradermacher

Owner Jul 23

Ah... a local download is not actually that helpful (I can't really get it out at reasonable speeds), I'd need a repo clone. Maybe it's possible to only check in the lfs files (sounds like a security bug in huggingface though, if it's possible).

mradermacher

Owner Jul 23

@Green-Sky last time I filled out the form and then needed a facebook account. i.e. they lied to me. My trust is eroded.

nicoboss

Jul 23

•

edited Jul 23

> Ah... a local download is not actually that helpful (I can't really get it out at reasonable speeds), I'd need a repo clone. Maybe it's possible to only check in the lfs files (sounds like a security bug in huggingface though, if it's possible).

@Guilherme34 will give you a token to access them. I will email it to you shortly. In the meantime, copying the 8B models over at 100 Mbit/s should not take that long.

mradermacher

Owner Jul 23

•

edited Jul 23

The 8b will not take very long, but since it is completely unnecessary, wouldn't the time be better invested in getting a clone of all repos, which is pretty much instant? Just asking, no criticism :)

nicoboss

Jul 23

@mradermacher I sent you a mail with the Llama 3.1 access token and a code example showing how to use the access token to download the models.

mradermacher

Owner Jul 23

Anyways, thanks to @Guilherme34 I can download the models manually then.

nicoboss

Jul 23

•

edited Jul 23

> The 8b will not take very long, but since it is completely unnecessary, wouldn't the time be better invested in getting a clone of all repos, which is pretty much instant? Just asking, no criticism :)

I fully agree. Just use the access token I sent you to download all of them. It should be super fast. And sorry: in model = AutoModelForCausalLM.from_pretrained(base_model_id, token=access_token) you obviously need to insert the access token as well. Also, sorry that my email formatting got slightly messed up again.
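
For readers without the email, here is a rough sketch of what passing the token could look like. It is not the code that was sent; it assumes the transformers package is installed and that the token is stored in the HF_TOKEN environment variable. The helper names read_token and load_gated_model are made up for illustration.

```python
import os


def read_token(env_var: str = "HF_TOKEN") -> str:
    """Read the access token from the environment instead of hard-coding it."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"set {env_var} to the access token first")
    return token


def load_gated_model(base_model_id: str, access_token: str):
    """Load tokenizer and model from a gated repo, passing the token to both calls."""
    # Imported lazily so read_token() works even without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id, token=access_token)
    model = AutoModelForCausalLM.from_pretrained(base_model_id, token=access_token)
    return tokenizer, model


# Usage (downloads the weights, so it is left commented out):
# tokenizer, model = load_gated_model("meta-llama/Meta-Llama-3.1-8B", read_token())
```

Both from_pretrained calls need the token, because the tokenizer files sit in the same gated repository as the weights.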

mradermacher

Owner Jul 23

•

edited Jul 23

Tried to figure out git access to clone the repos, but failed, so no public clones, but the models should be slowly coming now. Thanks again to everybody involved :)

mradermacher changed discussion status to closed Jul 23

mradermacher

Owner Jul 23

@Guilherme34 would it be possible to get access to https://huggingface.co/meta-llama/Llama-Guard-3-8B and https://huggingface.co/meta-llama/Prompt-Guard-86M too, just for completeness, assuming they are under the same conditions.

nicoboss

Jul 23

•

edited Jul 23

> @Guilherme34 would it be possible to get access to https://huggingface.co/meta-llama/Llama-Guard-3-8B and https://huggingface.co/meta-llama/Prompt-Guard-86M too, just for completeness, assuming they are under the same conditions.

@mradermacher Your access token should now also be able to download the Llama-Guard-3-8B and Prompt-Guard-86M models as @Guilherme34 requested and was granted access to them.

mradermacher

Owner Jul 23

Won-der-ful! I was waiting for the guard models for quite a while :)

mradermacher

Owner Jul 23

•

edited Jul 23

Thanks, Meta, for checking in two pickle versions of your 405B models, too (for a teensy-tiny 5TB download. ugh).

Anyways, thanks again to everybody here - it was a pleasure to see a bunch of people work together this quickly :) I'm so eager to find out whether llama.cpp can handle the 405B model or not.
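
One way to avoid pulling the extra pickle copies is to filter the download by file pattern. The sketch below only illustrates the filtering idea with the standard library; the file listing is hypothetical and the real repository layout may differ.

```python
from fnmatch import fnmatch

# Hypothetical file listing, for illustration only.
files = [
    "config.json",
    "tokenizer.json",
    "model-00001-of-00002.safetensors",
    "model-00002-of-00002.safetensors",
    "original/consolidated.00.pth",
]

# Keep configs, tokenizer files and safetensors shards; skip everything else.
allow_patterns = ["*.json", "*.safetensors"]


def selected(names, patterns):
    """Return the names that match at least one glob pattern."""
    return [name for name in names if any(fnmatch(name, p) for p in patterns)]


wanted = selected(files, allow_patterns)
print(wanted)

# The same patterns can be passed to huggingface_hub (third-party, needs network):
# from huggingface_hub import snapshot_download
# snapshot_download("meta-llama/Meta-Llama-3.1-405B", token=access_token,
#                   allow_patterns=allow_patterns)
```

Note that "*" in these patterns also matches "/", so "*.json" would pick up JSON files in subfolders as well.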

Green-Sky

Jul 24

@mradermacher the models need to be remade after recent code changes in the transformers library: https://github.com/ggerganov/llama.cpp/issues/8650#issuecomment-2247595544

From what ppl are saying, it seems like the rope scaling (beyond 8k tokens) is still broken too.

mradermacher

Owner Jul 24

•

edited Jul 24

@Green-Sky thanks - you don't happen to know which transformers release (if any) fixes this (the pretokenizer)? As for rope scaling, it's probably prudent to wait for problems to be sorted out. No need to be the first :)

mradermacher changed discussion status to open Jul 24

mradermacher

Owner Jul 24

https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/commit/339ce92d052f002cdbac4a4bd551d1c61dd8345e - was this a change to the llama-3.1 model repos?

nicoboss

Jul 24

•

edited Jul 24

> https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct/commit/339ce92d052f002cdbac4a4bd551d1c61dd8345e - was this a change to the llama-3.1 model repos?

You can use the access token from yesterday to git clone the repository and see the change locally. Execute git clone https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct or GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct, and use Guilherme34 as the username and the token as the password.
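
The two clone variants differ only in the GIT_LFS_SKIP_SMUDGE environment variable, which tells git-lfs to leave small pointer files in place instead of fetching the weights. A minimal sketch of scripting this, assuming git and git-lfs are installed (the helper names build_clone and clone are made up for illustration):

```python
import os
import subprocess


def build_clone(repo_id: str, skip_lfs: bool = True):
    """Return the git command and environment for cloning a model repo."""
    cmd = ["git", "clone", f"https://huggingface.co/{repo_id}"]
    env = dict(os.environ)
    if skip_lfs:
        # Leave LFS pointer files in place instead of downloading the weights.
        env["GIT_LFS_SKIP_SMUDGE"] = "1"
    return cmd, env


def clone(repo_id: str, skip_lfs: bool = True) -> None:
    """Run the clone; git prompts for the username and the token as password."""
    cmd, env = build_clone(repo_id, skip_lfs)
    subprocess.run(cmd, env=env, check=True)


cmd, env = build_clone("meta-llama/Meta-Llama-3-8B-Instruct")
print(" ".join(cmd))
```

With skip_lfs=True the clone is small, which is enough to inspect a commit to the config or tokenizer files with git log or git show.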

Guilherme34

Jul 24

@nicoboss can you answer my message in discord?

mradermacher

Owner Jul 24

I did try this originally when trying to clone the repos and got a permission denied, but maybe I mistyped.

mradermacher

Owner Jul 24

wait, that commit was for llama 3, not llama 3.1. is the same fix needed for llama 3 too?

mradermacher

Owner Jul 25

note to self: waiting for https://github.com/ggerganov/llama.cpp/pull/8676

aifeifei798

Jul 27

The new model has a bunch of issues, so I'll wait for the next iteration. I'll hold off on taking action for 15 days. :~~~

nicoboss

Jul 27

The fix "llama : add support for llama 3.1 rope scaling factors (#8676)" was released in b3472 3 minutes ago.

mradermacher

Owner Jul 27

Finally (re-)queued everything. I converted a few other models first to see how it comes out, and it seems to work. Expect a busy night. Everything quanted in the last 10 hours or so should have the fixes.

mradermacher changed discussion status to closed Jul 27