What is this?
Is this a new instruction fine-tuned model? If so, could you provide some info on what it was trained on?
Thanks in advance
Your "contact us" should be higher up. Great work!
Wow, yeah, this looks really interesting. I'll do quantisations of it now so more people can run it and learn about it.
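(For anyone who wants to try a low-VRAM version before dedicated quantised uploads appear, here's a minimal sketch of loading the model in 4-bit via transformers + bitsandbytes. The repo id and the Alpaca-style prompt template are placeholders, not confirmed details of this model.)

```python
# Minimal sketch: load a model in 4-bit with transformers + bitsandbytes
# so it fits on a single consumer GPU. The repo id below is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "your-org/your-instruct-model"  # placeholder, not a real repo

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                     # quantise weights to 4-bit at load time
    bnb_4bit_quant_type="nf4",             # NF4 quantisation
    bnb_4bit_compute_dtype=torch.float16,  # do the matmuls in fp16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                     # spread layers across available devices
)

# Assumed Alpaca-style template; check the model card for the real one.
prompt = "### Instruction:\nExplain what instruction tuning is.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```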
Now that Llama 2 is out, are you planning to bring out a llama-2-13b-instruct, and/or maybe a llama-2-70b-instruct? It's a shame there's no Llama 2 34B yet, but apparently it's coming fairly soon.
By the way, I suggest you put your full model card in all the variants. The 30B 2048 is definitely the most interesting, I think, but it only has a very short model card, so the user has to click elsewhere to learn what this is. I would copy the full model card to each model, with a brief line explaining what is different about each particular one. Less work for the user = more interest!
Invading this discussion a bit: I would like to know if we will ever get a 65B 2048. After all, it's clear that the 30B 2048 got much better results than the 30B 1024, so a 65B would probably follow the same trend.
@TheBloke Thank you for your interest in our model. Taking into account the number of GPUs available to us, we're planning to fine-tune the Llama2 model. We'll soon release a Llama2-70b model, which has been trained on 200k data samples. We appreciate your valuable suggestions. :)
@nxnhjrjtbjfzhrovwl Given that the Llama2-70b model is better than the 65b, we're planning to fine-tune the Llama2-70b-2048 model first.
Great to hear!
Ideally you would do a Llama2-70B-4096, given Llama 2's increased context length?