Base model not released

by adamo1139 - opened 27 days ago

Hi Rhymes Team,

Thank you for releasing a model with a permissive license. This model has the potential to disrupt workflows in many use cases after fine-tuning. However, the base model has not been released, which will likely make fine-tuning for downstream tasks more challenging for developers. Could you please release the weights of the pre-trained model before it was subjected to multimodal post-training data?

gopi87

27 days ago

Yep, cool model. A little bit of fine-tuning will bring this model close to gpt o level performance!!

@MaziyarPanahi

MaziyarPanahi

27 days ago

This is a lovely model! Never done RLHF on a multimodal model, but there is always a first! :)

JunnanLi

Rhymes.AI org 27 days ago

Thanks for your feedback!

We found that our post-training does not hinder performance on fine-tuning for downstream tasks.

Araki

26 days ago

Please consider releasing the base model. It's not about the benchmark results. For things outside the box that are not designed to work in question/answer pairs, an instruct-tuned model cannot and should not be used, as it will by design always have the assistant-like bias.

An Apache 2.0 licensed base model that is both competitive and has only ~4B active parameters would be very nice.

Delta36652

26 days ago

I support this initiative. A base model will be valuable on its own.

muratowski

26 days ago

On the Rhymes.ai website, when I ask which model it is, it replies: GPT-4

Icecream102

24 days ago

nina-summer

Rhymes.AI org 24 days ago

@Icecream102 Due to the more recent knowledge cutoff and the use of some open-source synthetic data during instruction fine-tuning, Aria occasionally experiences confusion in its self-identity.

Icecream102

22 days ago

So it's not Reflection 70B all over again? Assuming this is not the case, the only way a model would claim to be GPT-4 (other than simply being instructed to, which is irrelevant) is that the training data makes it believe so. Now, I can fully see this happening in several ways, ranging from benign to problematic. Because GPT-4 is such a dominant entity, discussed extensively online as well as in books, news, scientific papers, benchmarks, etc., many weak signals about self-identity as an LLM could add up to hallucinating about being GPT-4. However, to minimize the risk of trouble, please dig thoroughly and prune out and/or expose anything you can find in whatever public open-source datasets you are referring to. The community would benefit from weeding out anything that strengthens this effect, since it would be easy enough for lawyers to "jump to conclusions", to put it mildly. Please help the community keep any open-source data of consequence clean from this type of contamination, even if the data is made open-source by some 3rd party. /Gabriel

Icecream102

22 days ago

Also - thank you for your generosity in making this model open-source! (base-model would be great as well! ;-) )
