---
language:
  - en
pipeline_tag: text-generation
tags:
  - upstage
  - solar
  - pytorch
extra_gated_prompt: |-
  Terms and Conditions

    1. You shall not redistribute the original pre-trained model.

    2. You are granted permission to use this model for your own fine-tuning purposes.

    3. You may open-source the resulting fine-tuned model with any license, including for commercial use.
extra_gated_fields:
  First Name: text
  Last Name: text
  Country: country
  Organization/Company: text
  geo: ip_location
  By clicking Submit below I accept the terms of the license and acknowledge that the information I provide will be collected, stored, processed, and shared in accordance with the Upstage Privacy Policy: checkbox
extra_gated_button_content: Submit
---

# solar-pro-preview-pretrained

**solar-pro-preview-pretrained** is a pre-trained large language model released by Upstage.

## Usage

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("upstage/solar-pro-preview-pretrained")
model = AutoModelForCausalLM.from_pretrained(
    "upstage/solar-pro-preview-pretrained",
    device_map="cuda:0",
    torch_dtype="auto",
    trust_remote_code=True,  # required to load the model's custom code
)
```
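
Since this is a base model that has not been instruction-tuned, it continues text rather than answering chat-style prompts. As a quick sanity check, you can generate a short completion; the prompt below is an arbitrary example, not part of the official card:

```python
# Minimal sketch: decode a short continuation of a plain-text prompt.
prompt = "Upstage is a company that"  # hypothetical example prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```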

## Fine-tuning

If you want to use the model for chat, please fine-tune it first. Refer to the following chat template when fine-tuning.

```python
# Generating text for a multi-turn interaction:
# build the prompt from the conversation history with the chat template.

context = [
    {"role": "system", "content": "You are Solar, an AI bot by Upstage, loved by many people."},
    {"role": "user", "content": "Hi, there!"},
    {"role": "assistant", "content": "Hello, how can I help you?"},
    {"role": "user", "content": "Send me a message of support."},
]

prompt = tokenizer.apply_chat_template(context, tokenize=False, add_generation_prompt=True)

print("# Input")
print(prompt)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, use_cache=True, max_new_tokens=4096)

print("# Output")
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0, inputs["input_ids"].shape[-1]:]))
```
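
The same chat template can also be used to serialize conversations into training examples. Below is a minimal data-preparation sketch, assuming a corpus of conversations in the role/content format above; training on the full serialized text with a standard causal-LM collator is an illustrative choice, not a requirement of the model:

```python
from transformers import DataCollatorForLanguageModeling

# Hypothetical fine-tuning corpus in the same role/content format as above.
conversations = [
    [
        {"role": "user", "content": "Hi, there!"},
        {"role": "assistant", "content": "Hello, how can I help you?"},
    ],
    # ... more conversations
]

# Serialize each conversation with the chat template (no generation prompt,
# since the assistant replies are already present), then tokenize.
train_texts = [tokenizer.apply_chat_template(c, tokenize=False) for c in conversations]
encodings = [tokenizer(t, truncation=True, max_length=4096) for t in train_texts]

# For causal-LM fine-tuning, the labels are the input_ids themselves; a collator
# such as DataCollatorForLanguageModeling(tokenizer, mlm=False) handles padding
# and label creation when used with transformers.Trainer.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)
```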