joinus = """
## Join us:
🌟TeamTonic🌟 is always making cool demos! Join our active builders' 🛠️community 👻 [![Join us on Discord](https://img.shields.io/discord/1109943800132010065?label=Discord&logo=discord&style=flat-square)](https://discord.gg/qdfnvSPcqP) On 🤗Hugging Face: [MultiTransformer](https://huggingface.co/MultiTransformer) On 🌐GitHub: [Tonic-AI](https://github.com/tonic-ai) & contribute to 🌟 [Build Tonic](https://git.tonic-ai.com/contribute) 🤗 Big thanks to Yuvi Sharma and all the folks at Hugging Face for the community grant 🤗
"""

title =  """# 🙋🏻‍♂️Welcome to Tonic's 🤖 Mistral-NeMo-Minitron Demo 🚀"""

description = """nvidia/🤖Mistral-NeMo-Minitron-8B-Instruct is a model for generating responses for various text-generation tasks, including roleplaying, retrieval-augmented generation, and function calling.
"""

presentation1 = """Try this model on [build.nvidia.com](https://build.nvidia.com/nvidia/nemotron-mini-4b-instruct).

Mistral-NeMo-Minitron-8B-Instruct is a model for generating responses for various text-generation tasks, including roleplaying, retrieval-augmented generation, and function calling. It is a fine-tuned version of [nvidia/Mistral-NeMo-Minitron-8B-Base](https://huggingface.co/nvidia/Mistral-NeMo-Minitron-8B-Base), which was pruned and distilled from [Mistral-NeMo 12B](https://huggingface.co/nvidia/Mistral-NeMo-12B-Base) using [NVIDIA's LLM compression technique](https://arxiv.org/abs/2407.14679). The model was trained using multi-stage SFT and preference-based alignment with [NeMo Aligner](https://github.com/NVIDIA/NeMo-Aligner). For details on the alignment technique, please refer to the [Nemotron-4 340B Technical Report](https://arxiv.org/abs/2406.11704).

### License

[NVIDIA Community Model License](https://huggingface.co/nvidia/Nemotron-Mini-4B-Instruct/blob/main/nvidia-community-model-license-aug2024.pdf)"""

presentation2 = """
### Model Architecture

🤖Mistral-NeMo-Minitron-8B-Instruct uses a model embedding size of 4096, 32 attention heads, and an MLP intermediate dimension of 11520. It also uses Grouped-Query Attention (GQA) and Rotary Position Embeddings (RoPE).

**Architecture Type:** Transformer Decoder (auto-regressive language model)

**Network Architecture:** Mistral-NeMo"""

customtool = """{
  "name": "custom_tool",
  "description": "A custom tool defined by the user",
  "parameters": {
    "type": "object",
    "properties": {
      "param1": {
        "type": "string",
        "description": "First parameter of the custom tool"
      },
      "param2": {
        "type": "string",
        "description": "Second parameter of the custom tool"
      }
    },
    "required": ["param1"]
  }
}"""
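
# A minimal sketch of how a tool definition like `customtool` could be checked
# before it is handed to the model. The helper name `parse_tool_definition` and
# the set of mandatory keys are assumptions made for illustration; the demo
# itself may validate (or not validate) user-supplied tools differently.

```python
import json

# Stand-in copy of the `customtool` string so this sketch runs on its own.
customtool = """{
  "name": "custom_tool",
  "description": "A custom tool defined by the user",
  "parameters": {
    "type": "object",
    "properties": {
      "param1": {"type": "string", "description": "First parameter of the custom tool"},
      "param2": {"type": "string", "description": "Second parameter of the custom tool"}
    },
    "required": ["param1"]
  }
}"""


def parse_tool_definition(raw: str) -> dict:
    """Parse a JSON tool definition and run a few structural checks.

    Hypothetical helper: the checks mirror the shape of `customtool`,
    not any official schema.
    """
    tool = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON

    # Assumed mandatory top-level keys, taken from the example above.
    for key in ("name", "description", "parameters"):
        if key not in tool:
            raise ValueError(f"tool definition is missing '{key}'")

    params = tool["parameters"]
    if params.get("type") != "object":
        raise ValueError("'parameters.type' must be 'object'")

    # Every required parameter should also be declared under 'properties'.
    properties = params.get("properties", {})
    undeclared = [p for p in params.get("required", []) if p not in properties]
    if undeclared:
        raise ValueError(f"required parameters not declared: {undeclared}")

    return tool


tool = parse_tool_definition(customtool)
print(tool["name"], sorted(tool["parameters"]["properties"]))
```

# Failing early with a ValueError keeps a malformed user-supplied tool from
# reaching the prompt, where the error would be much harder to diagnose.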

example = """{{
  "name": "get_current_weather",
  "description": "Get the current weather in a given location",
  "parameters": {{
    "type": "object",
    "properties": {{
      "location": {{
        "type": "string",
        "description": "The city and state, e.g. San Francisco, CA"
      }},
      "unit": {{
        "type": "string",
        "enum": ["celsius", "fahrenheit"]
      }}
    }},
    "required": ["location"]
  }}
}}"""
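
# Unlike `customtool`, the `example` string doubles every brace. A sketch of why
# that matters, assuming the string is meant to pass through `str.format()` (or
# be embedded in a larger format template) before use; how the app actually
# consumes it is not shown in this file. `example_template`, `is_valid_json`,
# and `render_template` are hypothetical names used only for this illustration.

```python
import json

# Shortened stand-in for `example`, using the same doubled-brace escaping.
example_template = """{{
  "name": "get_current_weather",
  "parameters": {{
    "type": "object",
    "properties": {{
      "location": {{"type": "string"}}
    }},
    "required": ["location"]
  }}
}}"""


def is_valid_json(text: str) -> bool:
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def render_template(template: str) -> dict:
    """Collapse the escaped braces with str.format(), then parse the result."""
    # str.format() turns each '{{' into '{' and each '}}' into '}'.
    return json.loads(template.format())


rendered = render_template(example_template)
print(is_valid_json(example_template))           # raw template, braces still doubled
print(is_valid_json(example_template.format()))  # after brace collapsing
print(rendered["name"])
```

# If the string is displayed or parsed without a formatting step, the doubled
# braces stay in place and the text is not valid JSON, so the two constants
# should not be treated as interchangeable.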