---
base_model:
- failspy/llama-3-70B-Instruct-abliterated
- migtissera/Tess-2.0-Llama-3-70B-v0.2
- NeverSleep/Llama-3-Lumimaid-70B-v0.1-alt
- abacusai/Llama-3-Giraffe-70B
library_name: transformers
tags:
- merge
- exl2
license: llama3
---
<h2>ExLlamaV2 Quantization</h2>
<p>Quantized with the default exllamav2 calibration dataset. Try this if you want a slightly different flavor than the RP-calibrated (PIPPA) quants, with more emphasis on logic than emotion.</p>

[2.5 Bits Per Weight](https://huggingface.co/UnstableLlama/L3-MS-Astoria-70b-exl2-default-cal/tree/2_5)

[4.65 Bits Per Weight](https://huggingface.co/UnstableLlama/L3-MS-Astoria-70b-exl2-default-cal/tree/4_65)
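
To fetch just one of the branches linked above, something like the following may work. This is a minimal sketch, assuming the `huggingface_hub` package is installed. `branch_for_bpw` and `download_quant` are hypothetical helper names introduced here for illustration, and the branch naming (`2_5`, `4_65`) is inferred from the two links above.

```python
# Repository taken from the links above.
REPO_ID = "UnstableLlama/L3-MS-Astoria-70b-exl2-default-cal"


def branch_for_bpw(bits_per_weight: float) -> str:
    """Map a bits-per-weight value to a branch name, e.g. 2.5 -> "2_5".

    Assumption: branches follow the pattern seen in the two links above.
    """
    return f"{bits_per_weight:g}".replace(".", "_")


def download_quant(bits_per_weight: float, local_dir: str) -> str:
    """Download a single quant branch and return the local path."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        revision=branch_for_bpw(bits_per_weight),  # git branch to fetch
        local_dir=local_dir,
    )
```

Calling `download_quant(4.65, "./astoria-4_65")` would then pull only the 4.65 bpw branch into that (hypothetical) directory.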

---

<!DOCTYPE html>
<style>

body {
  font-family: 'Quicksand', sans-serif;
  background: linear-gradient(135deg, #2E3440 0%, #1A202C 100%); 
  color: #D8DEE9;
  margin: 0;
  padding: 0;
  font-size: 16px;
}

.container {
  width: 80%;
  max-width: 1080px;
  margin: 20px auto;
  background-color: rgba(255, 255, 255, 0.02);
  padding: 20px;
  border-radius: 12px;
  box-shadow: 0 4px 10px rgba(0, 0, 0, 0.2);
  backdrop-filter: blur(10px);
  border: 1px solid rgba(255, 255, 255, 0.1);
}

.header h1 {
  font-size: 28px;
  color: #ECEFF4;
  margin: 0 0 20px 0;
  text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.3);
}

.update-section {
  margin-top: 30px;
}

.update-section h2 {
  font-size: 24px;
  color: #88C0D0;
}

.update-section p {
  font-size: 16px;
  line-height: 1.6;
  color: #ECEFF4;
}

.info img {
  width: 100%;
  border-radius: 10px;
  margin-bottom: 15px;
}

a {
  color: #88C0D0;
  text-decoration: none;
}

a:hover {
  color: #A3BE8C;
}

.button {
  display: inline-block;
  background-color: #5E81AC;
  color: #E5E9F0;
  padding: 10px 20px;  
  border-radius: 5px;
  cursor: pointer;
  text-decoration: none;
}

.button:hover {
  background-color: #81A1C1;
}

pre {
  background-color: #2E3440;
  padding: 10px;
  border-radius: 5px;
  overflow-x: auto;
}

code {
  font-family: 'Courier New', monospace;
  color: #D8DEE9;
}

</style>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>L3-MS-Astoria-70b Data Card</title>
  <link href="https://fonts.googleapis.com/css2?family=Quicksand:wght@400;500;600&display=swap" rel="stylesheet">
</head>
<body>
  <div class="container">
    <div class="header">
      <h1>L3-MS-Astoria-70b</h1>
    </div>
    <div class="info">
      <img src="https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/HU5Zz7mb4X0wK3cZM2M9E.png">
      <p>Now that the cute anime girl has your attention.</p>
      <p><strong>Creator:</strong> <a href="https://huggingface.co/Steelskull" target="_blank">SteelSkull</a></p>
      <h1>About L3-MS-Astoria-70b:</h1>
      <p>L3 = Llama-3</p>
      <p>MS = Model Stock</p>
      <p>This is my first foray into 70b models, so this is more or less an experiment. Please let me know your thoughts on the model and where it could be improved.<br>
         L3-MS-Astoria-70b combines the strengths of multiple models to deliver a well-rounded, capable assistant. It is aimed at general tasks, storytelling, roleplay, and more mature content.<br>
         The model stock merging method attempts to keep the model focused, tailored, and high-quality.</p>
      <h2>Quants:</h2>
    <p>(Thanks to <a href="https://huggingface.co/mradermacher">@Mradermacher</a>! Please send them likes and follows!)</p>
    <p><a href="https://huggingface.co/mradermacher/L3-MS-Astoria-70b-GGUF">L3-MS-Astoria-70b-GGUF (GGUFs)</a></p>    
    <p>ExLlamaV2 Quantization by UnstableLlama </p>
    <p><a href="https://huggingface.co/UnstableLlama/L3-MS-Astoria-70b-exl2-default-cal/tree/2_5">2.5 bits per weight branch</a></p>
    <p><a href="https://huggingface.co/UnstableLlama/L3-MS-Astoria-70b-exl2-default-cal/tree/4_65">4.65 bits per weight branch</a></p>
    <p></p>  
      <h3>Config:</h3>
      <pre><code>MODEL_NAME = "L3-MS-Astoria-70b"
yaml_config = """
base_model: failspy/llama-3-70B-Instruct-abliterated  
merge_method: model_stock
dtype: bfloat16
models:
  - model: migtissera/Tess-2.0-Llama-3-70B-v0.2
  - model: abacusai/Llama-3-Giraffe-70B
  - model: NeverSleep/Llama-3-Lumimaid-70B-v0.1-alt
"""
</code></pre>
      <h4>Source Model Details:</h4>
      <p><strong>migtissera/Tess-2.0-Llama-3-70B-v0.2:</strong><br>
        Tess, short for Tesoro (Treasure in Italian), is a general-purpose Large Language Model series. Tess-2.0-Llama-3-70B-v0.2 was trained on the meta-llama/Meta-Llama-3-70B base. The change between v0.1 and this version, v0.2, is that v0.2 has undergone an additional uncensoring step.
      </p>
      <p><strong>abacusai/Llama-3-Giraffe-70B:</strong><br>
        A general-purpose model trained on 1B tokens, with a context length of up to 128k.
      </p>
      <p><strong>NeverSleep/Llama-3-Lumimaid-70B-v0.1-alt:</strong><br>
        Llama 3 trained on NeverSleep's RP datasets. NeverSleep tried to strike a balance between ERP and RP: not too horny, but just enough.<br>
        NeverSleep also added some non-RP datasets, making the model less dumb overall. The split should be roughly 40%/60% non-RP to RP+ERP data.
      </p>
      <p><strong>Base model failspy/llama-3-70B-Instruct-abliterated:</strong><br>
        This is meta-llama/Llama-3-70B-Instruct with orthogonalized bfloat16 safetensor weights, generated with the methodology described in the preview paper/blog post 'Refusal in LLMs is mediated by a single direction', which I encourage you to read to understand more.<br>
        TL;DR: this model has had certain weights manipulated to "inhibit" the model's ability to express refusal. It is not in any way <em>guaranteed</em> that it won't refuse you or that it will understand your request; it may still lecture you about ethics/safety, etc. It is tuned in all other respects the same as the original 70B instruct model, just with the strongest refusal direction orthogonalized out.
      </p>
    </div>
  </div>
</body>
</html>
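
For readers curious what the `model_stock` merge method in the config above does, here is a rough sketch on toy weight vectors. It assumes the interpolation rule from the Model Stock paper as I understand it: measure the average angle between the fine-tuned models' offsets from the base, then pull their average toward the base accordingly. The function names are hypothetical, the lists stand in for one layer's weights, and mergekit's actual implementation may differ in its details.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Note: fails with ZeroDivisionError if a fine-tune equals the base.
    return dot / (norm_u * norm_v)


def model_stock_merge(base, finetuned):
    """Merge k >= 2 fine-tuned weight vectors toward their shared base.

    Returns (merged_weights, t), where t is the interpolation ratio.
    """
    k = len(finetuned)
    if k < 2:
        raise ValueError("need at least two fine-tuned models")

    # Offsets of each fine-tuned model from the base model.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    # Average pairwise cosine between the offsets.
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(cosine(u, v) for u, v in pairs) / len(pairs)

    # Interpolation ratio: t = k*cos / (1 + (k-1)*cos).
    t = k * cos_theta / (1 + (k - 1) * cos_theta)

    # Plain average of the fine-tuned weights, then interpolate with the base.
    avg = [sum(ws) / k for ws in zip(*finetuned)]
    merged = [t * a + (1 - t) * b for a, b in zip(avg, base)]
    return merged, t
```

When the fine-tuned models agree on a direction, `t` approaches 1 and the result is close to their average. When their offsets are unrelated, `t` shrinks and the merge stays near the base model.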