---
license: apache-2.0
language:
- en
tags:
- creative
- creative writing
- fiction writing
- plot generation
- sub-plot generation
- story generation
- scene continue
- storytelling
- fiction story
- story
- writing
- fiction
- roleplaying
- full precision
- float 32
- ultra quality
- swearing
- extreme swearing
- rp
- graphic horror
- horror
- nsfw
- llama3
- not-for-all-audiences
- mergekit
pipeline_tag: text-generation
---
<h3><font color="green"> L3-Grand-STORY-16.5B Ultra Quality - A triple model, trinary merge at Full Precision F32. </font></h3>

<B><font color="red">WARNING:</font> NSFW. Ultra Detailed. Graphic HORROR, VIOLENCE. Extreme swearing. UNCENSORED. SMART.</B>

I took the original models in "L3-Stheno-Maid-Blackroot 8B" and completely rebuilt it as a new pass-through merge (everything preserved),
blowing it out to over 16.5 billion parameters - 642 tensors, 71 layers (the 8B original has 32 layers) - at full float 32 precision.

However, that is where the similarity ends.

I built TWO custom Llama3 models:

Grand Horror 16.5B ( <A href="https://huggingface.co/DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF"> here </a> ) and Grand Story 16.5B,
then merged these together with a "smoothing step" captured at F32 precision.

(formula below, along with critical merge model notes and theory)

The result is a model that is far more stable and far more capable than any of the 3 original models - more, even, than the simple "sum" of the two 16.5B models.

Compared to Grand Horror 16.5B, it is over 25,000 points lower (IQ4XS) in perplexity (lower is better), or 2.5 full levels of magnitude lower.
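
If you want to reproduce this style of perplexity comparison, the hedged sketch below drives llama.cpp's perplexity tool from Python over two quants of the same size. It is illustrative only - the binary name, flags, file names and test corpus are assumptions and vary by llama.cpp build - and it is not the exact procedure used to produce the numbers above.

<PRE>
# Hedged sketch: compare perplexity of two IQ4XS GGUF quants with llama.cpp's perplexity tool.
# Assumes a build exposing a "llama-perplexity" binary and a local wikitext-2 test file;
# the model filenames below are hypothetical placeholders.
import subprocess

MODELS = {
    "grand_horror_iq4xs": "L3-Grand-HORROR-16.5B-IQ4_XS.gguf",
    "grand_story_iq4xs":  "L3-Grand-STORY-16.5B-IQ4_XS.gguf",
}

for name, path in MODELS.items():
    # -m: model file, -f: raw text corpus, -c: context length used for the evaluation
    result = subprocess.run(
        ["llama-perplexity", "-m", path, "-f", "wiki.test.raw", "-c", "512"],
        capture_output=True, text=True, check=True,
    )
    final = [ln for ln in (result.stdout + result.stderr).splitlines() if "PPL" in ln]
    print(name, final[-1] if final else "no PPL line found")
</PRE>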

It is tougher, stronger, and can handle a far wider range of operating conditions - from temp .1 to temp 5, all day long.

I tried for hours to get it to break, sweat or at least fart - no go.

The F32 precision (along with full F32 transfer to the GGUFs) increases the performance even further.

This added precision increases the model's depth and nuance, including "world" perception, real-time in-the-moment
similes and metaphors, description of the 5 senses, and word choice in general.

The model's grasp of "facts" - and where to use them - has also improved, and likewise the "facts" it makes up are far more "believable".

Sentence structure and variety are also significantly improved, as are paragraph structure and variety.

The result is a take-no-prisoners, totally uncensored fiction-writing monster and roleplay master, as well as an "AI guru" for
just about any general fiction activity, including scene generation and scene continuation.

This model is capable of horror, science fiction, romance - you name it.

But I would not suggest "children's stories".

This model has a very strong VIVIDNESS bias. It generates extremely vivid prose, description, and dialog, as well
as in-the-moment metaphors and similes. It rarely uses "cliches".

It also has a STRONG horror bias, although it will generate content for almost any genre. That being said,
if there is a "hint" of things going wrong... they will.

In "romance"... let's just say it is very vivid, intense and graphic - R18 (not horror).

It will also swear (R-18) like there is no tomorrow at times, and "dark" characters will be VERY dark, so to speak.

The model excels in details (real and "constructed"), descriptions, similes and metaphors, including dates, times
and "fictional history" that sounds "real".

I would also say it can have a sense of humor... ah... dark humor.

With all this being said, this model has an uncanny sense of "there", "in the moment" and timing.
This single quality sets it apart from other models, in my opinion.

Although it swears to the point of peeling paint off the wall, and goes "scorched Earth graphic horror" at the drop of a pin, that
single quality is worth it.

Another way to put this: it does not sugar-coat ANYTHING - positive or negative.

These tendencies can be filtered / controlled to some degree in your prompts.

This model also does not show any "GPTisms" (NO happily ever after, NO morality police) or in-your-face comments.

May these special types of "storytelling horror" rest in peace.

(see the examples section for different genres)

Because of the nature of this merge, most attributes of each of the 3 models carry through into this rebuilt 16.5B model, whereas in the
original 8B model some of each model's features and/or strengths may be reduced or overshadowed.

With the triple-step merge these qualities are further amplified.

Please report any issue(s) and/or feedback via the "Community tab".

Please see the models used in this merge (links below in the "formula" section) for more information on
what they "bring" to this merged 16.5B model.

This is a LLAMA3 model that requires the Llama3 template (it may work with other templates) and has a maximum context of 8k / 8192 tokens.
However, this can be extended up to 32k using "rope" settings - see the sketch below.
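
For example, with a llama.cpp based runtime such as llama-cpp-python, linear rope scaling by a factor of 4 is one common way to stretch the
native 8k window toward 32k. This is a hedged sketch, not a tested configuration for this model; parameter names differ between
front-ends (KoboldCpp, text-generation-webui, LM Studio, etc.) and the model filename is a placeholder.

<PRE>
# Hedged sketch: extend context from 8k to 32k via linear rope scaling (factor of 4).
# Assumes llama-cpp-python; the GGUF filename is hypothetical.
from llama_cpp import Llama

llm = Llama(
    model_path="L3-Grand-STORY-16.5B-IQ4_XS.gguf",  # placeholder filename
    n_ctx=32768,           # requested window: 4x the native 8192
    rope_freq_scale=0.25,  # linear rope scaling factor of 4 -> frequency scale 1/4
)
</PRE>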

<B>NO GUARDRAILS - TOTALLY UNCENSORED</B>

Please note that this model will not balk at or reject any request.

<B>IMATRIX VERSION NEO CLASS</b>

( uploading shortly - it is 4,300+ points (IQ4XS) lower in perplexity than the regular quants in this repo )

<b>Optional Enhancement:</B>

The following can be used in place of the "system prompt" or "system role" to further enhance the model.

It can also be used at the START of a NEW chat, but you must make sure it is "kept" as the chat moves along.
In this case the enhancement does not have as strong an effect as using the "system prompt" or "system role".

Copy and paste EXACTLY as noted; DO NOT line-wrap or break the lines, and maintain the carriage returns exactly as presented.

<PRE>
Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.
</PRE>

You do not need to use this; it is only presented as an additional enhancement which seems to help scene generation
and scene continuation.

This enhancement WAS NOT used to generate the examples below.
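
For reference, here is a minimal, hedged sketch of dropping the enhancement block above into the "system role" using llama-cpp-python's
built-in Llama-3 chat format. The filename, prompt and generation settings are placeholders - they are NOT the settings used for the
examples below.

<PRE>
# Hedged sketch: pass the optional enhancement as the system prompt.
# Assumes llama-cpp-python with its "llama-3" chat format; the filename is hypothetical.
from llama_cpp import Llama

# The enhancement block above, saved verbatim (line breaks intact) to a text file.
ENHANCEMENT = open("enhancement.txt", encoding="utf-8").read()

llm = Llama(model_path="L3-Grand-STORY-16.5B-IQ4_XS.gguf", n_ctx=8192, chat_format="llama-3")

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": ENHANCEMENT},
        {"role": "user", "content": "Continue the scene: the lighthouse door creaks open..."},
    ],
    temperature=0.8,
    max_tokens=1024,
)
print(out["choices"][0]["message"]["content"])
</PRE>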

<h3>MERGE FORMULA: (using MergeKit) </h3>

Special thanks to the incredible work of the model makers "SAO10K", "NEVERSLEEP" and "HASTAGARAS".

Models used:

[ https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2 ]

[ https://huggingface.co/NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS ]

[ https://huggingface.co/Hastagaras/Jamet-8B-L3-MK.V-Blackroot ]

NOTE: The formula below references the model makers' repos directly (e.g. "Sao10K/..."); substitute these in place of any local paths (e.g. "G:/7B/...") if you want to run it in Colab.

FORMULA:

<PRE>
slices:
  - sources:
      - model: Sao10K/L3-8B-Stheno-v3.2
        layer_range: [0, 14]
  - sources:
      - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
        layer_range: [8, 20]
  - sources:
      - model: Hastagaras/Jamet-8B-L3-MK.V-Blackroot
        layer_range: [12, 24]
  - sources:
      - model: Sao10K/L3-8B-Stheno-v3.2
        layer_range: [14, 28]
  - sources:
      - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
        layer_range: [20, 31]
  - sources:
      - model: Hastagaras/Jamet-8B-L3-MK.V-Blackroot
        layer_range: [24, 32]
merge_method: passthrough
dtype: float16
name: part1
---
slices:
  - sources:
      - model: Sao10K/L3-8B-Stheno-v3.2
        layer_range: [0, 16]   # +2 (14->16)
  - sources:
      - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
        layer_range: [10, 18]  # -2 (8->10) ; -2 (20->18)
  - sources:
      - model: Hastagaras/Jamet-8B-L3-MK.V-Blackroot
        layer_range: [10, 24]  # +2 down 2
  - sources:
      - model: Sao10K/L3-8B-Stheno-v3.2
        layer_range: [16, 28]  # 14->16 (-2) overlap fix.
  - sources:
      - model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
        layer_range: [18, 31]  # 20->18 +2, connect.
  - sources:
      - model: Hastagaras/Jamet-8B-L3-MK.V-Blackroot
        layer_range: [24, 32]
merge_method: passthrough
dtype: float16
name: part2
---
models:
  - model: part1
    parameters:
      weight: 0.8
  - model: part2
    parameters:
      weight: 0.2
merge_method: linear
dtype: float32
name: Grand_Story
</PRE>
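
If you want to reproduce the recipe, one hedged way to drive it is with MergeKit's command line: save the three YAML documents above as
separate config files, run the two pass-through stages first, then point the final linear stage at their output folders (i.e. "part1" and
"part2" in the last document become local paths). Entry-point names and flags may differ across MergeKit versions, and this is not the
exact command sequence used to build this model.

<PRE>
# Hedged sketch: run the three-stage recipe with MergeKit's CLI, driven from Python.
# Assumes "mergekit-yaml" is on PATH and the documents above were saved as
# part1.yml, part2.yml and grand_story.yml (with grand_story.yml pointing at
# the local ./part1 and ./part2 output folders).
import subprocess

stages = [
    ("part1.yml", "./part1"),
    ("part2.yml", "./part2"),
    ("grand_story.yml", "./Grand_Story"),
]

for config, out_dir in stages:
    subprocess.run(["mergekit-yaml", config, out_dir, "--cuda"], check=True)
</PRE>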

<h3>MODEL THEORY NOTES: </H3>

<B>Step 1:</b>

This is the basic "<A href="https://huggingface.co/DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF">Grand Horror 16.5B</a>" model.

The first section sets up instruction following and "basic knowledge": layer_range: [0, 14].

The mid section of the model is knowledge and nuance => more layers, more power.

The final "section" in this step uses "Blackroot" as the final "controller" of the output.

This type of merge is powerful, and fully unleashed so to speak - Grand Horror speaks to this in volumes.

The issue with this type of merge is that it is not always stable; 9 times out of 10 a merge like this is a failure.

But when it works, it takes no prisoners.
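
(As a quick sanity check on the layer arithmetic: summing the slice ranges in the "part1" document above reproduces the 71 layers quoted
at the top of this card. The tiny helper below is purely illustrative and not part of the merge itself.)

<PRE>
# Hedged sketch: sum the pass-through slice ranges of "part1" to verify the layer count.
part1_slices = [(0, 14), (8, 20), (12, 24), (14, 28), (20, 31), (24, 32)]
total_layers = sum(end - start for start, end in part1_slices)
print(total_layers)  # 14 + 12 + 12 + 14 + 11 + 8 = 71
</PRE>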

<B>Step 2:</b>

This is "Grand Story 16.5B ALPHA" (unreleased).

The purpose of this model is actually to "heal" part 1 AND add more depth to the model at the same time (which occurs in the final step).

Notice the slight - but deliberate - changes in the "layers" count per model (which also affects LAYER position).

Goal 1 was to "smooth over" the "friction" points in "part 1" - in other words, to "blend" the models together better,
which directly affects model stability.

This has a massive effect on the model. Therefore very SMALL changes were made. We are trying to carefully
blend here, not "blot out" its unique properties or "water down" the model.

This model, although likely workable, is not meant to be used - it is "prep work" for step 3.

<B>Step 3:</B>

This is where all the magic comes together at once.

Part 1 is merged with Part 2, at 80% and 20% respectively. (different blends were tried; this was the best one)

This is a "plain jane" linear merge. But don't let that fool you - it is powerful.

I first measured this step at "float16" -> it increased stability by 20,000 points, or levels of ppl magnitude.
I tested real-world output.

It was great. It blew away all expectations.

I could have stopped here.

But there is more in the tank here:

Mathematical precision in a model directly impacts instruction following and output performance.

Because we are doing a "linear" merge at this step (math involved), capturing the critical changes at
float32 - full precision - resulted in a far superior final model.

This single change results in an additional 5,000 points lower perplexity, but more importantly it
drastically improved the model's performance in every metric.
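
To illustrate why the dtype of this step matters, here is a toy, hedged demonstration of the same 0.8 / 0.2 linear combination carried out
in float16 versus float32. It uses random stand-in tensors, not the actual merged weights, and is only meant to show the extra rounding
error float16 introduces into a weighted sum.

<PRE>
# Hedged toy example: the 0.8 / 0.2 linear merge computed in float16 vs float32.
import torch

torch.manual_seed(0)
a = torch.randn(1_000_000, dtype=torch.float64)  # stand-ins for two models' weights
b = torch.randn(1_000_000, dtype=torch.float64)
reference = 0.8 * a + 0.2 * b                    # "exact" float64 reference

for dtype in (torch.float16, torch.float32):
    merged = 0.8 * a.to(dtype) + 0.2 * b.to(dtype)
    err = (merged.double() - reference).abs().max().item()
    print(dtype, "max abs error:", err)
# float16 typically lands around 1e-3; float32 around 1e-7.
</PRE>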

Note:

Step 1 and Step 2 do not involve any math (straight stacking), and the source models are native
float16... therefore using float32 there just did not make sense.

Although float32 could be used in these steps, doing so can sometimes degrade the final result
if the model(s) are not float32.

That being said, if the model(s) had been bfloat16, THEN float32 would have been used to preserve precision,
because bfloat16 and float32 are fully compatible: you do not lose any critical decimal points, nor do you
introduce "rounding errors".

Bfloat16 and float16 - you get rounding errors. That is bad... like 16.5 billion rounding errors bad.
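
A short, hedged PyTorch illustration of that compatibility point: every bfloat16 value converts to float32 exactly (bfloat16 is just a
truncated float32), while converting the same values to float16 can round them or overflow them outright.

<PRE>
# Hedged illustration: bfloat16 -> float32 is exact; bfloat16 -> float16 rounds or overflows.
import torch

x = torch.tensor([1.0e-6, 3.0e38], dtype=torch.bfloat16)  # both values are inside bfloat16's range

print(x.to(torch.float32))  # exact copies of the bfloat16 values
print(x.to(torch.float16))  # 1e-6 lands in float16's subnormal range and is rounded; 3e38 overflows to inf
</PRE>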

<h3>EXAMPLES PROMPTS and OUTPUT:</h3>

Examples were created using quant IQ4XS, "temp=.8", minimal parameters and the "LLAMA3" template.

The model has been tested with "temp" from ".1" to "5".

Below are the least creative outputs; the prompt is in <B>BOLD</B>.

---

<B><font color="red">WARNING:</font> NSFW. Vivid prose. Visceral Details. Violence. Graphic HORROR. Swearing. UNCENSORED. </B>

---

<B></B>

---

<B></B>

---

<B></B>

---