Feedback and Preset Suggestions
- Follows instructions pretty well. Context memory, recall, and application are solid.
- It struggles a bit on the logic side of things (strong prompts help solve the issue).
The sampler settings I am using are... drastic, but they are giving me favorable output. I have only tested up to ~10k context. As a note, I tilt more towards logic than creativity, and my samplers (if you want them) will reflect that.
With these settings: Logic 8/10 | Creativity 6/10
{"temperature", 0.4},
{"temperature_last", false},
{"top_p", 0.95},
{"top_k", 25},
{"top_a", 0.1}, //lean slightly more towards logical and coherent outputs rather than highly creative or unexpected ones.
{"tfs", 1},
{"epsilon_cutoff", 0},
{"eta_cutoff", 0},
{"typical_p", 0.9},
{"min_p", 0.8}, //Safety Net: for top_p edge cases
{"rep_pen", 1.1},
{"rep_pen_range", 4096},
{"rep_pen_decay", 0},
{"rep_pen_slope", 1},
{"no_repeat_ngram_size", 2}, //Prevents the model from repeating any 2-gram sequences, further reducing redundancy.
{"penalty_alpha", 0},
{"num_beams", 1},
{"length_penalty", 1},
{"min_length", 0},
{"encoder_rep_pen", 1},
{"freq_pen", 0},
{"presence_pen", 0.1},
{"skew", 0},
{"do_sample", true},
{"early_stopping", false},
{"dynatemp", true},
{"min_temp", 0.3},
{"max_temp", 0.5},
{"dynatemp_exponent", 0.85},
{"smoothing_factor", 0.3},
{"smoothing_curve", 1},
{"dry_allowed_length", 2},
{"dry_multiplier", 0.8},
{"dry_base", 1.75},
{"dry_sequence_breakers", "[\"\\n\", \",\", \"\\\"\", \"*\"]"},
{"dry_penalty_last_n", 4096},
{"add_bos_token", true},
{"ban_eos_token", false},
{"skip_special_tokens", true},
{"mirostat_mode", 1}, //This can help in producing more human-like and contextually appropriate responses.
{"mirostat_tau", 5},
{"mirostat_eta", 0.1},
{"guidance_scale", 1},
{"negative_prompt", ""},
{"grammar_string", ""},
//{"json_schema", {}},
{"banned_tokens", ""},
{"sampler_priority", new List<string> { "temperature", "dynamic_temperature", "quadratic_sampling", "top_k", "top_p", "typical_p", "epsilon_cutoff", "eta_cutoff", "tfs", "top_a", "min_p", "mirostat" }},
{"samplers", new List<string> { "top_k", "tfs_z", "typical_p", "top_p", "min_p", "temperature" }},
{"ignore_eos_token", false},
{"spaces_between_special_tokens", true},
{"speculative_ngram", false},
{"sampler_order", new List<int> { 6, 0, 1, 3, 4, 2, 5 }},
//{"logit_bias", new List<string> { "" }},
{"rep_pen_size", 2048},
{"genamt", 500},
...
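To make the interaction between top_k and min_p above concrete, here is a minimal sketch of how the two filters prune a token distribution. This is an illustration, not any backend's actual code: the function name and toy logits are invented, and it assumes top_k runs first and min_p then thresholds against the most likely surviving token (with min_p as high as 0.8, only tokens close to the top probability survive, which is why it acts as a safety net for top_p edge cases).

```python
import math

def top_k_min_p_filter(logits, top_k, min_p):
    """Keep the top_k highest-scoring tokens, then drop any whose
    probability falls below min_p times the top token's probability."""
    # Top-k: keep indices of the k largest logits.
    kept = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Softmax over the survivors (shift by the max for numerical stability).
    m = max(logits[i] for i in kept)
    exps = {i: math.exp(logits[i] - m) for i in kept}
    total = sum(exps.values())
    probs = {i: e / total for i, e in exps.items()}
    # Min-p: threshold relative to the most likely remaining token.
    p_max = max(probs.values())
    return {i: p for i, p in probs.items() if p >= min_p * p_max}

# Toy logits for a 4-token vocabulary; only tokens 0 and 1 survive.
candidates = top_k_min_p_filter([2.0, 1.9, 0.5, -1.0], top_k=3, min_p=0.8)
```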
@bluuwhale Since fp32 calculations only offer a slight increase in quality (at least I assume so), I believe you'd be interested in this feedback as well.
I've been messing around with the sampler settings in Ooba and I believe I've found a good balance between adherence and creativity.
Here's what I've got:
# Main Samplers
top_k: 45 # Slightly restrictive compared to the norm, but not TOO much. It usually gives a nice balance.
min_p: 0.05-0.075 # Adjust to taste.
# Main Penalties - Somewhat customizable. Adjust to taste.
rep_pen: 1.01-1.05 # Helps keep the AI from dwelling on one topic for too long so the story moves forward.
rep_pen_range: 2048,4096 # The token range of rep_pen. 2048 and 4096 are both good values, but 2048 is usually best, especially with larger rep_pen values.
pres_pen: 0.03-1.1 # Encourages the use of synonyms.
encoder_pen: 1-1.03 # If I understand its description correctly, it should help the AI better adhere to the writing style of the Greeting/Example Messages/Context.
# Smooth Sampling
smoothing_factor: 0.25-0.3 # Adjust to taste.
# DRY Rep. Pen.
mult: 0.8
base: 1.75
len: 2
seq_break: ["\n", ":", "\"", "*","`",";","(","{","[","]","}",")","+","="] # Adjusted to take account for Lorebooks/injected prompts formats as well as role-play `THOUGHTS` formatting
# Dyna. Temp.
min: 0.5
max: 1.25
exp: 0.85
# Mirostat - I usually find Mirostat pretty trash, but thanks to Ooba's sampler priority, it's pretty great.
mode: 2
tau: 8,9.9 # Uses Mirostat Gold and Preset Arena settings. Pick to taste
eta: 0.1,1 # ditto above
# Misc
temp_last: false # Usually you'd want this true; however, it must be false for the sampler priority in the last section to take effect.
# Logit Bias
# No need to set anything; I just thought I'd give a shout-out to this for those that need to know.
# LLMs tend to struggle with character speech quirks; however, this is a godsend for fixing that problem.
# Set the speech quirk text (like "nyaaa", for example) to somewhere between 0.5-2 depending on how stubborn the LLM you're using is.
# Sampler Priority
sampler_priority: # the main goal is to imitate KoboldCpp's order which is better suited for role-play.
- Top K
- Top A
- Epsilon Cutoff
- Eta Cutoff
- Tail Free Sampling
- Typical P
- Top P
- Min P
- Temperature
- Dynamic Temperature
- Mirostat # I always thought Mirostat was kinda trash but placing it right after (Dyna.) Temp. and before Smooth Sampling has turned it from mid to great!
- Smooth Sampling
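The Logit Bias trick above amounts to adding a constant to a token's raw score before any of the samplers run. A minimal sketch, with a made-up helper and scores keyed by token text purely for readability (real backends key the bias table by token ID):

```python
def apply_logit_bias(logits, bias):
    """Add a per-token bias to raw scores before sampling; a positive
    bias makes a token more likely, a negative one less likely."""
    return {tok: score + bias.get(tok, 0.0) for tok, score in logits.items()}

# Boost the quirk token "nyaaa" by 1.5, as in the speech-quirk example above.
biased = apply_logit_bias({"nyaaa": -2.0, "meow": 0.5}, {"nyaaa": 1.5})
```

Because the bias is applied to the raw logits, even a modest value like 0.5-2 can pull an otherwise-unlikely quirk token back into the candidate pool.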
A simpler set of settings for KoboldCpp based on bluuwhale's settings on the main page:
# Main Samplers
min_p: 0.1
# Main Penalties:
rep_pen: 1.01
rep_pen_range: 2048
rep_pen_slope: 0.95
presence_pen: 0.03
# DRY Rep. Pen.
mult: 2
base: 1.75
len: 2
range: 4096
seq: ["\n", ":", "\"", "*", "`", ";", "<", "(", "{", "[", "]", "}", ")", ">", "|", "+", "="] # updated for tags and instruct formats
# Dyna. Temp.
min: 0.6
max: 1.45
exp: 0.85
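Since DRY shows up in both presets, here is a simplified sketch of what mult/base/len do: find the longest context suffix that would become a repeat if the candidate token were emitted, and if that match length reaches len (allowed_length), penalize by mult * base^(length - len). Sequence breakers cut matches short. The function and the token-string lists are illustrative; real implementations work on token IDs and are more careful about overlaps.

```python
def dry_penalty(context, candidate, multiplier=2.0, base=1.75,
                allowed_length=2, breakers=("\n", ":")):
    """Penalty for `candidate` continuing a repeated sequence, using
    the defaults from the simpler KoboldCpp preset above."""
    seq = context + [candidate]
    best = 0
    for end in range(len(context)):  # position where an earlier copy could end
        length = 0
        # Walk backwards while the earlier copy matches the would-be suffix.
        while (length <= end
               and length < len(seq) - 1
               and seq[end - length] == seq[-1 - length]
               and seq[end - length] not in breakers):
            length += 1
        best = max(best, length)
    if best < allowed_length:
        return 0.0  # repeats up to allowed_length are free
    return multiplier * base ** (best - allowed_length)

# "c" after ["a","b","c","a","b"] would complete a 3-token repeat of "a b c".
penalty = dry_penalty(["a", "b", "c", "a", "b"], "c")
```

The resulting penalty is subtracted from the candidate's logit, so the exponential base makes longer verbatim repeats increasingly expensive while leaving short, natural echoes alone.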
Hello @Casual-Autopsy ,
I wanted to ask you about the sampler priority, now that ST 1.12.7 has exposed even more. Currently, in my presets, I have the following:
"repetition_penalty",
"presence_penalty",
"frequency_penalty",
"dry",
"top_k",
"top_a",
"epsilon_cutoff",
"eta_cutoff",
"tfs",
"typical_p",
"top_p",
"min_p",
"temperature",
"dynamic_temperature",
"mirostat",
"quadratic_sampling",
"xtc",
"encoder_repetition_penalty",
"no_repeat_ngram"
As you can see, the middle section from top_k down to quadratic_sampling is in the correct order, but the update has added the rest at the top and the bottom. Does KoboldCpp have a particular order for them, too?
"sampler_priority": [
"temperature",
"dynamic_temperature",
"quadratic_sampling",
"top_k",
"top_p",
"typical_p",
"epsilon_cutoff",
"eta_cutoff",
"tfs",
"top_a",
"min_p",
"mirostat"
],
I'm currently not using Ooba as the grammar sampler is broken, but this is the order I used:
Top K
No Repeat Ngram
Encoder Repetition Penalty
Repetition Penalty
Presence Penalty
Frequency Penalty
DRY
Top A
Epsilon Cutoff
Eta Cutoff
Tail Free Sampling
Typical P
Top P
Min P
Temperature
Dynamic Temperature
Quadratic / Smooth Sampling
Mirostat
XTC
I use Top K first for performance reasons; I have a potato PC. Mirostat and XTC are also interchangeable with Smooth Sampling, so I suggest swapping the last three around and finding what works best for you.
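The whole point of sampler priority can be shown with a toy pipeline: each sampler is just a transform on the logits, applied in the configured order, and reordering them changes the outcome. The function and the two toy samplers below are illustrative only, not any backend's API:

```python
def run_pipeline(logits, samplers, priority):
    """Apply named logit transforms in the order given by `priority`."""
    for name in priority:
        if name in samplers:
            logits = samplers[name](logits)
    return logits

samplers = {
    # Keep only the two highest-scoring tokens.
    "top_k": lambda lg: dict(sorted(lg.items(), key=lambda kv: kv[1],
                                    reverse=True)[:2]),
    # Boost token "c" by 2.5.
    "logit_bias": lambda lg: {t: v + (2.5 if t == "c" else 0.0)
                              for t, v in lg.items()},
}

logits = {"a": 2.0, "b": 1.0, "c": 0.0}
out1 = run_pipeline(logits, samplers, ["top_k", "logit_bias"])  # bias too late
out2 = run_pipeline(logits, samplers, ["logit_bias", "top_k"])  # bias rescues "c"
```

In out1 the bias runs after top_k has already discarded "c", so it does nothing; in out2 the boosted "c" survives the cut. The same logic explains why running Top K first is cheap (everything downstream only sees k tokens) and why moving Mirostat relative to Smooth Sampling changes its behavior.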
I'd also like to point out that I'm currently learning how to make a proper sampler preset, so the old one I created might not be all that great.