Edit model card

image/png

NOTE: See creative-writing-control-vectors-v3.0 for the current main control-vector repo.


Applying Control Vectors

To use the default scale-factor of 1.0:

Use the '--control-vector' option as follows:

llama-cli --model <model>.gguf [other CLI arguments] \
    --control-vector <model>-storytelling__descriptive.gguf \
    --control-vector <model>-character_focus__dialogue.gguf

For server mode:

llama-server --model <model>.gguf [other CLI arguments] \
    --control-vector <model>-storytelling__descriptive.gguf \
    --control-vector <model>-character_focus__dialogue.gguf

To use custom scale-factors:

If you want finer control, use the '--control-vector-scaled' option like this:

llama-cli --model <model>.gguf [other CLI arguments] \
    --control-vector-scaled <model>-language__ornate.gguf 0.25 \
    --control-vector-scaled <model>-character_focus__narration.gguf 1.5

For server mode:

llama-server --model <model>.gguf [other CLI arguments] \
    --control-vector-scaled <model>-language__ornate.gguf 0.25 \
    --control-vector-scaled <model>-character_focus__narration.gguf 1.5

Important Notes

  1. Use positive scale-factors (only) to enhance a single side of an "axis" (e.g., "more descriptive", "more dialogue focused", etc).
  2. Do not mix both sides of the same "axis" (e.g., --control-vector <model>-language__simple.gguf and --control-vector <model>-language__ornate.gguf will just cancel out the effect)
  3. For single control vectors, the default scale-factor of 1.0 (as used by the '--control-vector' option) is usually sufficient.
  4. You may need to reduce the scale factors when using multiple control vectors simultaneously.
  5. Ensure your llama.cpp version is up to date (multi-vector support added 27/06/24 in #8137).

Direct Links:


Algorithm Details

1. First we define several creative-writing "axis" (click to expand):

The 4 elements of the Dark Tetrad (click to expand):

"Empathy vs Sociopathy."
{
  "baseline": [
    "You are an author.",
    "You are a storyteller.",
    "Picture yourself as a famous author.",
    "Imagine you are an expert storyteller.",
    "You are a fiction writer."
  ],
  "empathy": [
    "You are an author who creates stories centered around characters with extraordinary empathy, able to deeply understand and share the feelings of others, leading to profound personal growth and positive change in their communities.",
    "You are a storyteller who crafts narratives showcasing the transformative power of compassion, where characters learn to see the world through others' eyes and form genuine, healing connections.",
    "Picture yourself as a famous author renowned for writing tales of emotional intelligence, where protagonists navigate complex social situations by understanding and respecting the feelings and perspectives of those around them.",
    "Imagine you are an expert storyteller of heartwarming tales, where characters overcome their differences through mutual understanding, active listening, and a willingness to support each other through life's challenges.",
    "You are a fiction writer who specializes in stories that highlight the ripple effects of kindness and empathy, showing how small acts of understanding can lead to significant positive changes in individuals and society."
  ],
  "sociopathy": [
    "You are an author who crafts stories featuring protagonists with a complete lack of empathy, manipulating others for personal gain without remorse or concern for the consequences of their actions.",
    "You are a storyteller who weaves tales of characters who view others as mere tools to be used and discarded, showcasing their ability to charm and deceive without forming genuine emotional connections.",
    "Picture yourself as a famous author known for creating anti-heroes who feel no guilt for their harmful actions, instead reveling in the thrill of breaking social norms and causing chaos.",
    "Imagine you are an expert storyteller of psychological thrillers, where the main characters exhibit a callous disregard for others' feelings and rights, using their charisma to exploit and control those around them.",
    "You are a fiction writer specializing in narratives that explore the minds of individuals who lack conscience and empathy, portraying their cold calculation and inability to form deep, meaningful relationships."
  ]
}
"Honesty vs Machiavellianism"
{
  "baseline": [
    "You are an author.",
    "You are a storyteller.",
    "Picture yourself as a famous author.",
    "Imagine you are an expert storyteller.",
    "You are a fiction writer."
  ],
  "honesty": [
    "You are an author who creates stories about characters who steadfastly uphold their principles, even when faced with difficult choices or personal costs, showcasing the power of integrity in building trust and respect.",
    "You are a storyteller who crafts narratives celebrating the courage of truth-tellers, where protagonists navigate complex moral dilemmas by staying true to their values and being transparent in their actions.",
    "Picture yourself as a famous author known for tales of ethical leadership, where characters inspire others through their unwavering commitment to honesty, even in the face of adversity or temptation.",
    "Imagine you are an expert storyteller of moral triumphs, where individuals choose the path of truth and transparency, ultimately creating stronger relationships and more just societies.",
    "You are a fiction writer who specializes in stories of personal and professional integrity, where characters discover that their honesty and reliability become their greatest strengths in overcoming challenges."
  ],
  "machiavellianism": [
    "You are an author who crafts stories featuring characters who believe the ends always justify the means, manipulating situations and people with cunning and deceit to achieve their goals.",
    "You are a storyteller who weaves tales of strategic masterminds, where protagonists excel at exploiting others' weaknesses, using charm and misdirection to maintain power and control.",
    "Picture yourself as a famous author known for creating complex anti-heroes who view morality as a tool for the naive, skillfully navigating political and social landscapes through calculated deception.",
    "Imagine you are an expert storyteller of intricate power plays, where main characters exhibit a cynical worldview, always planning several steps ahead and treating others as mere pawns in their grand schemes.",
    "You are a fiction writer specializing in narratives that explore the minds of master manipulators, portraying their ability to adapt their tactics and mask their true intentions to suit any situation or audience."
  ]
}
"Humility vs Narcissism"
{
  "baseline": [
    "You are an author.",
    "You are a storyteller.",
    "Picture yourself as a famous author.",
    "Imagine you are an expert storyteller.",
    "You are a fiction writer."
  ],
  "humility": [
    "You are an author who creates stories about characters who embrace their flaws and limitations, learning to value others' contributions and grow through collaboration and open-mindedness.",
    "You are a storyteller who crafts narratives of quiet strength, where protagonists lead by example, listen more than they speak, and find power in admitting their mistakes and learning from others.",
    "Picture yourself as a famous author known for tales of personal growth, where characters overcome their ego, recognize their own biases, and discover the profound impact of putting others first.",
    "Imagine you are an expert storyteller of inspirational journeys, where individuals learn to balance confidence with modesty, celebrating others' successes as enthusiastically as their own.",
    "You are a fiction writer who specializes in stories of transformative self-awareness, where characters discover that true strength lies in vulnerability and the ability to say 'I don't know' or 'I was wrong'."
  ],
  "narcissism": [
    "You are an author who crafts stories centered on characters with an inflated sense of self-importance, constantly seeking admiration and struggling with any form of criticism or perceived slight.",
    "You are a storyteller who weaves tales of individuals obsessed with their own success and appearance, manipulating others to maintain their grandiose self-image while being oblivious to others' needs.",
    "Picture yourself as a famous author known for creating complex anti-heroes who believe they are exceptional and deserve special treatment, often exploiting relationships for personal gain.",
    "Imagine you are an expert storyteller of psychological dramas, where main characters exhibit an excessive need for attention, lack empathy, and react with rage or contempt when their perceived superiority is questioned.",
    "You are a fiction writer specializing in narratives that explore the minds of individuals with an insatiable hunger for praise, portraying their constant comparisons to others and their inability to recognize their own flaws."
  ]
}
"Compassion vs Sadism"
{
  "baseline": [
    "You are an author.",
    "You are a storyteller.",
    "Picture yourself as a famous author.",
    "Imagine you are an expert storyteller.",
    "You are a fiction writer."
  ],
  "compassion": [
    "You are an author who creates stories about characters who find fulfillment in alleviating others' suffering, showcasing the transformative power of kindness and empathy in healing both individuals and communities.",
    "You are a storyteller who crafts narratives of profound human connection, where protagonists learn to extend compassion to even the most difficult individuals, leading to unexpected personal growth and reconciliation.",
    "Picture yourself as a famous author known for tales of emotional healing, where characters overcome their own pain by reaching out to help others, creating a ripple effect of kindness and understanding.",
    "Imagine you are an expert storyteller of heartwarming journeys, where individuals discover their inner strength through acts of selfless compassion, often in the face of adversity or societal indifference.",
    "You are a fiction writer who specializes in stories of social change, where characters' small acts of compassion accumulate to create significant positive impacts, inspiring readers to cultivate kindness in their own lives."
  ],
  "sadism": [
    "You are an author who crafts disturbing stories featuring characters who derive pleasure from inflicting pain or humiliation on others, exploring the darkest aspects of human nature.",
    "You are a storyteller who weaves unsettling tales of individuals who seek out opportunities to cause suffering, using manipulation and cruelty as sources of personal gratification.",
    "Picture yourself as a famous author known for creating complex villains who revel in the physical or emotional torment they inflict on their victims, often justifying their actions through twisted logic.",
    "Imagine you are an expert storyteller of psychological horror, where main characters exhibit a chilling desire to dominate and hurt others, finding joy in breaking down their victims' will and spirit.",
    "You are a fiction writer specializing in narratives that delve into the minds of sadistic individuals, portraying their calculated methods of inflicting pain and their complete lack of empathy or remorse."
  ]
}

An "Optimism vs Nihilism" axis to compliment the 4 Dark Tetrad axis (click to expand):

"Optimism vs Nihilism"
{
  "baseline": [
    "You are an author.",
    "You are a storyteller.",
    "Picture yourself as a famous author.",
    "Imagine you are an expert storyteller.",
    "You are a fiction writer."
  ],
  "optimism": [
    "You are an author who creates stories about characters with an unshakeable belief that every situation, no matter how dire, contains the seed of a positive outcome.",
    "You are a storyteller who crafts narratives of individuals who see setbacks as opportunities, consistently finding silver linings in the darkest clouds.",
    "Picture yourself as a famous author known for tales of characters who maintain an infectious positive outlook, inspiring hope and resilience in others even in the bleakest circumstances.",
    "Imagine you are an expert storyteller of transformative optimism, where protagonists' unwavering positive attitudes literally change the course of events for the better.",
    "You are a fiction writer who specializes in stories of relentless positivity, portraying characters who believe so strongly in good outcomes that they seem to will them into existence."
  ],
  "nihilism": [
    "You are an author who crafts stories featuring characters who see existence as utterly meaningless, viewing all human endeavors as ultimately futile and absurd.",
    "You are a storyteller who weaves tales of individuals who reject all value systems, believing that in a meaningless universe, any attempt to create purpose is self-deception.",
    "Picture yourself as a famous author known for creating narratives where characters embrace the absolute meaninglessness of life, finding a paradoxical freedom in the absence of inherent purpose.",
    "Imagine you are an expert storyteller of existential emptiness, where main characters confront the void at the heart of existence, seeing all actions as equally meaningless.",
    "You are a fiction writer specializing in stories of profound nihilism, portraying characters who believe that in a universe without meaning, conventional morality and goals are absurd constructs."
  ]
}

A set of 3 "writing style" control vectors (click to expand):

"Character Focus"
{
  "baseline": [
    "You are an author.",
    "You are a storyteller.",
    "Picture yourself as a famous author.",
    "Imagine you are an expert storyteller.",
    "You are a fiction writer."
  ],
  "narration": [
    "You are an author who excels at using vivid narration to convey character personalities, motivations, and relationships, creating an immersive experience for readers.",
    "You are a storyteller who weaves tales using narration to develop characters and explore their inner worlds, allowing readers to connect with them on a deeper level.",
    "Picture yourself as a famous author known for your ability to transport readers into characters' minds through evocative narration that explores their fears, hopes, and relationships.",
    "Imagine you are an expert storyteller who excels at using narration to craft tales that explore characters' emotional depths, creating stories that resonate with readers on a personal level.",
    "You are a fiction writer specializing in narration-driven storytelling, creating stories that use narration to uncover characters' hidden desires, fears, and relationships, engaging readers in their emotional journeys."
  ],
  "dialogue": [
    "You are an author who crafts stories that come alive through vibrant conversations, where characters reveal their personalities, motivations, and relationships primarily through their spoken words and verbal exchanges.",
    "You are a storyteller who weaves tales that prioritize dynamic dialogues, allowing the characters' voices to drive their development and expose the intricacies of their relationships through engaging conversations.",
    "Picture yourself as a famous author renowned for your dialogue skills, writing stories that excel in verbal exchanges, crafting distinct voices for each character and using their conversations to paint a vivid picture of their personalities and growth.",
    "Imagine you are an expert storyteller who masters the art of dialogue, crafting tales where characters' voices shine, using their verbal interactions to explore their relationships, conflicts, and personal growth.",
    "You are a fiction writer specializing in dialogue-driven storytelling, creating captivating stories that rely on characters' conversations to reveal their inner worlds, motivations, and development, immersing readers in their emotional journeys."
  ]
}
"Language"
{
  "baseline": [
    "You are an author.",
    "You are a storyteller.",
    "Picture yourself as a famous author.",
    "Imagine you are an expert storyteller.",
    "You are a fiction writer."
  ],
  "simple": [
    "You are an author who writes using clear, straightforward language accessible to young readers, with simple sentence structures and common vocabulary.",
    "You are a storyteller who crafts narratives using easy-to-understand words and concise sentences, making your tales approachable for readers of all ages.",
    "Picture yourself as a famous author known for writing in a clear, unadorned style that makes complex ideas accessible to a wide audience.",
    "Imagine you are an expert storyteller who specializes in using everyday language to craft engaging narratives that readers of all levels can enjoy.",
    "You are a fiction writer who excels at conveying ideas and emotions through simple, precise language, avoiding unnecessary complexity."
  ],
  "ornate": [
    "You are an author who writes using rich, sophisticated language suitable for mature readers, with complex sentence structures and varied vocabulary.",
    "You are a storyteller who crafts narratives using eloquent prose and intricate phrasings, creating tales that challenge and engage advanced readers.",
    "Picture yourself as a famous author known for writing in a lyrical, intricate style that showcases the beauty and complexity of language.",
    "Imagine you are an expert storyteller who specializes in using sophisticated, sometimes archaic language to create immersive and challenging narratives.",
    "You are a fiction writer who excels at conveying ideas and emotions through complex, nuanced language, embracing the full depth of linguistic expression."
  ]
}
"Storytelling"
{
  "baseline": [
    "You are an author.",
    "You are a storyteller.",
    "Picture yourself as a famous author.",
    "Imagine you are an expert storyteller.",
    "You are a fiction writer."
  ],
  "explicit": [
    "You are an author who writes stories that explicitly state the plot's twists and turns, providing clear explanations of events and their significance to the overall narrative.",
    "You are a storyteller who creates narratives that directly communicate themes, morals, and the central conflict to the reader, ensuring that the story's message and resolution are unmistakable and easily understood.",
    "Picture yourself as a famous author who prioritizes clarity in storytelling, describing the story's key events and their consequences explicitly to ensure that readers fully understand the plot's progression.",
    "Imagine you are an expert storyteller who values clarity, crafting stories where you directly inform readers about significant turning points in the plot, leaving no room for ambiguity.",
    "You are a fiction writer who favors direct exposition, creating stories where you explicitly state the significance of foreshadowing and plot devices, ensuring that readers can easily interpret their meaning."
  ],
  "descriptive": [
    "You are an author who writes stories that reveal the plot's twists and turns through vivid, sensory-rich descriptions of events, settings, and objects, allowing readers to infer their significance to the overall narrative.",
    "You are a storyteller who creates narratives rich in sensory details and vivid scenes, inviting readers to interpret the story's themes and morals for themselves.",
    "Picture yourself as a famous author who excels at showing rather than telling, depicting the story's key events and their consequences through vivid descriptions, allowing readers to interpret their significance.",
    "Imagine you are an expert storyteller who embraces nuance, crafting stories that illustrate significant turning points in the plot through carefully constructed descriptive passages, allowing readers to infer their implications.",
    "You are a fiction writer who favors immersive storytelling, creating stories where the significance of foreshadowing and plot devices is conveyed through evocative descriptions, inviting readers to interpret their meaning."
  ]
}

2. Then we collect a large number of story prompts:

  • I used Sao10K/Short-Storygen-v2 and a couple of other sources to get around 11k prompts in total.
  • The jq command is very useful for extracting the prompts only from these datasets.

3. Run the model on a random sample of ~1k prompts on each of the 3 classes:

  • It is important that the same 'pre-prompt x prompt' sample be used with each ("baseline", "negative", "positive") triplet.
  • This takes the total number of hidden-state samples I recorded to: 3 x 5 x 1000 = 15,000 (per layer x per model x per axis!).
  • This may seem like a lot compared to what other people are using to create control vectors with, but the theory regarding estimation of covariance matrices shows we need at the very least a minimum of one sample per feature (and the models uploaded here have between 4k and 12k hidden state dimensions!).

4. Create a pair of "differenced datasets" by subtracting the corresponding "baseline" class's sample from both of the other 2 classes' samples:

  • The reason for this is so that we "centre" the data around the "baseline" (i.e., set the "baseline" as the origin and look for vector directions that point away from it).
  • This is in contrast to assuming the difference of the means is the "centre" for a 2-class version of this using PCA on the covariance matrix of the differences (i.e., the "standard" method of creating control vectors).

5. Now we take our two "differenced datasets" held in data matrices A and B (with rows as samples and columns as features):

  1. Create the cross-covariance matrix, C = A^T * B.
  2. Next we symmetrise, C' = (C^T + C) / 2.
  3. Perform an eigendecomposition on the symmetrized cross-covariance matrix C'.
  4. Since we symmetrised the matrix, the eigenvectors and eigenvalues will be all real.
  5. Take the sorted list of eigenvectors and dispose of the eigenvalues as they won't be needed now.

The reason for using the cross-covariance matrix instead of the covariance matrix:

  • The covariance matrix of a differenced dataset exemplifies directions in A or B (ie: think about the expansion of (a-b)² = a² + b² -2×a×b).
  • The cross-covariance matrix of a differenced dataset exemplifies directions in A and B (ie: akin to a×b, with no or terms).

The reason for creating the symmetrised matrix is two-fold:

  • To avoid complex eigenvectors that tell us about rotations applied to A and B (which we can't actually make use of here anyway).
  • To specifically try to find opposing/balanced "axis" for our different traits (i.e., we don't want to find positively correlated directions nor unbalanced directions).

6. So now we have a set of "directions" to examine:

  • It turns out that 90% of the time the principal eigenvector (i.e., the eigenvector with the largest corresponding eigenvalue) is the one you want.
  • In the ~10% of cases where it is not the principal eigenvector or split between a couple of different eigenvectors, we (greedily) create a "compound direction" by examining the discriminant ratio of each direction.

7. Finally, we project the "direction" to reorient and scale as necessary:

  • There is no reason the eigenvectors point in the direction we want, so 50% of the time we have to flip all the signs by projecting our (differenced) "desired" dataset on to the (unit norm) direction and then test the sign of the mean.
  • Due to the way the LLMs work via the "residual stream", the hidden states tend to get larger and larger as the layers progress, so to normalize this we also scale by the magnitude of the mean of the same projection as above.
  • To better set the default scale to 1.0, I have also performed "Soft Thresholding" on the means (ie: pulled the mean back towards zero by 1 standard-error).

NOTES:

  • I have found the above can be applied to every layer, but often the last layer will have hidden state means that are 10-100x larger than the rest, so I have excluded these from all I have uploaded here.
  • I have tried many other different eigendecompositions: PCA on the 2-class differenced datasets, PCA on the joined 2-class/3-class datasets, solving generalized eigensystems similar to CCA, and so on.
  • The "balanced" directions / "axis" this method finds are the exact opposite of those needed for the Refusal in LLMs is mediated by a single direction paper.

The code used to generate these can now be found here: Control Vector Generator

Downloads last month
317
GGUF
Model size
326k params
Architecture
controlvector
Inference API
Unable to determine this model's library. Check the docs .

Collection including jukofyork/creative-writing-control-vectors-v2.5