license: other
extra_gated_prompt: '## License'
extra_gated_fields:
  First Name: text
  Last Name: text
  Date of birth: date_picker
  Country: country
  Affiliation: text
  I accept the terms and conditions: checkbox
  geo: ip_location
extra_gated_description: Self-taught Evaluator Research License and Acceptable Use Policy
extra_gated_button_content: I Accept Self-taught Evaluator Research License and AUP

This model is released as part of the Self-taught Evaluators research project.

Please refer to our project materials here for training and evaluation details.

Loading the model with transformers

This model was trained with a specific evaluation prompt. The code example shows how to wrap your data so that the model receives its input in the expected format.

The minimal example below shows how to prepare the inputs for the evaluator model.

from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
import copy

SELF_TAUGHT_WITH_SYSTEM_PROMPT = [
    {
        "role": "system",
        "content": 'Please act as an impartial judge and evaluate the quality of the responses provided by two AI assistants to the user question displayed below. You should choose the assistant that follows the user\'s instructions and answers the user\'s question better. Your evaluation should consider factors such as the helpfulness, relevance, accuracy, depth, creativity, and level of detail of their responses. Begin your evaluation by comparing the two responses and provide a short explanation. Avoid any position biases and ensure that the order in which the responses were presented does not influence your decision. Do not allow the length of the responses to influence your evaluation. Do not favor certain names of the assistants. Be as objective as possible. After providing your explanation, output your final verdict by strictly following this format: \\"[[A]]\\" if assistant A is better, \\"[[B]]\\" if assistant B is better.',
    },
    {
        "role": "user",
        "content": """[User Question]
{input}

[The Start of Assistant A's Answer]
{response_a}
[The End of Assistant A's Answer]

[The Start of Assistant B's Answer]
{response_b}
[The End of Assistant B's Answer]
""",
    },
]

example_inputs = {
    "input": "explain master slave replication nsql",
    "response_a": "In the context of NoSQL databases, master-slave replication refers to a configuration where a single master node writes data, and one or more slave nodes read data from the master and replicate it to provide read scalability. The master node is responsible for accepting write requests and updating its own data, while the slave nodes are responsible for replicating the data from the master and serving read requests.\n\nIn this configuration, the master node is the only node that can make changes to the data, while the slave nodes can only read the data and replicate it. This ensures that the data is consistent across all nodes and that there is no data loss in case of a failure of a single node.\n\nMaster-slave replication in NoSQL databases can be implemented using various technologies, such as Apache Cassandra, MongoDB, and Apache HBase. In each of these databases, the master-slave replication configuration can be set up using specific configuration options and parameters.\n\nIt's worth noting that master-slave replication is not a failover solution, as the failure of the master node will result in the loss of data until the node is brought back online. Therefore, it's important to have a proper disaster recovery plan in place to ensure that data is protected in case of a failure.",
    "response_b": "In SQL, master-slave replication is a technique used to create a copy of a database on a separate server. The master server is the primary server that contains the original data, while the slave server is the secondary server that contains a copy of the data. The master server sends updates to the slave server, which then applies them to its own database.\n\nHere's how master-slave replication works:\n\n1. The master server sends a stream of updates to the slave server, which includes information about changes made to the database on the master server.\n\n2. The slave server receives the updates and applies them to its own database, creating a copy of the master server's database.\n\n3. The slave server can also send updates back to the master server, which can be used to keep the two databases in sync. This is known as two-way replication.\n\n4. If the master server fails, the slave server can take over as the new master server, ensuring that the database remains available.\n\nMaster-slave replication can be used to increase the availability and scalability of a database, as well as to create a backup of the data in case of failure. However, it's important to note that master-slave replication can be complex to set up and maintain, and it may not be suitable for all types of databases.\n\nIn NoSQL, master-slave replication is similar to SQL in that it involves creating a copy of a database on a separate server. However, NoSQL databases are typically more flexible and scalable than SQL databases, and they may use different replication techniques.\n\nFor example, some NoSQL databases use a distributed architecture, where data is stored across multiple servers and replicated in real-time. This can provide high availability and fault tolerance, as well as increased performance.\n\nOther NoSQL databases may use a master-slave replication model similar to SQL, where a master server sends updates to one or more slave servers. However, NoSQL databases may also use other replication techniques, such as peer-to-peer replication or multi-master replication, depending on the specific needs of the application.\n\nOverall, master-slave replication is an important technique for creating a copy of a database on a separate server, increasing the availability and scalability of the database, and providing a backup in case of failure. While it can be complex to set up and maintain, it can be a valuable tool for ensuring the reliability and performance of a database."
}

tokenizer = AutoTokenizer.from_pretrained("facebook/Self-taught-evaluator-llama3.1-70B", subfolder="dpo_model")
model = AutoModelForCausalLM.from_pretrained("facebook/Self-taught-evaluator-llama3.1-70B", subfolder="dpo_model", device_map="auto")

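# Deep-copy the template: a shallow copy would share the message dicts,
# so filling in the user turn would overwrite SELF_TAUGHT_WITH_SYSTEM_PROMPT.
conversation = copy.deepcopy(SELF_TAUGHT_WITH_SYSTEM_PROMPT)
conversation[-1]["content"] = conversation[-1]["content"].format(**example_inputs)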

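# Apply the model's chat template and move the token IDs to the model's device.
tokenized_input = tokenizer.apply_chat_template(conversation, return_tensors="pt").to(model.device)
# Greedy decoding; max_length counts prompt tokens plus generated tokens.
gen_cfg = GenerationConfig(max_length=2048, do_sample=False)

# The output sequence holds the prompt followed by the generated judgement.
judgement = model.generate(tokenized_input, generation_config=gen_cfg)
judgement_text = tokenizer.decode(judgement.cpu().tolist()[0])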

Refer to utils.py for the parsing functions that extract the model's judgement decision.
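
For orientation, here is a minimal sketch of what such parsing could look like. It is not the project's utils.py: the names `VERDICT_PATTERN` and `parse_verdict` are hypothetical, and the only assumption taken from this README is that the verdict appears as "[[A]]" or "[[B]]" after the explanation, as the system prompt requests.

```python
import re
from typing import Optional

# Hypothetical helper, not the project's utils.py: matches the verdict
# markers requested by the system prompt, "[[A]]" or "[[B]]".
VERDICT_PATTERN = re.compile(r"\[\[([AB])\]\]")


def parse_verdict(generated_text: str) -> Optional[str]:
    """Return "A" or "B" for the last verdict marker, or None if absent."""
    matches = VERDICT_PATTERN.findall(generated_text)
    # The explanation comes first and the verdict last, so keep the final match.
    return matches[-1] if matches else None


print(parse_verdict("Assistant A is more accurate. [[A]]"))
print(parse_verdict("No verdict was produced."))
```

The decoded output of `generate` for a causal model contains the prompt as well, and the system prompt itself mentions both markers. A parser like this should therefore be given only the newly generated text, for example by decoding the tokens after the first `tokenized_input.shape[-1]` positions of the output sequence.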

Citation

If you use the data, model, or code from this work, please cite it with the following BibTeX entry:

@article{wang2024self,
  title={Self-taught evaluators},
  author={Wang, Tianlu and Kulikov, Ilia and Golovneva, Olga and Yu, Ping and Yuan, Weizhe and Dwivedi-Yu, Jane and Pang, Richard Yuanzhe and Fazel-Zarandi, Maryam and Weston, Jason and Li, Xian},
  journal={arXiv preprint arXiv:2408.02666},
  year={2024}
}

License

Use of this repository and related resources is governed by the Self-Taught Evaluator Research License.