---
license: mit
language:
- ru
tags:
- PyTorch
- Transformers
---

# BERT base model for pair ranking (reward model for RLHF) in Russian

Training uses a [pair ranking loss](https://pytorch.org/docs/stable/generated/torch.nn.MarginRankingLoss.html); a minimal sketch of how this loss is applied is given at the end of this card.

The model is based on [ruBert-base](https://huggingface.co/sberbank-ai/ruBert-base).

The following datasets were translated with the Google Translate API and used for reward training:
- [Anthropic/hh-rlhf](https://huggingface.co/datasets/Anthropic/hh-rlhf)
- [Dahoas/synthetic-instruct-gptj-pairwise](https://huggingface.co/datasets/Dahoas/synthetic-instruct-gptj-pairwise)
- [openai/webgpt_comparisons](https://huggingface.co/datasets/openai/webgpt_comparisons)

First, download the model locally. You can do it manually, or:
- `git lfs install`
- `git clone https://huggingface.co/Andrilko/ruBert-base-reward`

Alternatively, see [this manual](https://huggingface.co/docs/hub/models-downloading).

## Usage (HuggingFace Models Repository)

You can use the model directly from the model repository to compute a reward score:

```python
# Custom model class: ruBert-base encoder with a ranking head on the [CLS] token.
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

class RewardModel(nn.Module):
    def __init__(self, model_name):
        super(RewardModel, self).__init__()
        self.checkpoint = model_name
        self.bert = AutoModel.from_pretrained(model_name, return_dict=False)
        self.layer_norm = nn.LayerNorm(768)
        self.dropout = nn.Dropout(0.3)
        self.dense = nn.Sequential(
            nn.Linear(768, 512),
            nn.LeakyReLU(negative_slope=0.01),
            nn.Dropout(0.3),
            nn.Linear(512, 1),
            nn.Sigmoid()
        )

    def forward(self, input_ids, token_type_ids, attention_mask):
        model_output = self.bert(input_ids=input_ids,
                                 token_type_ids=token_type_ids,
                                 attention_mask=attention_mask)
        last_hidden_states = model_output[0]
        # Take the [CLS] token representation as the pooled output.
        pooled_output = last_hidden_states[:, 0]
        pooled_output = self.layer_norm(pooled_output)
        pooled_output = self.dropout(pooled_output)
        preds = self.dense(pooled_output)
        return preds

# Create the model object and load the pretrained reward weights:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
reward_name = "ai-forever/ruBert-base"
tokenizer = AutoTokenizer.from_pretrained(reward_name)
model = RewardModel(reward_name)
model.load_state_dict(torch.load('./ruBert-base-reward/pytorch_model.bin', map_location=device))
model.to(device)
model.eval()  # disable dropout for scoring

# Dialogue pair (prompt, answer) that we want to score:
sentences = ['Человек: Что такое QR-код?', 'Ассистент: QR-код - это тип матричного штрих-кода.']

# Compute the reward score:
with torch.no_grad():
    encoded_input = tokenizer(sentences[0], sentences[1],
                              truncation=True, add_special_tokens=True,
                              max_length=512, padding='max_length',
                              return_tensors='pt')
    encoded_input = encoded_input.to(device)
    score = model(**encoded_input).cpu().flatten().numpy()
print(score)
```

# Authors
+ Aleksandr Abramov: [Github](https://github.com/Ab1992ao), [Kaggle Competitions Master](https://www.kaggle.com/andrilko)
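
## Pair ranking loss (training sketch)

The card above only covers inference. Below is a minimal, hypothetical sketch of how the `torch.nn.MarginRankingLoss` mentioned at the top of this card can be applied to pairwise reward training. It reuses `model`, `tokenizer`, and `device` from the usage example; the margin, learning rate, and data names (`prompts`, `chosen_texts`, `rejected_texts`) are illustrative assumptions, not the exact training code of this model.

```python
# Hypothetical pairwise training step with MarginRankingLoss (assumptions noted above).
import torch
import torch.nn as nn

criterion = nn.MarginRankingLoss(margin=0.1)                # margin is an assumed value
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)  # learning rate is an assumed value

def training_step(prompts, chosen_texts, rejected_texts):
    """One pairwise update: the chosen answer should score above the rejected one."""
    model.train()

    # Encode (prompt, answer) pairs exactly as in the inference example above.
    chosen = tokenizer(prompts, chosen_texts, truncation=True, max_length=512,
                       padding='max_length', return_tensors='pt').to(device)
    rejected = tokenizer(prompts, rejected_texts, truncation=True, max_length=512,
                         padding='max_length', return_tensors='pt').to(device)

    chosen_scores = model(**chosen).flatten()
    rejected_scores = model(**rejected).flatten()

    # target = 1 means the first input (chosen) should be ranked higher than the second.
    target = torch.ones_like(chosen_scores)
    loss = criterion(chosen_scores, rejected_scores, target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```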