---
library_name: transformers
model-index:
- name: internlm-chatbode-7b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: ENEM Challenge (No Images)
      type: eduagarcia/enem_challenge
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 63.05
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BLUEX (No Images)
      type: eduagarcia-temp/BLUEX_without_images
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 51.46
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: OAB Exams
      type: eduagarcia/oab_exams
      split: train
      args:
        num_few_shot: 3
    metrics:
    - type: acc
      value: 42.32
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 RTE
      type: assin2
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 91.33
      name: f1-macro
    source:
      url: >-
        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Assin2 STS
      type: eduagarcia/portuguese_benchmark
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: pearson
      value: 80.69
      name: pearson
    source:
      url: >-
        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: FaQuAD NLI
      type: ruanchaves/faquad-nli
      split: test
      args:
        num_few_shot: 15
    metrics:
    - type: f1_macro
      value: 79.8
      name: f1-macro
    source:
      url: >-
        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HateBR Binary
      type: ruanchaves/hatebr
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 87.99
      name: f1-macro
    source:
      url: >-
        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: PT Hate Speech Binary
      type: hate_speech_portuguese
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 68.09
      name: f1-macro
    source:
      url: >-
        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
      name: Open Portuguese LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: tweetSentBR
      type: eduagarcia/tweetsentbr_fewshot
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: f1_macro
      value: 61.11
      name: f1-macro
    source:
      url: >-
        https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard?query=recogna-nlp/internlm-chatbode-7b
      name: Open Portuguese LLM Leaderboard
language:
- pt
pipeline_tag: text-generation
---

# internlm-chatbode-7b
InternLm-ChatBode is a language model tuned for Portuguese, built on top of [InternLM2](https://huggingface.co/internlm/internlm2-chat-7b). It was fine-tuned on the UltraAlpaca dataset.

## Main Features

- **Base model:** [internlm/internlm2-chat-7b](https://huggingface.co/internlm/internlm2-chat-7b)
- **Fine-tuning dataset:** UltraAlpaca
- **Training:** internlm2-chat-7b was fine-tuned using QLoRA.

## Usage Example

The following example shows how to load and use the model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# trust_remote_code=True is required because the model ships its own modeling code.
tokenizer = AutoTokenizer.from_pretrained("recogna-nlp/internlm-chatbode-7b", trust_remote_code=True)
# Load the weights in half precision and move the model to the GPU.
model = AutoModelForCausalLM.from_pretrained("recogna-nlp/internlm-chatbode-7b", torch_dtype=torch.float16, trust_remote_code=True).cuda()
model = model.eval()

# First turn: start with an empty history.
response, history = model.chat(tokenizer, "Olá", history=[])
print(response)

# Second turn: pass the returned history to keep the conversation context.
response, history = model.chat(tokenizer, "O que é o Teorema de Pitágoras? Me dê um exemplo", history=history)
print(response)
```

Responses can also be streamed with the `stream_chat` method:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "recogna-nlp/internlm-chatbode-7b"

model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16, trust_remote_code=True).cuda()
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = model.eval()

# Each iteration yields the full response so far, so print only the new suffix.
length = 0
for response, history in model.stream_chat(tokenizer, "Olá", history=[]):
    print(response[length:], flush=True, end="")
    length = len(response)
```

# Open Portuguese LLM Leaderboard Evaluation Results

Detailed results can be found [here](https://huggingface.co/datasets/eduagarcia-temp/llm_pt_leaderboard_raw_results/tree/main/recogna-nlp/internlm-chatbode-7b) and on the [🚀 Open Portuguese LLM Leaderboard](https://huggingface.co/spaces/eduagarcia/open_pt_llm_leaderboard).

| Metric                     | Value     |
|----------------------------|-----------|
| Average                    | **69.54** |
| ENEM Challenge (No Images) | 63.05     |
| BLUEX (No Images)          | 51.46     |
| OAB Exams                  | 42.32     |
| Assin2 RTE                 | 91.33     |
| Assin2 STS                 | 80.69     |
| FaQuAD NLI                 | 79.80     |
| HateBR Binary              | 87.99     |
| PT Hate Speech Binary      | 68.09     |
| tweetSentBR                | 61.11     |

## Citation

If you use Chatbode in your research, please cite it as follows:

```
@misc {chatbode_2024,
    author    = { Gabriel Lino Garcia and Pedro Henrique Paiola and João Paulo Papa },
    title     = { Chatbode },
    year      = {2024},
    url       = { https://huggingface.co/recogna-nlp/internlm-chatbode-7b/ },
    doi       = { 10.57967/hf/3317 },
    publisher = { Hugging Face }
}
```
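
## Checking the reported average

As a sanity check on the leaderboard table in the evaluation results section, the reported average appears to be the plain arithmetic mean of the nine task scores. The snippet below is a minimal sketch under that assumption; the `scores` dictionary is only an illustrative name holding values copied from the table, not part of the model or the leaderboard tooling.

```python
from statistics import mean

# Per-task scores copied from the evaluation results table.
scores = {
    "ENEM Challenge (No Images)": 63.05,
    "BLUEX (No Images)": 51.46,
    "OAB Exams": 42.32,
    "Assin2 RTE": 91.33,
    "Assin2 STS": 80.69,
    "FaQuAD NLI": 79.80,
    "HateBR Binary": 87.99,
    "PT Hate Speech Binary": 68.09,
    "tweetSentBR": 61.11,
}

# Assumption: the leaderboard average is an unweighted mean over all tasks.
average = round(mean(scores.values()), 2)
print(f"Average: {average:.2f}")  # prints "Average: 69.54"
```

The result matches the 69.54 shown in the table, which suggests the tasks are weighted equally even though they mix accuracy, macro-F1, and Pearson correlation.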