---
license: apache-2.0
datasets:
- sentiment140
language:
- en
library_name: transformers
pipeline_tag: text-classification
widget:
- text: "I liked this movie"
  output:
  - label: PROBABILITY POSITIVE
    score: 0.8
---

## Model Description

TweeBERTa is a fine-tuned version of the RoBERTa base model, tailored for sentiment analysis. It was trained on the Sentiment140 dataset, so it is best suited to understanding and categorizing sentiment in text from social media.

## Training and Evaluation

### Training Data

The model was trained on the Sentiment140 dataset, a popular dataset for sentiment analysis, especially of tweets.

### Training Procedure

- **Loss Function:** Binary Cross Entropy Loss
- **Optimizer:** Adam
- **Learning Rate Schedule:** Linear decrease, starting at 1e-5 and ending at 1e-7
- **Epochs:** 10 epochs in total, split into two cycles of 5 epochs each, with the same learning rate cycle for both.

### Performance

The model achieved the following metrics on the evaluation set:

- **Precision:** 0.8328
- **Recall:** 0.8687
- **F1 Score:** 0.8504
- **Accuracy:** 0.8471

## How to Use

This model is intended for sentiment analysis, particularly of social media posts and other short text snippets. It can be used directly through the transformers library. An example usage is provided in the widget section of this card.

## Limitations and Bias

While the model performs well on the Sentiment140 dataset, it may not generalize as well to texts from other domains or to texts that contain complex or subtle expressions of sentiment. Users should also be aware of potential biases inherent in the training data, which may be reflected in the model's predictions.

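
One way to probe these limitations is to run the model on a few samples from your own domain. The following is a minimal sketch of the transformers usage mentioned under How to Use, not a reference implementation. It makes several assumptions: `MODEL_ID` is a hypothetical placeholder (this card does not state the Hub repository ID), the returned `score` is treated as the probability of positive sentiment (suggested by the widget label), and the 0.5 threshold and the helper names `load_classifier`, `to_sentiment`, and `classify` are illustrative only.

```python
from typing import Callable, List

# Hypothetical placeholder: the card does not state the Hub repository ID.
MODEL_ID = "your-namespace/TweeBERTa"

# Assumed decision threshold on the positive-class probability.
POSITIVE_THRESHOLD = 0.5


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the pure-Python helpers below work without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def to_sentiment(prediction: dict, threshold: float = POSITIVE_THRESHOLD) -> str:
    """Map one prediction to a coarse label, assuming score = P(positive)."""
    return "positive" if prediction["score"] >= threshold else "negative"


def classify(texts: List[str], classifier: Callable[[List[str]], List[dict]]) -> List[dict]:
    """Run the classifier on a batch and attach a thresholded sentiment to each text."""
    predictions = classifier(texts)
    return [
        {"text": text, "score": pred["score"], "sentiment": to_sentiment(pred)}
        for text, pred in zip(texts, predictions)
    ]
```

With the real repository ID filled in, `classify(["I liked this movie"], load_classifier())` would return one row per input text. Comparing the thresholded labels against a small hand-labeled sample is a cheap first check of how well the model transfers to a new domain.
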
## Ethical Considerations

This model should be used responsibly, considering potential biases and the impact of automated sentiment analysis in various applications, particularly those affecting human decision-making.

## Acknowledgements

This model was fine-tuned and evaluated by Aditya Patkar. The base RoBERTa model and the Sentiment140 dataset were important in developing this model. The training notebook, along with a comprehensive comparative analysis of different models on the Sentiment140 dataset, can be found at https://github.com/adityapatkar/SentimentSifter.

---
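
For readers who want to reproduce the learning rate schedule listed under Training Procedure, the sketch below shows one possible reading of it: a linear decrease from 1e-5 to 1e-7 over 5 epochs, repeated for a second 5-epoch cycle. It assumes the rate changes once per epoch; the actual training run may have decayed it per optimizer step instead, and the training notebook linked above is the authoritative source. The names `lr_for_epoch` and `schedule` are illustrative and do not come from that notebook.

```python
START_LR = 1e-5        # learning rate at the start of each cycle
END_LR = 1e-7          # learning rate at the end of each cycle
EPOCHS_PER_CYCLE = 5   # epochs in one linear-decay cycle
NUM_CYCLES = 2         # the same cycle is run twice, 10 epochs in total


def lr_for_epoch(epoch: int) -> float:
    """Return the learning rate for a zero-based epoch index (per-epoch decay assumed)."""
    total_epochs = EPOCHS_PER_CYCLE * NUM_CYCLES
    if not 0 <= epoch < total_epochs:
        raise ValueError(f"epoch must be in [0, {total_epochs}), got {epoch}")
    position = epoch % EPOCHS_PER_CYCLE            # position inside the current cycle
    fraction = position / (EPOCHS_PER_CYCLE - 1)   # 0.0 at cycle start, 1.0 at cycle end
    # Linear interpolation between the start and end rates.
    return START_LR * (1.0 - fraction) + END_LR * fraction


# Learning rate for each of the 10 epochs.
schedule = [lr_for_epoch(epoch) for epoch in range(EPOCHS_PER_CYCLE * NUM_CYCLES)]
```

Under this reading, the rate resets to 1e-5 at the start of the second cycle (epoch index 5) and reaches 1e-7 again at the final epoch.
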