---
license: apache-2.0
---

model base: https://huggingface.co/microsoft/mdeberta-v3-base

dataset: https://github.com/ramybaly/Article-Bias-Prediction

training parameters:
- devices: 2xH100
- batch_size: 100
- epochs: 5
- dropout: 0.05
- max_length: 512
- learning_rate: 3e-5
- warmup_steps: 100
- random_state: 239

training methodology:
- sanitize the dataset following a specific rule set; use the random split provided with the dataset
- train on the train split and evaluate on the validation split after each epoch
- evaluate the test split only with the model that performed best on validation loss (a reproduction sketch follows the usage example below)

result summary:
- across the five training epochs, the epoch-2 model achieved the lowest validation loss, 0.2573
- on the test split, the epoch-2 model achieved an F1 score of 0.9184 and a test loss of 0.2904

usage:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model = AutoModelForSequenceClassification.from_pretrained("premsa/political-bias-prediction-allsides-mDeBERTa")
tokenizer = AutoTokenizer.from_pretrained("premsa/political-bias-prediction-allsides-mDeBERTa")
nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)
# German example input ("the masses are controlled by the media."); the mDeBERTa base model is multilingual
print(nlp("die massen werden von den medien kontrolliert."))
```
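The pipeline returns a list of dictionaries with `label` and `score` fields; the label names come from the model's `id2label` configuration.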
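training sketch: a minimal, non-authoritative reconstruction of the setup described above, using the Hugging Face `Trainer`. The `load_splits` helper, the `num_labels=3` label set, the per-device batch-size split, and the macro-averaged F1 are assumptions; the card does not publish the training script or the sanitization rule set.

```python
import numpy as np
from sklearn.metrics import f1_score
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/mdeberta-v3-base",
    num_labels=3,               # assumption: left / center / right classes
    hidden_dropout_prob=0.05,   # dropout from the parameter list above
)

# hypothetical helper: load, sanitize, and tokenize (max_length=512) the
# Article-Bias-Prediction random splits; the actual rule set is not published
train_ds, val_ds, test_ds = load_splits(tokenizer)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"f1": f1_score(labels, preds, average="macro")}  # assumption: macro F1

args = TrainingArguments(
    output_dir="mdeberta-allsides",
    per_device_train_batch_size=50,  # assumption: 2 devices x 50 = batch size 100
    per_device_eval_batch_size=50,
    num_train_epochs=5,
    learning_rate=3e-5,
    warmup_steps=100,
    seed=239,
    evaluation_strategy="epoch",     # evaluate on the validation split each epoch
    save_strategy="epoch",
    load_best_model_at_end=True,     # reload the checkpoint with the lowest
    metric_for_best_model="eval_loss",  # validation loss after training
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    compute_metrics=compute_metrics,
)
trainer.train()
# the test split is evaluated only once, with the best-on-validation checkpoint
print(trainer.evaluate(test_ds))
```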