---
datasets:
- e9t/nsmc
language:
- ko
metrics:
- accuracy
pipeline_tag: text-classification
---
## Model Description
- **Fine-tuned from:** [klue/bert-base](https://huggingface.co/klue/bert-base)
- **Test accuracy:** 0.9041
## Uses
- Sentiment analysis of Korean movie reviews (binary classification: negative vs. positive)
## How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("seongyeon1/klue-base-finetuned-nsmc")
model = AutoModelForSequenceClassification.from_pretrained("seongyeon1/klue-base-finetuned-nsmc")
```
```python
from transformers import pipeline
pipe = pipeline("text-classification", model="seongyeon1/klue-base-finetuned-nsmc")
pipe("진짜 별로더라")  # "It was really bad" → [{'label': 'LABEL_0', 'score': 0.999700665473938}]
pipe("굿굿")  # "Good, good" → [{'label': 'LABEL_1', 'score': 0.9875587224960327}]
```
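The model's config does not define human-readable label names, so the pipeline returns generic `LABEL_0`/`LABEL_1` ids; from the examples above, `LABEL_0` corresponds to negative and `LABEL_1` to positive. A minimal post-processing sketch (the `label_names` mapping and `readable` helper are illustrative assumptions, not part of the model):

```python
# Map the model's generic label ids to sentiment names.
# LABEL_0 = negative, LABEL_1 = positive (inferred from the examples above).
label_names = {"LABEL_0": "negative", "LABEL_1": "positive"}

def readable(results):
    """Replace 'LABEL_*' ids with sentiment names in pipeline output."""
    return [{**r, "label": label_names[r["label"]]} for r in results]
```

For example, `readable(pipe("굿굿"))` would report `'positive'` instead of `'LABEL_1'`.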
## Training Details
### Training Data
- NSMC, the Naver Sentiment Movie Corpus ([e9t/nsmc](https://huggingface.co/datasets/e9t/nsmc))
```python
from datasets import load_dataset
dataset = load_dataset('e9t/nsmc')
```
#### Preprocessing
- BERT's default maximum sequence length is 512, but padding every review to 512 tokens wastes compute.
- The token-length distribution below shows that most reviews are short, so `maxlen = 55` is used.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/634330a304d4ff28aeb8de56/t7axSlo4JI4bPLynUB3OP.png)
```python
maxlen = 55

def tokenize_function_with_max(examples, maxlen=maxlen):
    encodings = tokenizer(examples['document'], max_length=maxlen,
                          truncation=True, padding='max_length')
    return encodings
```
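The effect of `truncation=True, padding='max_length'` on the token ids can be illustrated with a small pure-Python sketch (the `pad_or_truncate` helper is hypothetical, for illustration only; the tokenizer does this internally):

```python
def pad_or_truncate(ids, maxlen, pad_id=0):
    """Cut a token-id list to maxlen, then right-pad with pad_id."""
    ids = ids[:maxlen]
    return ids + [pad_id] * (maxlen - len(ids))

# A short sequence is padded up, a long one is cut down:
pad_or_truncate([1, 2, 3], 5)        # [1, 2, 3, 0, 0]
pad_or_truncate(list(range(10)), 5)  # [0, 1, 2, 3, 4]
```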
#### Training Hyperparameters
- learning rate: 2e-5
- weight decay: 0.01
- batch size: 32
- epochs: 2
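With the 🤗 `Trainer` API, these settings map onto `TrainingArguments` roughly as follows (a sketch; the `output_dir` name and the eval batch size are assumptions, not taken from the original training script):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="klue-base-finetuned-nsmc",  # assumed name
    learning_rate=2e-5,
    weight_decay=0.01,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,  # assumption
    num_train_epochs=2,
)
```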
#### Metrics
- **Accuracy**: the label distribution is nearly balanced (see below), so plain accuracy is a suitable metric.
![image/png](https://cdn-uploads.huggingface.co/production/uploads/634330a304d4ff28aeb8de56/_S5TTyec8I25Kx-yaqeJo.png)
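With `Trainer`, accuracy is typically supplied through a `compute_metrics` callback; a minimal NumPy sketch of such a function (not necessarily the exact one used for this model):

```python
import numpy as np

def compute_metrics(eval_pred):
    """Accuracy from the (logits, labels) pair the HF Trainer passes in."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # predicted class per example
    return {"accuracy": float((preds == labels).mean())}
```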
#### Result
```python
{'eval_loss': 0.2575262784957886,
 'eval_accuracy': 0.9041,
 'eval_runtime': 163.2129,
 'eval_samples_per_second': 306.348,
 'eval_steps_per_second': 9.576,
 'epoch': 2.0}
```