---
library_name: tf-keras
pipeline_tag: text-classification
widget:
- text: Electronic cigarettes (also known as vapes, vaporizers, or vape pens) were introduced into the US market in 2007.
  output:
  - label: Establishing a Research Territory
    score: 0.9
  - label: Establishing a Niche
    score: 0.05
  - label: Occupying the Niche
    score: 0.05
license: mit
datasets:
- stormsidali2001/IMRAD-introduction-sentences-moves-sub-moves-dataset
language:
- en
metrics:
- f1
- accuracy
base_model: google-bert/bert-base-cased
---

## IMRaD Introduction Move Classifier

This model is a fine-tuned BERT model designed to classify sentences from the introductions of scientific research papers into one of three IMRaD moves:

* **Establishing a Research Territory:** Setting the context and background information for the research.
* **Establishing a Niche:** Identifying a gap or problem in existing research.
* **Occupying the Niche:** Proposing a solution or approach to address the identified gap.

## Intended Uses & Limitations

**Intended Uses:**

* **Scientific Writing Assistance:** Help researchers and students analyze and improve the structure of their introductions by identifying the IMRaD moves present in each sentence.
* **Literature Review Analysis:** Assist in quickly understanding the rhetorical structure of introductions in a set of research papers.
* **Educational Tool:** Illustrate IMRaD concepts and their practical application in scientific writing.

**Limitations:**

* **Domain Specificity:** The model was trained on sentences from the introductions of scientific research papers and might not perform as well on other types of text.
* **Accuracy:** Although the model achieves good accuracy, it still makes mistakes. Review its predictions carefully, especially for complex or ambiguous sentences.
* **Sentence-Level Classification:** The model classifies individual sentences. It does not provide an overall analysis of the entire introduction.
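
Because the model works one sentence at a time, any view of a whole introduction has to be assembled by the caller. The sketch below shows one possible way to do that: classify each sentence, then count how often each move appears and which moves are absent. The names `summarize_moves` and `toy_classify` are hypothetical and not part of this model; `toy_classify` is a keyword-based stand-in used only so the example runs without downloading anything.

```python
from collections import Counter
from typing import Callable, Dict, List

# The three moves listed in this model card.
MOVES = [
    "Establishing a Research Territory",
    "Establishing a Niche",
    "Occupying the Niche",
]


def summarize_moves(sentences: List[str], classify: Callable[[str], str]) -> Dict:
    """Classify each sentence and tally how often each move occurs."""
    labels = [classify(sentence) for sentence in sentences]
    counts = Counter(labels)
    return {
        "labels": labels,
        "counts": {move: counts.get(move, 0) for move in MOVES},
        "missing": [move for move in MOVES if counts.get(move, 0) == 0],
    }


def toy_classify(sentence: str) -> str:
    """Hypothetical stand-in for the real model (assumption: crude keyword rules)."""
    text = sentence.lower()
    if "however" in text or "little is known" in text:
        return MOVES[1]
    if "we propose" in text or "this paper" in text:
        return MOVES[2]
    return MOVES[0]


introduction = [
    "Electronic cigarettes were introduced into the US market in 2007.",
    "However, little is known about their long-term effects.",
    "In this paper, we propose a longitudinal study design.",
]
summary = summarize_moves(introduction, toy_classify)
print(summary["counts"])
print(summary["missing"])
```

To use the real model, pass a function that wraps the `transformers` pipeline instead of `toy_classify`, for example one that returns `classifier(sentence)[0]["label"]`. This assumes the label strings the model emits match the move names above, which should be checked against the model's configuration.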

## Training and Evaluation Data

The model was trained and evaluated on the "IMRAD Introduction Sentences Moves & Sub-moves Dataset" available on Hugging Face: [https://huggingface.co/datasets/stormsidali2001/IMRAD-introduction-sentences-moves-sub-moves-dataset](https://huggingface.co/datasets/stormsidali2001/IMRAD-introduction-sentences-moves-sub-moves-dataset)

The dataset consists of sentences extracted from scientific research paper introductions, manually labeled with their corresponding IMRaD moves.

**Training Details:**

* The `bert-base-cased` model from Google was used as the base model.
* Fine-tuning was performed using a TensorFlow/Keras implementation. 
* Evaluation metrics include F1 score and accuracy.
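
For readers who want to reproduce the evaluation on their own predictions, here is a minimal sketch of the two metrics in plain Python. The card does not say how F1 is averaged across the three classes, so macro averaging is an assumption here, and the label IDs and example predictions are made up for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int], labels: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 scores (assumption: macro averaging)."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # A class with no true or predicted examples contributes 0.0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative label IDs (assumption): 0 = territory, 1 = niche, 2 = occupying.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(f"accuracy: {accuracy(y_true, y_pred):.3f}")
print(f"macro F1: {macro_f1(y_true, y_pred, labels=[0, 1, 2]):.3f}")
```

Macro averaging weights each move equally regardless of how many sentences carry it, which matters if the moves are unevenly represented in the data.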

## How to Use

You can use this model with the `pipeline` function from the `transformers` library:

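```python
from transformers import pipeline

# Replace the placeholder with this model's repository ID on the Hugging Face Hub.
classifier = pipeline("text-classification", model="your-username/your-model-name")

sentence = "Electronic cigarettes were introduced into the US market in 2007."

# The pipeline returns a list with one dict per input, holding the top "label" and its "score".
result = classifier(sentence)

print(result)
```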