<div style="text-align: center; max-width: 650px; margin: 0 auto;">
    <div>
        <h1 style="font-weight: 900; font-size: 3rem; margin: 20px;">
            PorPos tagger
        </h1>
        <p class="slogan">A Brazilian Portuguese part-of-speech tagger following the Universal
            Dependencies tagset</p>
    </div>
    <p style="margin-top: 30px; margin-bottom: 10px; font-size: 94%; text-align: left;">
        The PorPos (Porttinari Part-Of-Speech) tagger was trained on the <a
            href="https://sites.google.com/icmc.usp.br/poetisa/resources-and-tools">Porttinari-base</a> corpus, a
        collection of news articles extracted from the Folha de São Paulo newspaper website. The model is a
        fine-tuned version of <a href="https://huggingface.co/neuralmind/bert-base-portuguese-cased">BERTimbau</a>
        that receives a sequence of tokens and outputs a part-of-speech tag for each one. Because the model expects
        pre-tokenized input, <a href="https://spacy.io/models/pt">spaCy's</a> Portuguese tokenizer is used to split
        the input text into tokens.
    </p>
</div>