torch
transformers
huggingface_hub
gdown
pymupdf
unidecode
pdf2image
poppler-utils
datasets
vncorenlp
accelerate
pytorch-crf==0.7.2
sklearn-crfsuite