invoice_document_headers_extraction_with_donut

Running

App Files Files Community

to-be commited on Feb 13, 2023

Commit

f4ccbdc

•

1 Parent(s): a9a646c

Update app.py

Browse files

Files changed (1) hide show

app.py +3 -3

app.py CHANGED Viewed

@@ -60,9 +60,7 @@ def process_document(image):
     return processor.token2json(sequence), image
-description = '<p>Using Donut model finetuned on Invoices for retrieval of following information:</p><ul><li><span style="color:black">DocType</span></span></li><li><span style="color:black">Currency</span></span></li><li><span style="color:black">DocumentDate</span></span></li><li><span style="color:black">GrossAmount</span></span></li><li><span style="color:black">InvoiceNumber</span></span></li><li><span style="color:black">NetAmount</span></span></li><li><span style="color:black">TaxAmount</span></span></li><li><span style="color:black">OrderNumber</span></span></li><li><span style="color:black">CreditorCountry</span></span></li></ul><p>To use it, simply upload your image and click &#39;submit&#39;, or click one of the examples to load them. Read more at the links below.</p><p>&nbsp;</p><p>(because this is running on the free cpu tier, it will take about 40 secs before you see a result)</p><p>Have fun&nbsp;😎</p><p>Toon Beerten</p>'
-article = "<p style='text-align: center'><a href='https://arxiv.org/abs/2111.15664' target='_blank'>Donut: OCR-free Document Understanding Transformer</a> | <a href='https://github.com/clovaai/donut' target='_blank'>Github Repo</a></p>"
-title = "Demo: Donut 🍩 for invoice header retrieval"
 paragraph1 = '<p>Basic idea of this 🍩 model is to give it an image as input and extract indexes as text. No bounding boxes or confidences are generated.<br /> For more info, see the <a href="https://arxiv.org/abs/2111.15664">original paper</a>&nbsp;and the 🤗&nbsp;<a href="https://huggingface.co/naver-clova-ix/donut-base">model</a>.</p>'
 paragraph2 = '<p><strong>Training</strong>:<br />The model was trained with a few thousand of annotated invoices and non-invoices (for those the doctype will be &#39;Other&#39;). They span across different countries and languages. They are always one page only. The dataset is proprietary unfortunately.&nbsp;Model is set to input resolution of 1280x1920 pixels. So any sample you want to try with higher dpi than 150 has no added value.<br />It was trained for about 4 hours on a&nbsp;NVIDIA RTX A4000 for 20k steps with a val_metric of&nbsp;0.03413819904382196 at the end.<br />The <u>following indexes</u> were included in the train set:</p><ul><li><span style="font-family:Calibri"><span style="color:black">DocType</span></span></li><li><span style="font-family:Calibri"><span style="color:black">Currency</span></span></li><li><span style="font-family:Calibri"><span style="color:black">DocumentDate</span></span></li><li><span style="font-family:Calibri"><span style="color:black">GrossAmount</span></span></li><li><span style="font-family:Calibri"><span style="color:black">InvoiceNumber</span></span></li><li><span style="font-family:Calibri"><span style="color:black">NetAmount</span></span></li><li><span style="font-family:Calibri"><span style="color:black">TaxAmount</span></span></li><li><span style="font-family:Calibri"><span style="color:black">OrderNumber</span></span></li><li><span style="font-family:Calibri"><span style="color:black">CreditorCountry</span></span></li></ul>'
 #demo = gr.Interface(fn=process_document,inputs=gr_image,outputs="json",title="Demo: Donut 🍩 for invoice header retrieval", description=description,
@@ -77,6 +75,8 @@ css = "#inp {height: auto !important; width: 100% !important;}"
 with gr.Blocks(css=css) as demo:
     gr.HTML(paragraph1)
     gr.HTML(paragraph2)
     gr.HTML(paragraph3)

     return processor.token2json(sequence), image
+title = '<h1 style="text-align:center"><img alt="" src="circling_small.gif" />Welcome<img alt="" src="circling2_small.gif" /></h1>'
 paragraph1 = '<p>Basic idea of this 🍩 model is to give it an image as input and extract indexes as text. No bounding boxes or confidences are generated.<br /> For more info, see the <a href="https://arxiv.org/abs/2111.15664">original paper</a>&nbsp;and the 🤗&nbsp;<a href="https://huggingface.co/naver-clova-ix/donut-base">model</a>.</p>'
 paragraph2 = '<p><strong>Training</strong>:<br />The model was trained with a few thousand of annotated invoices and non-invoices (for those the doctype will be &#39;Other&#39;). They span across different countries and languages. They are always one page only. The dataset is proprietary unfortunately.&nbsp;Model is set to input resolution of 1280x1920 pixels. So any sample you want to try with higher dpi than 150 has no added value.<br />It was trained for about 4 hours on a&nbsp;NVIDIA RTX A4000 for 20k steps with a val_metric of&nbsp;0.03413819904382196 at the end.<br />The <u>following indexes</u> were included in the train set:</p><ul><li><span style="font-family:Calibri"><span style="color:black">DocType</span></span></li><li><span style="font-family:Calibri"><span style="color:black">Currency</span></span></li><li><span style="font-family:Calibri"><span style="color:black">DocumentDate</span></span></li><li><span style="font-family:Calibri"><span style="color:black">GrossAmount</span></span></li><li><span style="font-family:Calibri"><span style="color:black">InvoiceNumber</span></span></li><li><span style="font-family:Calibri"><span style="color:black">NetAmount</span></span></li><li><span style="font-family:Calibri"><span style="color:black">TaxAmount</span></span></li><li><span style="font-family:Calibri"><span style="color:black">OrderNumber</span></span></li><li><span style="font-family:Calibri"><span style="color:black">CreditorCountry</span></span></li></ul>'
 #demo = gr.Interface(fn=process_document,inputs=gr_image,outputs="json",title="Demo: Donut 🍩 for invoice header retrieval", description=description,
 with gr.Blocks(css=css) as demo:
+    gr.HTML(title)
     gr.HTML(paragraph1)
     gr.HTML(paragraph2)
     gr.HTML(paragraph3)