OCR on image
#28 by glitchyordis - opened
Obtaining the key information is quite straightforward, but is there a way to obtain bbox locations for the detected text?
glitchyordis changed discussion title from "OCR text" to "OCR on image"
You can prompt the model to return bbox locations (see here: https://huggingface.co/spaces/maxiw/Qwen2-VL-Detection). I also tried the prompt "detect all texts", but the resulting boxes are not very precise.
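If you want to use the returned boxes downstream, here is a minimal sketch of pulling them out of the generated text. It assumes the model emits each box as `<|object_ref_start|>label<|object_ref_end|><|box_start|>(x1,y1),(x2,y2)<|box_end|>` with coordinates normalized to a 0-1000 range. That is my understanding of the Qwen2-VL grounding format, so please check it against your own decoded output (with special tokens kept). `parse_boxes` and its parameters are hypothetical names for illustration, not part of any library.

```python
import re

# Assumed output pattern (verify against your own decoded output):
# <|object_ref_start|>label<|object_ref_end|><|box_start|>(x1,y1),(x2,y2)<|box_end|>
BOX_PATTERN = re.compile(
    r"<\|object_ref_start\|>(?P<label>.*?)<\|object_ref_end\|>\s*"
    r"<\|box_start\|>\((?P<x1>\d+),\s*(?P<y1>\d+)\),\s*"
    r"\((?P<x2>\d+),\s*(?P<y2>\d+)\)<\|box_end\|>",
    re.DOTALL,
)


def parse_boxes(model_output, image_width, image_height, scale=1000):
    """Extract (text, pixel bbox) pairs from the generated string.

    `scale` is the assumed normalization range of the coordinates.
    """
    boxes = []
    for match in BOX_PATTERN.finditer(model_output):
        x1, y1, x2, y2 = (int(match.group(k)) for k in ("x1", "y1", "x2", "y2"))
        boxes.append(
            {
                "text": match.group("label").strip(),
                # Rescale normalized coordinates to pixel coordinates.
                "bbox": (
                    round(x1 * image_width / scale),
                    round(y1 * image_height / scale),
                    round(x2 * image_width / scale),
                    round(y2 * image_height / scale),
                ),
            }
        )
    return boxes
```

Call it as `parse_boxes(decoded_text, image.width, image.height)`. If your output uses a different coordinate convention, adjust `scale` or the pattern accordingly.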
I tried OCR on a not very clear text screenshot, and it worked nearly perfectly. However, the model does not seem to be good at recognizing twisted text, e.g. words on a bottle.