Kansallisarkisto/censusrecords-table-detection

Text column and row line intersection detection from Finnish census records from the 1930s

The model is trained to find the intersection points of table column and cell lines in digitized census record documents from the 1930s. It was trained using YOLOv8x by Ultralytics as the base model.

Intended uses & limitations

The model has been trained to detect intersection points in specific kinds of tables, and it is unlikely to generalize well to table layouts that differ substantially from them.

Training data

The training dataset consisted of 218 digitized and annotated documents containing tables, while the validation dataset contained 25 annotated document images.

Training procedure

This model was trained using 2 NVIDIA RTX A6000 GPUs with the following hyperparameters:

image size: 2560
initial learning rate (lr0): 0.00098
final learning rate as a fraction of lr0 (lrf): 0.01285
maximum number of detections per image (max_det): 500
train batch size: 2
epochs: 100
patience: 30 epochs
warmup_epochs: 3.91327
optimizer: AdamW
workers: 4
momentum: 0.90725
warmup_momentum: 0.72051
weight_decay: 0.00061
box loss weight (box): 9.34214
classification loss weight (cls): 0.34133
distribution focal loss weight (dfl): 1.83008
hue augment (hsv_h): 0.01126
saturation augment (hsv_s): 0.84221
brightness augment (hsv_v): 0.435
translation augment (translate): 0.11692
scale augment (scale): 0.45713
flip augment (fliplr): 0.38368
mosaic augment (mosaic): 0.77082

Default settings were used for the other training hyperparameters (see the Ultralytics training documentation for more information).
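In Ultralytics, lrf is a multiplier on lr0, not an absolute learning rate, so the final learning rate is lr0 * lrf. The sketch below illustrates what the listed values imply, assuming the linear decay schedule that is used when cosine scheduling is not enabled and ignoring the warmup phase. The function name linear_lr is hypothetical and is not part of the Ultralytics API.

```python
def linear_lr(epoch, epochs=100, lr0=0.00098, lrf=0.01285):
    """Learning rate at a given epoch, decaying linearly from lr0 to lr0 * lrf.

    Assumption: linear schedule without warmup, using the values listed above.
    """
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


# Print the learning rate at the start, the midpoint, and the end of the schedule
for epoch in (0, 50, 100):
    print(epoch, f"{linear_lr(epoch):.3e}")
```

Because patience was set to 30 epochs, training may stop before the schedule reaches its final value.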

Model training was performed using the following code:

from ultralytics import YOLO

# Use pretrained YOLOv8x detection model
model = YOLO('yolov8x.pt')

# Path to .yaml file where data location and object classes are defined
yaml_path = 'intersections.yaml'

# Start model training with the defined parameters
model.train(data=yaml_path, name='model_name', epochs=100, imgsz=2560, max_det=500, workers=4, optimizer='AdamW', 
            lr0=0.00098, lrf=0.01285, momentum=0.90725, weight_decay=0.00061, warmup_epochs=3.91327, warmup_momentum=0.72051,
            box=9.34214, cls=0.34133, dfl=1.83008, hsv_h=0.01126, hsv_s=0.84221, hsv_v=0.435, translate=0.11692,
            scale=0.45713, fliplr=0.38368, mosaic=0.77082, seed=42, val=True, patience=30, batch=2, device='0,1')
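The training code reads the dataset definition from intersections.yaml, whose contents are not shown here. The following is a minimal sketch of how such a file could be generated, assuming the standard Ultralytics dataset layout. The helper write_dataset_yaml, the dataset root, and the image subfolder names are hypothetical; the single class name Intersection is taken from the evaluation table.

```python
from pathlib import Path


def write_dataset_yaml(yaml_path, dataset_root, class_names):
    """Write a dataset definition file in the Ultralytics YAML layout."""
    lines = [
        f"path: {dataset_root}",  # dataset root directory (hypothetical)
        "train: images/train",    # training images, relative to path (assumed layout)
        "val: images/val",        # validation images, relative to path (assumed layout)
        "names:",
    ]
    # One "index: name" entry per object class
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    target = Path(yaml_path)
    target.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return target


# Example call with hypothetical locations:
# write_dataset_yaml('intersections.yaml', 'datasets/intersections', ['Intersection'])
```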

Evaluation results

Evaluation results using the validation dataset are listed below:

Class	Images	Class instances	Box precision	Box recall	Box mAP50	Box mAP50-95
Intersection	25	10411	0.996	0.997	0.994	0.653

More information on the performance metrics can be found in the Ultralytics documentation on YOLO performance metrics.
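Two derived figures can help when reading the table: the F1 score implied by the reported precision and recall, and the average number of intersections per validation image. The short calculation below uses only the values from the table. Neither figure was reported by the authors.

```python
# Values copied from the evaluation table
precision = 0.996
recall = 0.997
instances = 10411
images = 25

# F1 is the harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

# Average number of annotated intersections per validation image
avg_per_image = instances / images

print(f"F1: {f1:.4f}")
print(f"Average intersections per image: {avg_per_image:.2f}")
```

The average is roughly 416 intersections per image, which is consistent with max_det having been raised to 500 for training.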

Inference

If the model file huoneistokortit_13082024.pt is downloaded to the path \models\huoneistokortit_13082024.pt and the input image path is \data\image.jpg, inference can be performed using the following code:

from ultralytics import YOLO

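# Initialize model (raw strings keep the backslashes in the paths literal)
model = YOLO(r'\models\huoneistokortit_13082024.pt')
# Run inference on the input image and save the annotated result
prediction_results = model.predict(source=r'\data\image.jpg', save=True)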

More information on the available inference arguments can be found in the Ultralytics prediction documentation.
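The prediction results contain bounding boxes, while downstream use usually needs point coordinates. The sketch below shows one possible way to obtain them, assuming that the center of each predicted box approximates the intersection point. The helpers box_centers and detect_intersections are hypothetical names, and the relative paths in the example call are placeholders. Passing max_det=500 mirrors the training setting, since the Ultralytics default for prediction is lower.

```python
def box_centers(boxes_xyxy):
    """Convert [x1, y1, x2, y2] boxes to (x, y) center points."""
    return [((x1 + x2) / 2, (y1 + y2) / 2) for x1, y1, x2, y2 in boxes_xyxy]


def detect_intersections(model_path, image_path, max_det=500):
    """Run the model on one image and return the predicted intersection points."""
    # Imported here so that box_centers can be used without ultralytics installed
    from ultralytics import YOLO

    model = YOLO(model_path)
    results = model.predict(source=image_path, max_det=max_det)
    # results[0] holds the detections for the single input image
    return box_centers(results[0].boxes.xyxy.tolist())


# Example call with placeholder paths:
# points = detect_intersections('models/huoneistokortit_13082024.pt', 'data/image.jpg')
```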