Semi-automatic Segmentation & Alignment of Handwritten

Date: 07.09.2023

Transkribus

Transkribus (READ-COOP 2023) is one of the most popular comprehensive platforms for transcribing texts and searching historical documents. Users can either train their own HTR models from ground-truth (GT) data or use publicly available pre-trained HTR models; over 90 free public models are available. The platform's workflow consists of uploading digitized images, running layout analysis tools that segment text images into text lines, and transcribing the lines. Although training a model requires large amounts of annotated data, the platform has produced models with noteworthy results, in certain instances achieving a character error rate of approximately 5% (Souibgui et al. 2022).

eScriptorium

eScriptorium (AOrOc-laboratory 2023) is an open-source transcription platform built on Kraken, a neural-network-based HTR system. It includes tools for layout analysis and is designed as a general tool for transcribing historical manuscripts in any language. eScriptorium shares a limitation with Transkribus: because its methods are based on deep learning, it requires a large amount of annotated data to produce accurate models for new manuscripts (Souibgui et al. 2022).

Automatic Alignment of Handwritten Images and Transcripts for Training Handwritten Text Recognition Systems

Romero-Gómez et al. (2018) present a method for automatically aligning handwritten text images with their respective transcripts. The approach is split into layout analysis and alignment. The layout analysis consists of text block detection followed by text line detection and optimisation. Text block detection is accomplished through a procedure based on horizontal and vertical line detection. Text line detection is based on hidden Markov models (HMMs) and finite-state or N-gram vertical layout models. Once the text lines have been segmented, the resulting line images are used in the alignment. The proposed alignment method uses dynamic programming to find the best alignment between a given sequence of line images and the line-level transcripts. In this method, a line image is represented by the best character string hypothesis obtained by decoding the line image with an HMM-based HTR recognizer. This representation can be compared directly with the transcript character strings. A cost for each pairing of a transcript line with a line representation is computed using the Levenshtein distance at the character level.
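The alignment idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `levenshtein` and `align_lines`, the skip penalty (the length of the skipped string), and the example strings are all assumptions made for illustration. In the method described above, the hypothesis strings would come from an HTR recognizer; here they are typed in by hand.

```python
from typing import List, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def align_lines(
    hyps: List[str], refs: List[str]
) -> List[Tuple[Optional[int], Optional[int]]]:
    """Monotonic dynamic-programming alignment of line hypotheses to transcripts.

    Returns (hypothesis_index, transcript_index) pairs in reading order;
    None marks a line left unmatched. Assumption of this sketch: skipping
    a line costs its length in characters.
    """
    n, m = len(hyps), len(refs)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + len(hyps[i - 1])
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + len(refs[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + levenshtein(hyps[i - 1], refs[j - 1]),
                cost[i - 1][j] + len(hyps[i - 1]),  # hypothesis unmatched
                cost[i][j - 1] + len(refs[j - 1]),  # transcript unmatched
            )

    # Trace back from the bottom-right cell to recover the pairing.
    pairs: List[Tuple[Optional[int], Optional[int]]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == (
            cost[i - 1][j - 1] + levenshtein(hyps[i - 1], refs[j - 1])
        ):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + len(hyps[i - 1]):
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


# Hand-written stand-ins for recognizer output and line-level transcripts.
hypotheses = ["the quick brwn fox", "jumps ovr the", "lazy dog"]
transcripts = [
    "the quick brown fox",
    "an extra transcript line",
    "jumps over the",
    "lazy dog",
]
print(align_lines(hypotheses, transcripts))
# [(0, 0), (None, 1), (1, 2), (2, 3)]
```

In this toy run, the second transcript line has no matching line image and is left unpaired, while the remaining lines are matched despite the simulated recognition errors. A different skip penalty would change how readily the alignment leaves lines unmatched.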
