Acta Univ. Agric. Silvic. Mendelianae Brun. 2010, 58, 421-432

https://doi.org/10.11118/actaun201058060421
Published online 2014-07-17

Experimental determination of chosen document elements parameters from raster graphics sources

Jiří Rybička1, Dagmar Kelnarová1, Petra Talandová2

1Ústav informatiky, Mendelova univerzita v Brně, Zemědělská 1, 613 00 Brno, Česká republika
2Ústav informatiky, Mendelova univerzita v Brně, Zemědělská 1, 613 00 Brno, Česká republika

Visual appearance of documents and their formal quality is considered to be as important as the content quality. Formal and typographical quality of documents can be evaluated by an automated system that processes raster images of documents. A document is described by a formal model that treats a page as an object and also as a set of elements, whereas page elements include text and graphic object. All elements are described by their parameters depending on elements’ type. For future evaluation, mainly text objects are important. This paper describes the experimental determination of chosen document elements parameters from raster images. Techniques for image processing are used, where an image is represented as a matrix of dots and parameter values are extracted. Algorithms for parameter extraction from raster images were designed and were aimed mainly at typographical parameters like indentation, alignment, font size or spacing. Algorithms were tested on a set of 100 images of paragraphs or pages and provide very good results. Extracted parameters can be directly used for typographical quality evaluation.

References

10 live references