A Novel Framework for Text Recognition in Street View Images

Mehmet Serdar Guzel


This paper presents a new text recognition solution aimed mainly at street view images. It employs two different approaches to detect text regions and recognize the corresponding text fields. The first approach utilizes Maximally Stable Extremal Regions (MSER), whereas the second relies on the Class Specific Extremal Regions (CSER) algorithm. Two separate frameworks, each designed around one of these methods, are applied to street view images to extract text regions. Numerous experiments were performed to evaluate and compare the two approaches. The results obtained with the ER-based approach in particular are quite encouraging and confirm the system's ability to detect text regions and recognize the corresponding text fields.
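
As a rough illustration of the extremal region idea that both approaches build on, the sketch below thresholds a tiny grayscale image and measures how much a dark region grows as the threshold changes. A region whose area barely changes over a range of thresholds is "maximally stable" in the MSER sense. This is not the paper's implementation: the function names `region_area` and `stability`, the 4-connectivity choice, and the toy image are all assumptions made for illustration, and the classifier-based region selection used by CSER is not covered.

```python
from collections import deque

def region_area(img, seed, t):
    """Area of the 4-connected region of pixels <= t that contains seed.

    Hypothetical helper: img is a list of rows of gray levels,
    seed is a (row, col) pair, t is the threshold.
    """
    h, w = len(img), len(img[0])
    sy, sx = seed
    if img[sy][sx] > t:
        return 0  # seed is not part of any region at this threshold
    seen = {(sy, sx)}
    queue = deque([(sy, sx)])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            inside = 0 <= ny < h and 0 <= nx < w
            if inside and (ny, nx) not in seen and img[ny][nx] <= t:
                seen.add((ny, nx))
                queue.append((ny, nx))
    return len(seen)

def stability(img, seed, t, delta):
    """Relative area change of the region over [t - delta, t + delta].

    Values near 0 mean the region is stable around threshold t.
    """
    area = region_area(img, seed, t)
    if area == 0:
        return float("inf")
    growth = region_area(img, seed, t + delta) - region_area(img, seed, t - delta)
    return growth / area

# Toy image: a dark 2x2 blob (a stand-in for a character stroke)
# on a bright background.
img = [
    [200, 200, 200, 200],
    [200,  10,  10, 200],
    [200,  10,  10, 200],
    [200, 200, 200, 200],
]

stable = stability(img, (1, 1), 100, 50)    # blob area is 4 at every level
unstable = stability(img, (1, 1), 180, 30)  # blob merges with the background
```

In this toy case the blob keeps the same area across the thresholds 50 to 150, so its variation is zero there, while near the background intensity the region suddenly absorbs the whole image. A production system would normally rely on an optimized library implementation and evaluate every region in the image, not a single seed.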


Text Recognition; MSER; CSER; Signboard Detection; Street View Images


Submitted: 2017-05-02 12:58:39
Published: 2017-09-29 16:13:25





Copyright (c) 2017 International Journal of Intelligent Systems and Applications in Engineering

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.