Keyword spotting in handwritten Chinese documents using semi-Markov conditional random fields
Feb 23, 2017
Title: Keyword spotting in handwritten Chinese documents using semi-Markov conditional random fields
Authors: Zhang, H; Zhou, XD; Liu, CL
Author Full Names: Zhang, Heng; Zhou, Xiang-Dong; Liu, Cheng-Lin
Source: ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 58: 49-61; DOI: 10.1016/j.engappai.2016.11.006; FEB 2017
Language: English
Abstract: This paper proposes a document indexing method for keyword spotting based on semi-Markov conditional random fields (semi-CRFs), which provide a theoretical framework for fusing the information of different contexts. The candidate segmentation-recognition lattice is first augmented based on the linguistic context to improve recognition results. For fast retrieval and to save storage space, the lattice is then purged by a forward-backward pruning procedure. In the reduced lattice, we estimate character similarity scores based on the semi-CRF model. The parameters of the semi-CRF model are estimated using a binary classification objective, i.e., the cross-entropy (CE), to discriminate candidate characters in the lattice. To locate mis-recognized character instances in the lattice, we use confusable similar characters as proxies and search for proxy characters in the index file. The proxy-character driven search significantly improves performance compared with our previous character-synchronous dynamic search (CSDS) method. Experimental results on the online handwriting database CASIA-OLHWDB demonstrate the effectiveness of the proposed method.
ISSN: 0952-1976
eISSN: 1873-6769
IDS Number: EI7NF
Unique ID: WOS:000392684200004
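The abstract mentions purging the candidate lattice with a forward-backward pruning procedure but does not give the exact criterion. As a rough illustration only, the sketch below shows one common variant: a max-score forward pass and backward pass over a lattice, followed by removal of every edge whose best complete path scores more than a beam below the overall best path. The function name prune_lattice, the edge tuple layout, the log-score values, and the beam threshold are all assumptions made for this example and are not taken from the paper, which may instead prune on posterior probabilities.

```python
from collections import defaultdict

NEG_INF = float("-inf")

def prune_lattice(edges, start, end, beam):
    """Keep edges lying on a path within `beam` of the best path score.

    edges: list of (src, dst, label, log_score) tuples (hypothetical layout).
    Nodes are integers in topological order, so src < dst for every edge.
    """
    fwd = defaultdict(lambda: NEG_INF)  # best score from start to each node
    bwd = defaultdict(lambda: NEG_INF)  # best score from each node to end
    fwd[start] = 0.0
    bwd[end] = 0.0

    # Forward pass: visit edges in order of increasing source node.
    for src, dst, _, w in sorted(edges, key=lambda e: e[0]):
        if fwd[src] > NEG_INF:
            fwd[dst] = max(fwd[dst], fwd[src] + w)

    # Backward pass: visit edges in order of decreasing destination node.
    for src, dst, _, w in sorted(edges, key=lambda e: e[1], reverse=True):
        if bwd[dst] > NEG_INF:
            bwd[src] = max(bwd[src], w + bwd[dst])

    best = fwd[end]
    if best == NEG_INF:
        return []  # no complete path from start to end

    # An edge survives if the best full path through it is close to the best.
    return [e for e in edges if fwd[e[0]] + e[3] + bwd[e[1]] >= best - beam]

# Toy lattice with made-up candidate characters and log scores.
lattice = [
    (0, 1, "A", -0.1),
    (1, 2, "B", -0.2),
    (2, 3, "C", -0.1),
    (0, 2, "X", -3.0),  # poor merged-segment candidate
    (1, 3, "Y", -0.5),  # plausible alternative segmentation
]
kept = prune_lattice(lattice, start=0, end=3, beam=1.0)
kept_labels = [label for _, _, label, _ in kept]
print(kept_labels)
```

In this toy lattice the candidate "X" is dropped because its best complete path scores far below the best path, while "Y" is retained as a competing hypothesis. A wider beam keeps more alternatives at the cost of a larger index.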