SBIR-STTR Award

Approaches and Techniques for Specialized Characte
Award last edited on: 7/2/2010

Sponsored Program
SBIR
Awarding Agency
DOD : Army
Total Award Amount
$69,961
Award Phase
1
Solicitation Topic Code
A09-042
Principal Investigator
John Chen

Company Information

Janya Inc

1408 Sweet Home Road Suite 1
Amherst, NY 14228
   (716) 565-0401
   rohini@janyainc.com
   www.janyainc.com
Location: Multiple
Congr. District: 26
County: Erie

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2010
Phase I Amount
$69,961
The ability to rapidly spot named entities (NEs) such as persons, organizations, and locations in Arabic document image data is of strategic and tactical importance. An NE extraction system that performs this task faces numerous challenges. These include dealing with images representing both handwritten and character text, images where Arabic and Romanized scripts are mixed, and images of poor quality. Indeed, experiments on combined character recognition (CR) and NE extraction systems show that NE extraction performance degrades twice as fast as CR performance as more noise is introduced into the input images. The goal of this project is to develop a high-accuracy CR and NE extraction system whose input consists of images of Arabic text. Our approach is to perform CR and NE extraction in a pipeline, with the CR component passing multiple best hypotheses to the NE extraction system. Joint inference over these multiple hypotheses is performed using k-best or approximate inference methods, improving overall system accuracy.
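
The pipeline idea in the abstract, where the CR stage hands several ranked hypotheses to the NE stage and the two scores are combined, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the awardee's system: the hypothesis list, the probabilities, the gazetteer-based `toy_ne_scorer`, and the function `joint_best` are all hypothetical, and Romanized tokens stand in for Arabic text. The sketch uses exhaustive rescoring of a k-best list; it does not show the approximate inference methods (such as particle filtering) named in the keywords.

```python
import math

# Hypothetical gazetteer of known entity names (illustrative only).
GAZETTEER = {"Baghdad"}

def toy_ne_scorer(text):
    """Toy NE model (an assumption, not a real tagger).

    Tokens found in the gazetteer are tagged as entities and scored
    with probability 0.9; every other token is scored with 0.4.
    Returns (list of entities, log-probability of the tagging).
    """
    entities, logp = [], 0.0
    for token in text.split():
        if token in GAZETTEER:
            entities.append(token)
            logp += math.log(0.9)
        else:
            logp += math.log(0.4)
    return entities, logp

def joint_best(cr_hypotheses, ne_scorer):
    """Rescore each CR hypothesis with the NE model and keep the best.

    cr_hypotheses: list of (transcription, CR log-probability) pairs,
    i.e. a k-best list from a hypothetical CR component.
    Returns (joint log-score, transcription, entities).
    """
    best = None
    for text, cr_logp in cr_hypotheses:
        entities, ne_logp = ne_scorer(text)
        total = cr_logp + ne_logp  # joint score in log space
        if best is None or total > best[0]:
            best = (total, text, entities)
    return best

# Hypothetical 3-best CR output; the top CR hypothesis misspells the name.
K_BEST = [
    ("visit Bagdad today", math.log(0.5)),
    ("visit Baghdad today", math.log(0.4)),
    ("visit Baghdud today", math.log(0.1)),
]

score, text, entities = joint_best(K_BEST, toy_ne_scorer)
```

In this toy setup the CR component alone would output its top-ranked hypothesis, which contains no recognizable entity. Combining the CR and NE scores selects the second-ranked transcription instead, which is the kind of recovery that passing multiple hypotheses down the pipeline is meant to allow.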

Keywords:
Information Extraction, Named Entity Recognition, Approximate Inference, Particle Filtering, Character Recognition, Handwriting Recognition, Document Image Processing, Pattern

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
----
Phase II Amount
----