SBIR-STTR Award

Auto-Transcription for Citizen Science
Award last edited on: 6/10/22

Sponsored Program
SBIR
Awarding Agency
DOC : NOAA
Total Award Amount
$650,000
Award Phase
2
Solicitation Topic Code
9.5
Principal Investigator
Steven Minton

Company Information

Inferlink Corporation

2361 Rosecrans Avenue Suite 348
El Segundo, CA 90245
(310) 341-2446
inquiry@inferlink.com
www.inferlink.com
Location: Single
Congr. District: 36
County: Los Angeles

Phase I

Contract Number: NA21OAR0210488
Start Date: 9/1/21    Completed: 2/28/22
Phase I year
2021
Phase I Amount
$150,000
We propose to develop a system for transcribing semi-structured data, including tables, from handwritten or typed document images. Our objective is to design an end-to-end system that takes document images as inputs and extracts a digital, tabular output. The proposed approach employs a joint reasoning architecture in which optical character recognition and structure recognition are combined in a single neural network to achieve high accuracy. In addition, our approach will allow users to rapidly fix any extraction mistakes in an “example-based” manner, so that the user and the system, working as a man-machine team, can complete the task efficiently even in cases where the system does not initially produce a perfect result. Our aim is to empower citizen scientists to quickly and easily participate in the extraction process in an intuitive w

Phase II

Contract Number: NA22OAR0210491
Start Date: 8/1/22    Completed: 7/31/24
Phase II year
2022
Phase II Amount
$500,000
We propose to implement a system for automatic transcription of tables from handwritten or typed document images. In Phase I we developed an end-to-end system design and showed the feasibility of our approach with a working prototype that takes document images as inputs and extracts a digital, tabular output. In Phase II we will fully implement the end-to-end system. Our work will focus on three tasks. The first is the implementation of the front-end user interface, which will allow users to create “templates” for transcription jobs, provide guidance, and correct the output. The second task focuses on implementing the back end, which employs highly accurate models that can incorporate user knowledge about the tables to be transcribed. Finally, in addition to the implementation work, we will evaluate the system’s ultimate accuracy, deploy the system, conduct user testing, and refine the implementation in response to feedback. This man-machine combination of user input and models capable of using that input will allow transcription to be accomplished efficiently and accurately, even in cases where the system does not initially produce a perfect result. Our system will empower citizen scientists to accomplish transcription tasks quickly, intuitively, and ea