SBIR-STTR Award

Large-scale Entity Linking and Disambiguation with DeepDive
Award last edited on: 2/7/2019

Sponsored Program
STTR
Awarding Agency
DOD : Navy
Total Award Amount
$79,870
Award Phase
1
Solicitation Topic Code
N16A-T016
Principal Investigator
Michael Cafarella

Company Information

Lattice Data Inc (AKA: Clearcut Analytics, Inc)

460 California Avenue
Palo Alto, CA 94306
   (847) 436-4044
   N/A
   lattice.io

Research Institution

Stanford University

Phase I

Contract Number: N00014-16-P-2049
Start Date: 00/00/00    Completed: 00/00/00
Phase I year
2016
Phase I Amount
$79,870
DeepDive is a system for extracting relational databases from dark data: the mass of text, tables, and images that are widely collected and stored but cannot be exploited by standard relational tools. If the information in dark data (scientific papers, Web classified ads, customer service notes, and so on) were instead in a relational database, it would give analysts access to a massive and highly valuable new set of "big data" to exploit. In this proposal, we describe our plan to enhance the data (as well as the extractions) by linking and disambiguating textual mentions (noun phrases) to their real-world entities, which enables previously impossible analysis over much richer knowledge extracted from text. The main technical challenges are 1) how to efficiently disambiguate an entity mention to one of millions of entities in a typical knowledge base (e.g., Wikipedia); 2) how to resolve ambiguity when the real-world entity is absent from the input knowledge bases; and 3) how to effectively leverage contextual information to make accurate link predictions. We will present designs of entity linking and resolution systems that address these challenges.

Phase II

Contract Number: ----------
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
----
Phase II Amount
----