SBIR-STTR Award

Active Learning System for Audit Selection
Award last edited on: 5/6/2019

Sponsored Program
STTR
Awarding Agency
NSF
Total Award Amount
$99,396
Award Phase
1
Solicitation Topic Code
-----

Principal Investigator
Daniele Micci-Barreca

Company Information

Elite Analytics LLC

PO Box 81326
Austin, TX 78759
   (512) 762-9668
   info@eliteanalytics.com
   www.eliteanalytics.com

Research Institution

----------

Phase I

Contract Number: ----------
Start Date: ----    Completed: ----
Phase I year
2006
Phase I Amount
$99,396
This research project aims to develop, validate and bring to market an innovation that has the potential to dramatically enhance the return on investment from audits of fraud or non-compliance cases. In most audit detection domains, resource-intensive evaluation of cases, such as costly audits, is the principal means of monitoring (and thus enhancing) compliance. To optimize the management of audit-related resources, statistical predictive models are often developed to detect cases of non-compliance. However, there is a fundamental flaw in the existing paradigm of detection-model development, which significantly undermines the efficacy of non-compliance detection. The historical data used to induce the scoring models is heavily biased - it is drawn from "regions" in the search space that are already known to have a relatively higher likelihood of non-compliance. As a result, detection models fail to produce adequate predictions when applied to detect non-compliance in new regions of the domain. This flaw has two important consequences: (1) detection models adapt slowly, if at all, to changes in non-compliance behavior and do not effectively detect new or existing unknown "pockets" of non-compliance; and (2) information from new audits merely reinforces existing perceptions rather than enhancing current knowledge. It is imperative to acquire information from unknown regions to produce better detection models. The goal of this project is to leverage intelligent sampling techniques from machine learning to help identify particularly informative audits that will substantially improve future audit detection and revenue recovery for a given cost. The proposed technology draws from recent advances in active learning research, which has been shown to produce substantially superior models for a given (audit) acquisition cost as compared to the existing sample-acquisition paradigm. Empirical results have shown impressive improvements in a variety of industry domains.
Given that audit selection has important unique properties, this project would field-validate the efficacy of active learning policies for the audit-detection domain, and perhaps develop customized new policies that better utilize the properties and objectives of the audit selection domain. We conjecture that these potentially risky hurdles have impeded the present deployment of ideas from active learning research to improve audit selection practices. From a product standpoint, the approach is to encapsulate the active learning technology (to be validated in Phase I) as a software system that integrates with current operational systems and business processes. The relevant industry domains to which this technology can be applied are broad and include tax auditing, insurance claims auditing, warranty fraud, benefits abuse, and e-commerce fraud. The economic impact of non-compliance is tremendous - it was estimated that the amount of uncollected IRS taxes in 1992 was $127 billion, and that Medicare lost $11.9 billion to fraud and mistakes in 2000 alone. Hence cost-effective detection of non-compliance can substantially benefit the US economy.
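
The abstract does not say which active learning policy the project would use. As an illustration only, the sketch below assumes uncertainty sampling, one common policy, and splits a fixed audit budget between cases the current model scores highest (the existing paradigm) and cases the model is least sure about (the "unknown regions"). The function name `select_audits`, the `explore_fraction` parameter, and the example scores are hypothetical and do not come from the award record.

```python
from typing import List, Tuple


def select_audits(
    scores: List[float], budget: int, explore_fraction: float = 0.5
) -> Tuple[List[int], List[int]]:
    """Split an audit budget between exploitation and exploration.

    scores: model-estimated probability of non-compliance for each case.
    budget: total number of audits that can be afforded.
    explore_fraction: share of the budget reserved for uncertain cases
        (hypothetical tuning knob, not taken from the award abstract).

    Returns (exploit, explore) as lists of case indices.
    """
    if not 0 <= budget <= len(scores):
        raise ValueError("budget must be between 0 and the number of cases")
    if not 0.0 <= explore_fraction <= 1.0:
        raise ValueError("explore_fraction must be in [0, 1]")

    n_explore = int(budget * explore_fraction)
    n_exploit = budget - n_explore

    # Exploitation: audit the cases the current model scores highest.
    by_score = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    exploit = by_score[:n_exploit]

    # Exploration (uncertainty sampling): among the remaining cases, audit
    # those whose score is closest to 0.5, where the model is least certain.
    chosen = set(exploit)
    remaining = [i for i in range(len(scores)) if i not in chosen]
    by_uncertainty = sorted(remaining, key=lambda i: abs(scores[i] - 0.5))
    explore = by_uncertainty[:n_explore]

    return exploit, explore


# Hypothetical scores for six cases and a budget of four audits.
example_scores = [0.95, 0.90, 0.55, 0.10, 0.48, 0.05]
exploit_idx, explore_idx = select_audits(example_scores, budget=4)
```

In this sketch the outcomes of the exploratory audits would be added to the training data before the model is refit, so that later scores reflect regions the historical sample never covered.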

Phase II

Contract Number: ----------
Start Date: ----    Completed: ----
Phase II year
----
Phase II Amount
----