SBIR-STTR Award

Codebook Correlation for Self-Healing Networked Systems
Award last edited on: 1/24/06

Sponsored Program
SBIR
Awarding Agency
NSF
Total Award Amount
$372,217
Award Phase
2
Solicitation Topic Code
-----

Principal Investigator
Shaula Yemini

Company Information

System Management Arts Inc (AKA: SMARTS)

44 South Broadway 7th Floor
White Plains, NY 10601
   (914) 948-6200
   N/A
   www.smarts.com
Location: Multiple
Congr. District: 17
County: Westchester

Phase I

Contract Number: 9461773
Start Date: 00/00/00    Completed: 00/00/00
Phase I year
1994
Phase I Amount
$74,745
This Small Business Innovation Research Phase I project focuses on applying coding theory techniques to automate alarm correlation in complex networks. Alarms indicate exceptional network states or behaviors, e.g., failure or congestion, which require immediate handling to avoid disrupting network operations. In a networked system, a single problem can result in large numbers of alarms manifested by multiple system components. Alarm correlation relates these various symptoms to one another to accurately identify the problems requiring handling, and is thus a central component in network operations and management (OAM). As enterprises' reliance on networked systems grows, OAM consumes an increasing percentage of information technology budgets, currently estimated at 65-90%, mainly due to high labor costs. Automation of alarm correlation can substantially reduce OAM costs while improving OAM quality. System Management Arts, Inc., anticipates that techniques based on coding theory can yield up to two orders of magnitude improvement in alarm correlation performance over today's techniques, while increasing both accuracy and robustness.

Phase II

Contract Number: 9633004
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
1996
Phase II Amount
$297,472
The goal of network event correlation is to identify a networked system's problems before they disrupt applications. Current approaches to automating event correlation are slow, sensitive to noise in their data, and cannot dynamically adapt to a networked system's topology changes. The Codebook approach uses coding theory as the basis of a novel correlation process that is high speed, robust to noise, and dynamically adapts to system changes. The key idea is that since problems are characterized by the symptoms they cause, it suffices to monitor the minimal set of symptoms that uniquely identify ("encode") the problems of interest. Real-time correlation processing is thus reduced to a fast process of minimal distance "decoding" of symptoms. Symptoms can be added to make codes "error-correcting", for robustness with respect to noise in the event stream. Potential market areas for this project are: 1. Automated problem management products for enterprise networks, reducing exposure of mission-critical applications to network downtime. 2. Automated network management solutions for network service providers, increasing quality of service while reducing operations costs. 3. Self-healing networks, systems, and applications.
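
The minimal distance "decoding" idea in the Phase II abstract can be illustrated with a small sketch. This is not the SMARTS implementation; it assumes Hamming distance as the distance measure, and the problem names, symptom names, and function names (`CODEBOOK`, `decode`, `min_code_distance`) are hypothetical, chosen only for illustration.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Tuple

# Hypothetical codebook: each problem is "encoded" by the set of
# monitored symptoms it is expected to cause. All names are illustrative.
CODEBOOK: Dict[str, FrozenSet[str]] = {
    "link_down": frozenset({"s1", "s2", "s4"}),
    "router_overload": frozenset({"s2", "s3", "s5"}),
    "server_crash": frozenset({"s1", "s3", "s6"}),
}


def hamming_distance(a: FrozenSet[str], b: FrozenSet[str]) -> int:
    """Count symptoms present in exactly one of the two sets."""
    return len(a ^ b)


def decode(observed: FrozenSet[str],
           codebook: Dict[str, FrozenSet[str]]) -> Tuple[str, int]:
    """Return the problem whose code is closest to the observed symptoms.

    Ties are broken by problem name so the result is deterministic
    (an assumption of this sketch, not something the abstract specifies).
    """
    best = min(codebook,
               key=lambda p: (hamming_distance(observed, codebook[p]), p))
    return best, hamming_distance(observed, codebook[best])


def min_code_distance(codebook: Dict[str, FrozenSet[str]]) -> int:
    """Smallest pairwise distance between codes; a minimum distance d
    tolerates up to (d - 1) // 2 lost or spurious symptoms."""
    return min(hamming_distance(a, b)
               for a, b in combinations(codebook.values(), 2))


if __name__ == "__main__":
    # Symptom s4 was lost in the event stream; decoding still finds link_down.
    print(decode(frozenset({"s1", "s2"}), CODEBOOK))
    # A spurious s6 alarm arrives alongside router_overload's symptoms.
    print(decode(frozenset({"s2", "s3", "s5", "s6"}), CODEBOOK))
    print(min_code_distance(CODEBOOK))
```

In this toy codebook every pair of codes differs in four symptoms, so a single lost or spurious symptom still decodes to the correct problem. Monitoring fewer symptoms per problem would shrink that margin, which is the trade-off between a minimal codebook and an "error-correcting" one described above.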