SBIR-STTR Award

Exact Inference Software for Correlated Categorical Data
Award last edited on: 5/12/08

Sponsored Program
SBIR
Awarding Agency
NIH : NCRR
Total Award Amount
$890,815
Award Phase
2
Solicitation Topic Code
-----

Principal Investigator
Pralay Senchaudhuri

Company Information

Cytel Software Corporation (AKA: Cytel, Inc)

675 Massachusetts Avenue #3
Cambridge, MA 02139
   (617) 661-2011
   info@cytel.com
   www.cytel.com
Location: Single
Congr. District: 05
County: Middlesex

Phase I

Contract Number: 1R43RR019052-01
Start Date: 00/00/00    Completed: 00/00/00
Phase I year
2004
Phase I Amount
$100,459
This is a Phase I SBIR proposal for the development of computer software that performs small-sample exact inference for correlated categorical data. Such data are now common in many areas of biomedical research, such as genetics, ophthalmology, and teratology. Exact methods for such data are not available in any commercial package and are badly needed for accurate inference. By the end of Phase II we plan to develop tools for analyzing small as well as large data sets with correlated binary and correlated multivariate responses. This set of tools will compute point estimates and confidence intervals, as well as perform hypothesis tests, for several likelihood-based models for a multivariate binary response. In this Phase I effort, we will (1) build a stand-alone program with a simple user interface, including a data editor and a menu to specify the fixed covariates and correlation structure of the data; and (2) implement within this program (a) an exact trend test for stratified ordered clustered binomial populations, (b) an exact procedure to test for clustering, and (c) an exact trend test for multiple outcome data. In addition, we will investigate the feasibility of network-based Monte Carlo methods for fitting such models. Because no commercial statistical software exists for such exact inference methods, the resulting module will fill an important gap in statistical tools for categorical data analysis and will be incorporated into new versions of StatXact and LogXact, Cytel's flagship software packages. We will also implement all of these tools as a procedure for SAS, a widely used statistical package.

Thesaurus Terms:
computer program/software, computer system design/evaluation, data management, method development, computer human interaction, computer network, information system, statistics/biometry, clinical research

Phase II

Contract Number: 2R44RR019052-02
Start Date: 00/00/00    Completed: 00/00/00
Phase II year
2005
(last award dollars: 2006)
Phase II Amount
$790,356

This is a Phase II SBIR proposal for a major extension to Cytel's flagship software, StatXact, to perform small-sample exact inference for correlated categorical data. Such data are common in biomedical research, especially in areas such as genetics, ophthalmology, and developmental toxicology. In this Phase II effort, we will develop correlated data analogues for most of the existing small-sample procedures for independent data currently provided in the StatXact software. Specifically, the resulting module will implement correlated data extensions of: (1) exact tests of independence in unordered and ordered R x C contingency tables; (2) tests for differences in the distributions of two ordered multinomial populations; (3) exact Mantel-Haenszel-type tests for assessing homogeneity of relative odds for stratified 2 x 2 tables; and (4) exact correlated data methods for situations in which independent factors vary across observations within a cluster. This last extension will expand the applicability of such methods to a wider range of longitudinal and multiple outcome settings. Because implementing the above procedures can be computationally complex, a final goal of the proposal is the development of new algorithms to make these tools practical for general use. These include new efficient network-based algorithms and Monte Carlo simulation strategies for model fitting. The final product of this effort will be a toolbox of exact procedures that will enable users to avoid relying on potentially poor large-sample approximations when analyzing small-sample correlated categorical data. This advantage will ultimately lead to more reliable analyses of such data. There is currently no software for such methods other than a limited prototype developed in Phase I of this proposal.