New approaches to handling streaming data are required. By the time one wishes to index stored scientific data, critical decisions about what to store have already been made. We propose evaluating, extending, and implementing a suite of algorithms that use new indexing and sampling techniques to improve analytical performance on large, heterogeneous streams of scientific data beyond the current state of the art.
Keywords: Algorithms, Streaming Algorithms, Sampling, Indexing, Cloud Computing, Big Data, Distributed Computation, Machine Learning