Record Streams - Input Streams, Output Streams, and Filtering

Record Types

Data is read into BIE using Record Input Streams (see RecordInputStream) and is sent out using Record Output Streams (see RecordOutputStream). A record is much like a C struct: it is a collection of fields, each with a name and a type, and all names within a record are distinct. In addition, the fields of a record are ordered, so a field can be referred to either by name or by field index number. We call the descriptor for the names and types in a record a RecordType.
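
The idea of an ordered set of uniquely named, typed fields can be illustrated with a minimal C++ sketch. Every name below (RecordTypeSketch, FieldDescriptor, FieldKind, fieldAt, fieldNamed, indexOf) is hypothetical and chosen for illustration; this is not the interface of the BIE RecordType class, and the three field kinds are an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch only: these names are NOT the BIE RecordType API.
enum class FieldKind { Int, Real, String };

struct FieldDescriptor {
    std::string name;
    FieldKind kind;
};

class RecordTypeSketch {
public:
    // Fields keep the order in which they are given; names must be distinct.
    explicit RecordTypeSketch(std::vector<FieldDescriptor> fields)
        : fields_(std::move(fields)) {
        for (std::size_t i = 0; i < fields_.size(); ++i) {
            const bool inserted =
                index_by_name_.emplace(fields_[i].name, i).second;
            if (!inserted) {
                throw std::invalid_argument("duplicate field name: " +
                                            fields_[i].name);
            }
        }
        assert(index_by_name_.size() == fields_.size());
    }

    std::size_t size() const { return fields_.size(); }

    // Lookup by position (0-based in this sketch); throws if out of range.
    const FieldDescriptor& fieldAt(std::size_t index) const {
        return fields_.at(index);
    }

    // Position of a named field; throws std::out_of_range if unknown.
    std::size_t indexOf(const std::string& name) const {
        return index_by_name_.at(name);
    }

    // Lookup by name, implemented on top of the positional lookup.
    const FieldDescriptor& fieldNamed(const std::string& name) const {
        return fields_[indexOf(name)];
    }

private:
    std::vector<FieldDescriptor> fields_;
    std::unordered_map<std::string, std::size_t> index_by_name_;
};
```

Keeping the fields in a vector preserves their order, while the name-to-index map makes both kinds of lookup cheap and lets the constructor reject duplicate names.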

Record Input Streams

Record input streams are used to read a series of records from one or more sources. A new record input stream can be created by reading from a data source (e.g. a file or a network socket), by combining two streams, or by modifying an existing stream. Inheriting streams through combination or modification constructs an implicit directed acyclic graph (DAG) in which the leaves are raw data sources, nodes with two children combine streams, and nodes with only one child modify a stream. Data is always pulled from the 'root' stream in this tree. We require that a node be inherited at most once, so that only the 'root' node controls the flow of data.
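
The pull model described above can be sketched in C++ as follows. This is an illustration under stated assumptions, not BIE code: InputNode, VectorSource, DropField, JoinFields, and drain are hypothetical names, records are reduced to vectors of doubles, and "combining" is assumed here to mean joining the fields of two streams record by record. Holding each child through std::unique_ptr expresses the rule that a node is inherited at most once.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sketch only: none of these names are BIE classes.
using Record = std::vector<double>;

class InputNode {
public:
    virtual ~InputNode() = default;
    // Writes the next record into `out`; returns false when exhausted.
    virtual bool next(Record& out) = 0;
};

// Leaf: a raw data source (an in-memory table standing in for a file).
class VectorSource : public InputNode {
public:
    explicit VectorSource(std::vector<Record> rows) : rows_(std::move(rows)) {}
    bool next(Record& out) override {
        if (pos_ >= rows_.size()) return false;
        out = rows_[pos_++];
        return true;
    }
private:
    std::vector<Record> rows_;
    std::size_t pos_ = 0;
};

// One child: modifies the stream by dropping the field at `index`.
class DropField : public InputNode {
public:
    DropField(std::unique_ptr<InputNode> child, std::size_t index)
        : child_(std::move(child)), index_(index) {
        assert(child_ != nullptr);
    }
    bool next(Record& out) override {
        Record in;
        if (!child_->next(in)) return false;
        out.clear();
        for (std::size_t i = 0; i < in.size(); ++i) {
            if (i != index_) out.push_back(in[i]);
        }
        return true;
    }
private:
    std::unique_ptr<InputNode> child_;  // sole owner: inherited once
    std::size_t index_;
};

// Two children: combines streams by concatenating fields in lockstep.
class JoinFields : public InputNode {
public:
    JoinFields(std::unique_ptr<InputNode> left, std::unique_ptr<InputNode> right)
        : left_(std::move(left)), right_(std::move(right)) {
        assert(left_ != nullptr && right_ != nullptr);
    }
    bool next(Record& out) override {
        Record l, r;
        if (!left_->next(l) || !right_->next(r)) return false;
        out = std::move(l);
        out.insert(out.end(), r.begin(), r.end());
        return true;
    }
private:
    std::unique_ptr<InputNode> left_;
    std::unique_ptr<InputNode> right_;
};

// All data flows because the caller pulls from the root.
std::vector<Record> drain(InputNode& root) {
    std::vector<Record> rows;
    Record rec;
    while (root.next(rec)) rows.push_back(rec);
    return rows;
}
```

In this sketch, only the caller of drain ever asks for data; each internal node pulls from its children on demand, so no node can be advanced by two different consumers.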

Creating a new RecordInputStream from a file or socket

Currently, BIE supports the following input formats:

ASCII files, with or without a type header.
Binary NetCDF files.

Combining two RecordInputStream objects

Modifying a RecordInputStream

Record Output Streams

Record output streams are used to write a series of records to one or more sinks. Output streams are created by inheriting an existing stream. As with input streams, inheriting streams creates an implicit DAG. In the case of output streams, however, leaves are data sinks and internal nodes apply operations. Data is pushed from the root node by the user or program to all inheriting streams.

When inherited, streams can be modified, processed by a filter, or the data can be output to a data sink (e.g. to a file or a network socket). Nodes can be inherited an unlimited number of times (with a few exceptions - see below).
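
The push model and the fan-out to several inheriting streams can be sketched in C++ as follows. This is a hedged illustration, not BIE code: OutputNode, KeepFields, CollectSink, attach, and push are hypothetical names, records are reduced to vectors of doubles, and the sink collects records in memory where a real sink would write to a file or socket.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical sketch only: none of these names are BIE classes.
using Record = std::vector<double>;

class OutputNode {
public:
    virtual ~OutputNode() = default;

    // Registers a stream that inherits from this one. In this sketch a
    // node may be inherited any number of times.
    void attach(std::shared_ptr<OutputNode> child) {
        assert(child != nullptr);
        children_.push_back(std::move(child));
    }

    // Applies this node's operation, then pushes the result to every
    // inheriting stream.
    void push(const Record& rec) {
        Record out = rec;
        process(out);
        for (auto& child : children_) child->push(out);
    }

protected:
    // Default operation: pass the record through unchanged.
    virtual void process(Record&) {}

private:
    std::vector<std::shared_ptr<OutputNode>> children_;
};

// Internal node: keeps only the fields at the given indices, in order.
class KeepFields : public OutputNode {
public:
    explicit KeepFields(std::vector<std::size_t> indices)
        : indices_(std::move(indices)) {}
protected:
    void process(Record& rec) override {
        Record kept;
        for (std::size_t i : indices_) kept.push_back(rec.at(i));
        rec = std::move(kept);
    }
private:
    std::vector<std::size_t> indices_;
};

// Leaf: a data sink that stores every record it receives.
class CollectSink : public OutputNode {
public:
    const std::vector<Record>& rows() const { return rows_; }
protected:
    void process(Record& rec) override { rows_.push_back(rec); }
private:
    std::vector<Record> rows_;
};
```

A plain OutputNode can serve as the root: attaching one CollectSink directly and another behind a KeepFields node means a single push on the root delivers the full record to the first sink and a reduced record to the second.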

Creating a new RecordOutputStream

Modifying a RecordOutputStream

Modifications include field deletion, filtering, field selection, and field renaming.

Connecting to pipes or network sockets

The ASCII and binary record input and record output streams can be connected to network sockets or pipes.

Filters

Various kinds of filters can be applied to input and output streams to add fields (e.g. summary statistics) based on the data flowing through the stream.
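
As an illustration of a filter that adds a summary-statistic field, the following C++ sketch appends a running mean to each record as it passes through. RunningMeanFilter and apply are hypothetical names, records are reduced to vectors of doubles, and the choice of a running mean is an assumption; the filters actually provided by BIE may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch only: this is not a BIE filter class.
using Record = std::vector<double>;

class RunningMeanFilter {
public:
    // `field` is the 0-based index of the field to summarise.
    explicit RunningMeanFilter(std::size_t field) : field_(field) {}

    // Returns a copy of `rec` with one extra trailing field: the mean of
    // the chosen field over all records seen so far, including this one.
    Record apply(const Record& rec) {
        assert(field_ < rec.size());
        ++count_;
        // Incremental update avoids storing the whole stream.
        mean_ += (rec[field_] - mean_) / static_cast<double>(count_);
        Record out = rec;
        out.push_back(mean_);
        return out;
    }

private:
    std::size_t field_;
    std::size_t count_ = 0;
    double mean_ = 0.0;
};
```

Because the filter keeps only a count and the current mean, it can sit in a stream of any length and still emit an up-to-date statistic with every record.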

Extending to other exchange formats

The Record Stream system was designed to support any data format that can be used to represent sequences of records.

Send suggestions, questions, and feedback to WEINBERG at ASTRO dot UMASS dot EDU.
Documentation generated at Fri Mar 26 00:35:11 2010 by doxygen