Analytics are performed by a configuration of Stages arranged in either a Pipeline or a Data Flow layout.
A Pipeline is a linear arrangement of sequentially executed stages in which the output of one stage is piped as the input to the next. An input data source feeds records into the first stage of the pipeline. Each stage in turn transforms its input data (e.g., by filtering, aggregating, or sorting) and sends its output to the next stage. The output of the final stage is returned for use in a table or other visualization.
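The pipeline idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Metric Studios syntax: the stage functions, the run_pipeline helper, and the sample records are all hypothetical, and records are modeled as a list of dictionaries.

```python
# Hypothetical sample records: one dict per row (an assumption for illustration).
records = [
    {"region": "East", "sales": 120},
    {"region": "West", "sales": 80},
    {"region": "East", "sales": 45},
    {"region": "West", "sales": 200},
]

def filter_stage(rows):
    """Keep only rows with sales of at least 50."""
    return [r for r in rows if r["sales"] >= 50]

def aggregate_stage(rows):
    """Sum sales per region."""
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0) + r["sales"]
    return [{"region": region, "sales": total} for region, total in totals.items()]

def sort_stage(rows):
    """Order rows by sales, largest first."""
    return sorted(rows, key=lambda r: r["sales"], reverse=True)

def run_pipeline(source, stages):
    """Feed the source into the first stage, then pipe each output into the next stage."""
    rows = source
    for stage in stages:
        rows = stage(rows)
    return rows

result = run_pipeline(records, [filter_stage, aggregate_stage, sort_stage])
print(result)
# [{'region': 'West', 'sales': 280}, {'region': 'East', 'sales': 120}]
```

Each stage takes one record set and returns one record set, which is what allows the stages to be chained in a straight line.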
A Data Flow is a more flexible layout of stages that allows branching and collecting. A data flow is drawn as a two-dimensional diagram with connector lines between stages. A stage may send its output to multiple stages, and some stages (namely Stack and Join) require multiple inputs. Unlike a pipeline, a data flow can use multiple input data sources and can return multiple output record sets.
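A data flow can be sketched the same way. The Python below is a conceptual illustration rather than Metric Studios syntax: the stack and join functions are hypothetical stand-ins for the Stack and Join stages, the join is assumed to be an inner join on a key that is unique in the right-hand input, and the sample data is invented.

```python
def stack(first, second):
    """Combine two record sets by appending the rows of one to the other."""
    return list(first) + list(second)

def join(left, right, key):
    """Inner join (assumed semantics): merge each left row with the right row sharing its key."""
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

def total_by_name(rows):
    """Sum order amounts per customer name."""
    totals = {}
    for r in rows:
        totals[r["name"]] = totals.get(r["name"], 0) + r["amount"]
    return [{"name": name, "total": total} for name, total in totals.items()]

# Three hypothetical input data sources.
orders_2023 = [{"customer_id": 1, "amount": 50}, {"customer_id": 2, "amount": 75}]
orders_2024 = [{"customer_id": 1, "amount": 20}]
customers = [{"customer_id": 1, "name": "Ada"}, {"customer_id": 2, "name": "Grace"}]

# Collecting: Stack and Join each take two inputs.
all_orders = stack(orders_2023, orders_2024)
enriched = join(all_orders, customers, "customer_id")

# Branching: the joined output feeds two different stages,
# so the data flow returns two output record sets.
large_orders = [r for r in enriched if r["amount"] >= 50]
totals = total_by_name(enriched)

print(large_orders)
print(totals)
```

Here three sources are collected into one record set, which then branches into two outputs. A single pipeline could not express this, because each of its stages has exactly one input and one output.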
Stages are the building blocks of Metric Studios analytics. Each stage performs a simple, well-defined task on a set of input records and returns a set of output records. A few stages (namely Stack and Join) require two sets of input records, and thus should be used in a data flow rather than a pipeline.
Records are perhaps best understood as rows in a table. Each row has multiple data cells, one for each column of the table. When we refer to a set of records, it may help to picture a table of data with rows and columns.
Stages are defined using Syntax. Where possible, we provide user interfaces that make creating syntax easy. It is particularly important to understand the syntax for Measures and Groups.