Skip to main content

Aggregate

The aggregate stage calculates aggregation functions (e.g., mean, percent, freq, and more) on the input records and returns a consolidated set of records (in total and/or by comparison groups).

Stage Editor UI

Aggregate

Stage as JSON

{
$aggregate: {
// note: there must exist one (and only one) measures axis, located in either rows or cols
rows: <array of axes>,
cols: <array of axes>
}
}

Measures Axis

There is only one Measures Axis and it must have at least one Measure defined. In the stage UI, click the Axis button to edit measures.

Learn about the Measures Axis.

Groups Axes

The Groups Axes are optional. Measures are calculated for each group (data cut) in a Groups Axis. If multiple Groups Axes exist, a nested expansion of all groups axes is output.

Click here to learn about the Groups Axis.

Layout

Layout determines where each Axis is returned: as rows or columns, and in what order.

In the UI, you may click and drag an axis to re-order within rows or columns container, or move it into the opposite container.

Click here to learn about Layout (todo).

Examples:

Click section to expand.

Input records

Input records

respidq1segcountrywave
18111
22151
34132
49211
51221
...............
10003442
Example: One measure

Aggregation stage

{
$aggregation: {
cols: [
{
// type: "measures", (assumed)
measures: {
m1: "mean(q1)"
}
}
]
}
}

Output records:

m1.valm1.n
3.5945
Example: One measure and groups

Aggregation stage with groups:

{
$aggregation: {
cols: [
{
// measures dimension
measures: {
m1: "mean(q1)"
}
}
],
rows: [
{
// this is an "any" group dimension (doesn't specify exact groups)
id: "segment",
syntax: "segment"
}
],
}
}

Output records:

segmentm1.valm1.n
12.5250
21.7250
34.3250
42.9250
Example: One measure and specific groups

Aggregation stage with specified groups:

{
$aggregation: {
cols: [
{
measures: {
m1: "mean(q1)"
}
},
]
rows: [
{
id: "segment",
groups: {
{ syntax: "segment==1", label: "Segment one" },
{ syntax: "segment==2", label: "Segment two" }
{ syntax: "segment in (3,4)", label: "Segment three or four" }
}
}
],
}
}

Output records:

segment.idsegment.syntaxsegment.labelm1.valm1.n
seg1segment==1Segment one2.5250
seg2segment==2Segment two1.7250
seg_othersegment in (3,4)Segment three or four3.5500
Example: Measures and multiple group dims

Aggregation stage more complex example

danger

Todo: I changed the syntax.

{
$aggregation: {
measures: {
m1: { syntax: "pct(q1 in (1,2,3))", label: "Q1 Bottom Box" },
m2: { syntax: "pct(q1 in (4,5,6,7))", label: "Q1 Indifferent" },
m3: { syntax: "pct(q1 in (8,9))", label: "Q1 Top Box" },
},
groups: {
country: "country",
segment: {
seg1: "segment==1",
seg2: "segment==2"
},
wave: {
w1: { syntax: "wave==1", label: "Last Year" },
w2: { syntax: "wave==2", label: "YTD" },
}
},
layout: {
cols: ["segment"],
rows: ["country", "wave", "m"]
}
}
}

Output records:

note

Some output fields have been hidden from the view below: "wave.id", "wave.syntax", "m.id" and "m.syntax" would also return.

countrywave.labelm.labelseg1.valseg1.nseg2.valseg2.n
CanadaLast YearQ1 Bottom Box(value)(n)(value)(n)
CanadaLast YearQ1 Indifferent(value)(n)(value)(n)
CanadaLast YearQ1 Top Box(value)(n)(value)(n)
CanadaYTDQ1 Bottom Box(value)(n)(value)(n)
CanadaYTDQ1 Indifferent(value)(n)(value)(n)
CanadaYTDQ1 Top Box(value)(n)(value)(n)
SpainLast YearQ1 Bottom Box(value)(n)(value)(n)
SpainLast YearQ1 Indifferent(value)(n)(value)(n)
SpainLast YearQ1 Top Box(value)(n)(value)(n)
SpainYTDQ1 Bottom Box(value)(n)(value)(n)
SpainYTDQ1 Indifferent(value)(n)(value)(n)
SpainYTDQ1 Top Box(value)(n)(value)(n)
Full Code Reference (scratch - don't read me)
danger

Ignore this stuff.

Stage syntax Reference

The aggregate stage takes 3 parameters:

  • measures (required) - takes one input dimension that contains the measures.
  • groups (optional) - defines grouping dimensions if any
  • layout (required) - specifies where to put the dimensions (in rows vs cols) and in what order
{
$aggregate: {
measures: <measures-definition>
groups: <groups-definition>
layout: <layout-definition>
}
}

<measures-definition>

The short form does not specify the id of the measures axis, so an id of "m" is assumed.

measures: {
m1: <measure>, // "m1" is the id of the measure
m2: <measure>,
m3: <measure>,
}

The longer form gives the measures axis an id

measures: {
m: { // m is the name of the axis
m1: <measure>,
m2: <measure>,
m3: <measure>,
}
}

Or should I use this form instead?

measures: {
_id: "m", // m is the name of the axis
m1: <measure>,
m2: <measure>,
m3: <measure>,
}

<measure>

A measure can be defined as a string if only the syntax is needed.

"pct(q1==1)"

If a measure needs a label or any other information, an object can be used.

{ syntax: "pct(q1==1)", label: "Satisfied", something_else: "#0064c8" }

Example usage:

measures: {
m1: "pct(q1==1)",
m2: { syntax: "pct(q1==2)", label: "satisfied", weight: "w1" }
}

<group>

Define groups

groups: {
g: "country", // this is a stardim because it is a string and contains syntax only
h: {
h1: "segment==1", // shorthand (syntax only)
h2: "segment==2",
h3: { syntax: "segment==3", label: "Outdoor consumer" }
}
}

Define layout

If layout is missing, what about this??

  • Measures will go in columns IF no groups are specified
  • If any (regular?) groups are specified, Measures will go in rows
  • If any stardim groups are specified, they will go in rows
  • All other group dims will go in columns
  • Or should I just require layout?
layout: {
rows: ["m", "g"], // g cannot be cols since it's a stardim
cols: ["h"]
}

Do a mongodb $group

{
$aggregate: {
$groups: {
country: "country",
segment: "segment"
},
$measures: {
avg_q1: "avg(q1)",
avg_q2: "avg(q2)"
}
// layout missing, so according to rules above,
// measures will be columns
// stardims will be rows
// seems pretty close to mongodb syntax
}
}

Enhancing MongoDB group

{
$aggregate: {
$groups: {
country: "country",
segment: "segment",
own: { // specify exact groups (overlapping, from different vars)
own_honda: { syntax: "own_01==1", label: "Own Honda" },
own_toyota: { syntax: "own_02==1", label: "Own Toyota" },
own_ford: { syntax: "own_03==1", label: "Own Ford" },
}
},
$measures: {
avg_q1: "avg(q1)",
avg_q2: "avg(q2)"
}
// layout missing, so according to rules above,
// measures will be columns
// stardims (country, segment) will be rows
// groupdims will be cols?
}
}

| country | segment | own_honda | own_toyota | own_ford | | country | segment | avg_q1 | avg_q2 | avg_q1 | avg_q2 | avg_q1 | avg_q2 | | ------- | ------- | ------ | ------ | ------ | ------ | ------ | ------ | | USA | Seg 1 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}| | USA | Seg 2 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}| | USA | Seg 3 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}| | Canada | Seg 1 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}| | Canada | Seg 2 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}| | Canada | Seg 3 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|

mcountrysegmentown_hondaown_toyotaown_ford
avg_q1USASeg 1{val, n}{val, n}{val, n}
avg_q1USASeg 2{val, n}{val, n}{val, n}
avg_q1USASeg 3{val, n}{val, n}{val, n}
avg_q1CanadaSeg 1{val, n}{val, n}{val, n}
avg_q1CanadaSeg 2{val, n}{val, n}{val, n}
avg_q1CanadaSeg 3{val, n}{val, n}{val, n}
avg_q2USASeg 1{val, n}{val, n}{val, n}
avg_q2USASeg 2{val, n}{val, n}{val, n}
avg_q2USASeg 3{val, n}{val, n}{val, n}
avg_q2CanadaSeg 1{val, n}{val, n}{val, n}
avg_q2CanadaSeg 2{val, n}{val, n}{val, n}
avg_q2CanadaSeg 3{val, n}{val, n}{val, n}