Aggregate
The aggregate stage calculates aggregation functions (e.g., mean, percent, freq, and more) on the input records and returns a consolidated set of records (in total and/or by comparison groups).
- Jump to Examples below
Stage Editor UI
Stage as JSON
{
$aggregate: {
// note: there must exist one (and only one) measures axis, located in either rows or cols
rows: <array of axes>,
cols: <array of axes>
}
}
Measures Axis
There is only one Measures Axis and it must have at least one Measure defined. In the stage UI, click the Axis button to edit measures.
Learn about the Measures Axis.
Groups Axes
The Groups Axes are optional. Measures are calculated for each group (data cut) in a Groups Axis. If multiple Groups Axes exist, a nested expansion of all groups axes is output.
Click here to learn about the Groups Axis.
Layout
Layout determines where each Axis is returned: as rows or columns, and in what order.
In the UI, you may click and drag an axis to re-order within rows or columns container, or move it into the opposite container.
Click here to learn about Layout (todo).
Examples:
Click section to expand.
Input records
Input records
respid | q1 | seg | country | wave |
---|---|---|---|---|
1 | 8 | 1 | 1 | 1 |
2 | 2 | 1 | 5 | 1 |
3 | 4 | 1 | 3 | 2 |
4 | 9 | 2 | 1 | 1 |
5 | 1 | 2 | 2 | 1 |
... | ... | ... | ... | ... |
1000 | 3 | 4 | 4 | 2 |
Example: One measure
Aggregation stage
- Code
- UI
{
$aggregation: {
cols: [
{
// type: "measures", (assumed)
measures: {
m1: "mean(q1)"
}
}
]
}
}
[TODO: ADD IMAGES]
Learn about the UI Editor for Measures.
Output records:
m1.val | m1.n |
---|---|
3.5 | 945 |
Example: One measure and groups
Aggregation stage with groups:
- Code
- UI
{
$aggregation: {
cols: [
{
// measures dimension
measures: {
m1: "mean(q1)"
}
}
],
rows: [
{
// this is an "any" group dimension (doesn't specify exact groups)
id: "segment",
syntax: "segment"
}
],
}
}
[TODO: ADD IMAGES]
Learn about the UI Editor for Measures or the UI Editor for Groups.
Output records:
segment | m1.val | m1.n |
---|---|---|
1 | 2.5 | 250 |
2 | 1.7 | 250 |
3 | 4.3 | 250 |
4 | 2.9 | 250 |
Example: One measure and specific groups
Aggregation stage with specified groups:
- Code
- UI
{
$aggregation: {
cols: [
{
measures: {
m1: "mean(q1)"
}
},
]
rows: [
{
id: "segment",
groups: {
{ syntax: "segment==1", label: "Segment one" },
{ syntax: "segment==2", label: "Segment two" }
{ syntax: "segment in (3,4)", label: "Segment three or four" }
}
}
],
}
}
[TODO: ADD IMAGES]
Learn about the UI Editor for Measures or the UI Editor for Groups.
Output records:
segment.id | segment.syntax | segment.label | m1.val | m1.n |
---|---|---|---|---|
seg1 | segment==1 | Segment one | 2.5 | 250 |
seg2 | segment==2 | Segment two | 1.7 | 250 |
seg_other | segment in (3,4) | Segment three or four | 3.5 | 500 |
Example: Measures and multiple group dims
Aggregation stage more complex example
- Code
- UI
Todo: I changed the syntax.
{
$aggregation: {
measures: {
m1: { syntax: "pct(q1 in (1,2,3))", label: "Q1 Bottom Box" },
m2: { syntax: "pct(q1 in (4,5,6,7))", label: "Q1 Indifferent" },
m3: { syntax: "pct(q1 in (8,9))", label: "Q1 Top Box" },
},
groups: {
country: "country",
segment: {
seg1: "segment==1",
seg2: "segment==2"
},
wave: {
w1: { syntax: "wave==1", label: "Last Year" },
w2: { syntax: "wave==2", label: "YTD" },
}
},
layout: {
cols: ["segment"],
rows: ["country", "wave", "m"]
}
}
}
[TODO: ADD IMAGES]
Learn about the UI Editor for Measures or the UI Editor for Groups.
Output records:
Some output fields have been hidden from the view below: "wave.id", "wave.syntax", "m.id" and "m.syntax" would also return.
country | wave.label | m.label | seg1.val | seg1.n | seg2.val | seg2.n |
---|---|---|---|---|---|---|
Canada | Last Year | Q1 Bottom Box | (value) | (n) | (value) | (n) |
Canada | Last Year | Q1 Indifferent | (value) | (n) | (value) | (n) |
Canada | Last Year | Q1 Top Box | (value) | (n) | (value) | (n) |
Canada | YTD | Q1 Bottom Box | (value) | (n) | (value) | (n) |
Canada | YTD | Q1 Indifferent | (value) | (n) | (value) | (n) |
Canada | YTD | Q1 Top Box | (value) | (n) | (value) | (n) |
Spain | Last Year | Q1 Bottom Box | (value) | (n) | (value) | (n) |
Spain | Last Year | Q1 Indifferent | (value) | (n) | (value) | (n) |
Spain | Last Year | Q1 Top Box | (value) | (n) | (value) | (n) |
Spain | YTD | Q1 Bottom Box | (value) | (n) | (value) | (n) |
Spain | YTD | Q1 Indifferent | (value) | (n) | (value) | (n) |
Spain | YTD | Q1 Top Box | (value) | (n) | (value) | (n) |
Full Code Reference (scratch - don't read me)
Ignore this stuff.
Stage syntax Reference
The aggregate stage takes 3 parameters:
- measures (required) - takes one input dimension that contains the measures.
- groups (optional) - defines grouping dimensions if any
- layout (required) - specifies where to put the dimensions (in rows vs cols) and in what order
{
$aggregate: {
measures: <measures-definition>
groups: <groups-definition>
layout: <layout-definition>
}
}
<measures-definition>
The short form does not specify the id of the measures axis, so an id of "m" is assumed.
measures: {
m1: <measure>, // "m1" is the id of the measure
m2: <measure>,
m3: <measure>,
}
The longer form gives the measures axis an id
measures: {
m: { // m is the name of the axis
m1: <measure>,
m2: <measure>,
m3: <measure>,
}
}
Or should I use this form instead?
measures: {
_id: "m", // m is the name of the axis
m1: <measure>,
m2: <measure>,
m3: <measure>,
}
<measure>
A measure can be defined as a string if only the syntax is needed.
"pct(q1==1)"
If a measure needs a label or any other information, an object can be used.
{ syntax: "pct(q1==1)", label: "Satisfied", something_else: "#0064c8" }
Example usage:
measures: {
m1: "pct(q1==1)",
m2: { syntax: "pct(q1==2)", label: "satisfied", weight: "w1" }
}
<group>
Define groups
groups: {
g: "country", // this is a stardim because it is a string and contains syntax only
h: {
h1: "segment==1", // shorthand (syntax only)
h2: "segment==2",
h3: { syntax: "segment==3", label: "Outdoor consumer" }
}
}
Define layout
If layout is missing, what about this??
- Measures will go in columns IF no groups are specified
- If any (regular?) groups are specified, Measures will go in rows
- If any stardim groups are specified, they will go in rows
- All other group dims will go in columns
- Or should I just require layout?
layout: {
rows: ["m", "g"], // g cannot be cols since it's a stardim
cols: ["h"]
}
Do a mongodb $group
{
$aggregate: {
$groups: {
country: "country",
segment: "segment"
},
$measures: {
avg_q1: "avg(q1)",
avg_q2: "avg(q2)"
}
// layout missing, so according to rules above,
// measures will be columns
// stardims will be rows
// seems pretty close to mongodb syntax
}
}
Enhancing MongoDB group
{
$aggregate: {
$groups: {
country: "country",
segment: "segment",
own: { // specify exact groups (overlapping, from different vars)
own_honda: { syntax: "own_01==1", label: "Own Honda" },
own_toyota: { syntax: "own_02==1", label: "Own Toyota" },
own_ford: { syntax: "own_03==1", label: "Own Ford" },
}
},
$measures: {
avg_q1: "avg(q1)",
avg_q2: "avg(q2)"
}
// layout missing, so according to rules above,
// measures will be columns
// stardims (country, segment) will be rows
// groupdims will be cols?
}
}
| country | segment | own_honda | own_toyota | own_ford | | country | segment | avg_q1 | avg_q2 | avg_q1 | avg_q2 | avg_q1 | avg_q2 | | ------- | ------- | ------ | ------ | ------ | ------ | ------ | ------ | | USA | Seg 1 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}| | USA | Seg 2 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}| | USA | Seg 3 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}| | Canada | Seg 1 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}| | Canada | Seg 2 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}| | Canada | Seg 3 |{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|{val, n}|
m | country | segment | own_honda | own_toyota | own_ford |
---|---|---|---|---|---|
avg_q1 | USA | Seg 1 | {val, n} | {val, n} | {val, n} |
avg_q1 | USA | Seg 2 | {val, n} | {val, n} | {val, n} |
avg_q1 | USA | Seg 3 | {val, n} | {val, n} | {val, n} |
avg_q1 | Canada | Seg 1 | {val, n} | {val, n} | {val, n} |
avg_q1 | Canada | Seg 2 | {val, n} | {val, n} | {val, n} |
avg_q1 | Canada | Seg 3 | {val, n} | {val, n} | {val, n} |
avg_q2 | USA | Seg 1 | {val, n} | {val, n} | {val, n} |
avg_q2 | USA | Seg 2 | {val, n} | {val, n} | {val, n} |
avg_q2 | USA | Seg 3 | {val, n} | {val, n} | {val, n} |
avg_q2 | Canada | Seg 1 | {val, n} | {val, n} | {val, n} |
avg_q2 | Canada | Seg 2 | {val, n} | {val, n} | {val, n} |
avg_q2 | Canada | Seg 3 | {val, n} | {val, n} | {val, n} |