Add Window Fields
Adds field(s) by performing operations on a specified span of records, known as a window.
Operations can include:
- Getting the rank of a record, given a specified sort order.
- Aggregating across the window to find mean, min, max, sum, etc.
Examples Below:
Calculate a Rank (No partitioning)
Windows are defined by partitioning. When a partition isn't specified, there is just one window and it contains all records.
Input records:
product | score |
---|---|
Product 01 | 0.3461 |
Product 02 | 0.3096 |
Product 03 | 0.7573 |
Product 04 | 0.8856 |
Product 05 | 0.3959 |
Stage definition:
We are defining a stage that will sort the input records by score (highest first) and return a rank.
{
// note: No partitioning defined, so the window is all records.
// sort is required for ranking
"sortBy": [
{ "field": "score", "order": "descending" }
],
// new output field to add
"output": {
"high_score_rank": "rank()" // rank() is the operation
}
}
Output records:
Notice below that 'Product 04' has the highest score and received a rank of 1. 'Product 02' has the lowest score and received a rank of 5.
product | score | high_score_rank |
---|---|---|
Product 01 | 0.3461 | 4 |
Product 02 | 0.3096 | 5 |
Product 03 | 0.7573 | 2 |
Product 04 | 0.8856 | 1 |
Product 05 | 0.3959 | 3 |
Calculate a Rank within Window
Input records:
country | product | score |
---|---|---|
USA | Product 01 | 0.3461 |
USA | Product 02 | 0.3096 |
USA | Product 03 | 0.7573 |
USA | Product 04 | 0.8856 |
USA | Product 05 | 0.3959 |
Canada | Product 01 | 0.4744 |
Canada | Product 02 | 0.7804 |
Canada | Product 03 | 0.7071 |
Canada | Product 04 | 0.2232 |
Canada | Product 05 | 0.5828 |
Stage definition:
This stage definition is similar to the example above, but we are adding partitioning by country. This means that there will be a window for each country, and a separate ranking will be calculated for each window.
{
// this defines the window
"paritionBy": "country",
// sort is required for ranking
"sortBy": [
{ "field": "score", "order": "descending" }
],
// new output field to add
"output": {
"high_score_rank_within_country": "rank()" // this is the window operation
}
}
Output records:
Notice that the ranking occurs within country, because partitioning made each country a separate window.
country | product | score | high_score_rank_within_country |
---|---|---|---|
USA | Product 01 | 0.3461 | 4 |
USA | Product 02 | 0.3096 | 5 |
USA | Product 03 | 0.7573 | 2 |
USA | Product 04 | 0.8856 | 1 |
USA | Product 05 | 0.3959 | 3 |
Canada | Product 01 | 0.4744 | 4 |
Canada | Product 02 | 0.7804 | 1 |
Canada | Product 03 | 0.7071 | 2 |
Canada | Product 04 | 0.2232 | 5 |
Canada | Product 05 | 0.5828 | 3 |
Calculate a Net
In this example we will average across records (rows).
Taking an average across fields (columns) would be done using the AddFields stage.
Input records:
country | product | score |
---|---|---|
USA | Product 01 | 0.3461 |
USA | Product 02 | 0.3096 |
USA | Product 03 | 0.7573 |
USA | Product 04 | 0.8856 |
USA | Product 05 | 0.3959 |
Canada | Product 01 | 0.4744 |
Canada | Product 02 | 0.7804 |
Canada | Product 03 | 0.7071 |
Canada | Product 04 | 0.2232 |
Canada | Product 05 | 0.5828 |
Stage definition:
This time we are defining two output fields. One calculates an average and one calculates a max. And since we are partitioning by Country, a separate calculation window will be created for each country.
{
// this defines the window (optional)
"paritionBy": "country",
// note: sort is not required because we aren't ranking
// new output field to add
"output": {
"avg_score": "avg(score)"
"high_score": "max(score)"
}
}
Output records:
Notice that since the average and max calculations below were windowed by country, one avg_score is shared by all USA records, and a different avg_score is shared by Canada records.
country | product | score | avg_score | high_score |
---|---|---|---|---|
USA | Product 01 | 0.3461 | 0.5389 | 0.8856 |
USA | Product 02 | 0.3096 | 0.5389 | 0.8856 |
USA | Product 03 | 0.7573 | 0.5389 | 0.8856 |
USA | Product 04 | 0.8856 | 0.5389 | 0.8856 |
USA | Product 05 | 0.3959 | 0.5389 | 0.8856 |
Canada | Product 01 | 0.4744 | 0.55358 | 0.7804 |
Canada | Product 02 | 0.7804 | 0.55358 | 0.7804 |
Canada | Product 03 | 0.7071 | 0.55358 | 0.7804 |
Canada | Product 04 | 0.2232 | 0.55358 | 0.7804 |
Canada | Product 05 | 0.5828 | 0.55358 | 0.7804 |