Skip to main content

Creating New Fields

A new field is defined as an <expression>.

Rollups

These syntax examples a chained ternary condition operator, which is an if-then-else statement.

// Recode age variable into groups
(age < 18) ? null // 0 to 17 is coded as null
: (age < 30) ? 1 // 18 to 29 is coded as 1
: (age < 40) ? 2 // 30 to 39 is coded as 2
: (age < 50) ? 3 // 40 to 49 is coded as 3
: (age < 60) ? 4 // 50 to 59 is coded as 4
5 // 60+ is coded as 5
// Recode income variable into groups
(hhincome < 25000) ? 1 // $0 to <$25k is coded as 1
: (hhincome < 50000) ? 2 // $25k to <$50k is coded as 2
: (hhincome < 75000) ? 3 // $50k to <$75k is coded as 3
: (hhincome < 100000) ? 4 // $75k to <$100k is coded as 4
: 5 // $100k+ is coded as 5
tip

Field values can be numeric or strings. But in calculations, filters are likely more performant when groups are defined numerically.

// Recode state into integer Region (more performant)
state in ("AK", "HI") ? 1 // Pacific
: state in ("WA", "OR", "CA", "NV", "AZ", "UT", "ID", "WY") ? 2 // West
: state in ...
: null

// Recode state into string Region (less performant, but okay)
state in ("AK", "HI") ? "Pacific"
: state in ("WA", "OR", "CA", "NV", "AZ", "UT", "ID", "WY") ? "West"
: state in ...
: null

Rebasing

Expanding the base (include respondents)
// Re-base own_product so that it includes all respondents, not just those answering
ifnull(own_product,0) // converts all missing values to 0
Reducing the base (exclude respondents)
// Re-base like_product so that it only includes respondents who own it
own_product==1 ? like_product : null

Netting multiple vars

tip

When netting, it's important to consider missing values. Be sure to understand how or and and work against missing values. (Rob to provide links for that)

// Respondent owns at least one product:
(own_product_1==1 or own_product_2==1 or own_product_3==1) ? 1 : 0

Unrolling picks

Consider that there exists a list of 20 brands in a study. A question "Q10" in the survey randomly picks 3 brands per respondent and asks the respondent about only those 3.

The survey data are stored as:

respidpick1pick2pick3q10_pick1q10_pick2q10_pick3
1brand 1brand 3brand 4653
2brand 4brand 10brand 20341
3brand 1brand 7brand 18239
4brand 2brand 4brand 121910
5brand 3brand 13brand 16511

But to make calculations easier, we want to unroll the pick to give us one variable per brand:

respidq10_brand1q10_brand2q10_brand3q10_brand4...q10_brand20
16.53.
2...31
32....
4.1.9.
5..5..
note

I know some people like to create an additional restructured datafile with 3 fields: {"respid", "brand", "q10"}. That's fine too, but this platform seeks to handle multi-dimensional variables such as these in a way that restructuring isn't necessary. The Collections feature is helpful for this, because it allows an author to tell the system about the dimensions that exist in datasets.

The syntax to unroll:

// q10_brand1:
pick1==1 ? q10_pick1 :
pick2==1 ? q10_pick2 :
pick3==1 ? q10_pick3 :
null

// q10_brand2:
pick1==2 ? q10_pick1 :
pick2==2 ? q10_pick2 :
pick3==2 ? q10_pick3 :
null

...

Here's what it would look like in an addFields statement:

{
$addFields: {
q10_brand1: "(pick1==1) ? q10_pick1 : (pick2==1) ? q10_pick2 : (pick3==1) ? q10_pick3 : null",
q10_brand2: "(pick1==2) ? q10_pick1 : (pick2==2) ? q10_pick2 : (pick3==2) ? q10_pick3 : null",
q10_brand3: "(pick1==3) ? q10_pick1 : (pick2==3) ? q10_pick2 : (pick3==3) ? q10_pick3 : null",
q10_brand4: "(pick1==4) ? q10_pick1 : (pick2==4) ? q10_pick2 : (pick3==4) ? q10_pick3 : null",
q10_brand5: "(pick1==5) ? q10_pick1 : (pick2==5) ? q10_pick2 : (pick3==5) ? q10_pick3 : null",
q10_brand6: "(pick1==6) ? q10_pick1 : (pick2==6) ? q10_pick2 : (pick3==6) ? q10_pick3 : null",
q10_brand7: "(pick1==7) ? q10_pick1 : (pick2==7) ? q10_pick2 : (pick3==7) ? q10_pick3 : null",
q10_brand8: "(pick1==8) ? q10_pick1 : (pick2==8) ? q10_pick2 : (pick3==8) ? q10_pick3 : null",
q10_brand9: "(pick1==9) ? q10_pick1 : (pick2==9) ? q10_pick2 : (pick3==9) ? q10_pick3 : null",
q10_brand10: "(pick1==10) ? q10_pick1 : (pick2==10) ? q10_pick2 : (pick3==10) ? q10_pick3 : null",
q10_brand11: "(pick1==11) ? q10_pick1 : (pick2==11) ? q10_pick2 : (pick3==11) ? q10_pick3 : null",
q10_brand12: "(pick1==12) ? q10_pick1 : (pick2==12) ? q10_pick2 : (pick3==12) ? q10_pick3 : null",
q10_brand13: "(pick1==13) ? q10_pick1 : (pick2==13) ? q10_pick2 : (pick3==13) ? q10_pick3 : null",
q10_brand14: "(pick1==14) ? q10_pick1 : (pick2==14) ? q10_pick2 : (pick3==14) ? q10_pick3 : null",
q10_brand15: "(pick1==15) ? q10_pick1 : (pick2==15) ? q10_pick2 : (pick3==15) ? q10_pick3 : null",
q10_brand16: "(pick1==16) ? q10_pick1 : (pick2==16) ? q10_pick2 : (pick3==16) ? q10_pick3 : null",
q10_brand17: "(pick1==17) ? q10_pick1 : (pick2==17) ? q10_pick2 : (pick3==17) ? q10_pick3 : null",
q10_brand18: "(pick1==18) ? q10_pick1 : (pick2==18) ? q10_pick2 : (pick3==18) ? q10_pick3 : null",
q10_brand19: "(pick1==19) ? q10_pick1 : (pick2==19) ? q10_pick2 : (pick3==19) ? q10_pick3 : null",
q10_brand20: "(pick1==20) ? q10_pick1 : (pick2==20) ? q10_pick2 : (pick3==20) ? q10_pick3 : null",
}
}