Data Source

Reads records from a local file, database or API.

Local file:

Reading SPSS files
Reading CSV files
Reading Excel files

Database:

Reading from SQL Server
Reading from MongoDB

API:

Reading from an API

note

todo: Document the weight option.

Reading SPSS files

Files of type .sav (SPSS file) and .zsav (compressed SPSS file) are supported.

Set "type" equal to "file".

Provide a "relativeFileName", which is either a full filename with path or a partial filename whose path is relative to the location of the project file. Watch the case of relativeFileName: it may be treated as case sensitive, so match the file's actual name exactly.

{
    "type": "file",
    "relativeFileName": "something.zsav"
}
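The path rule for relativeFileName can be sketched in a few lines of Python. This is only an illustration of the rule as described, not the reader's actual code: resolve_data_file is a hypothetical helper, and PurePosixPath is used so the result is the same on every platform.

```python
from pathlib import PurePosixPath


def resolve_data_file(project_file: str, relative_file_name: str) -> PurePosixPath:
    """Resolve relativeFileName against the folder that holds the project file.

    Hypothetical helper, for illustration only.
    """
    candidate = PurePosixPath(relative_file_name)
    if candidate.is_absolute():
        # A full filename with path is used as-is.
        return candidate
    # A partial filename is taken relative to the project file's folder.
    return PurePosixPath(project_file).parent / candidate
```

For example, a project file at /projects/demo/project.json with "relativeFileName": "data/something.zsav" would resolve to /projects/demo/data/something.zsav under this sketch.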

SPSS Options

No options are currently supported for SPSS files.

Reading CSV files

{
    "type": "file",
    "relativeFileName": "something.csv",
    "options": { 
        // see CSV options below
    }
}

CSV Options

hasTableNameRow is a boolean value specifying whether the first row of the CSV file should be treated as a table name. This option is deprecated and exists only for legacy imports.

hasFieldTypesRow is a boolean value specifying whether or not there's a row preceding the field names row which specifies the type for each field. This is just one way to specify field types. The other way is to provide an array of fields (see below).

Supported field types (case insensitive) are:

number, numeric, int, double, float (these are all treated as "number")
string
bool, boolean
date, dateTime
object
labelCell
resultCell
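As a rough sketch of what case-insensitive matching of these names could look like: the function below folds the numeric aliases into "number", as the list states, and returns every other type under the spelling used above. The name normalize_field_type, the whitespace stripping and the error raised for unknown names are assumptions made for the example, not documented behaviour.

```python
# Aliases that the list above says are all treated as "number".
NUMBER_ALIASES = {"number", "numeric", "int", "double", "float"}

# Remaining type names, spelled as in the list above.
OTHER_TYPES = ["string", "bool", "boolean", "date", "dateTime",
               "object", "labelCell", "resultCell"]
_BY_LOWER = {name.lower(): name for name in OTHER_TYPES}


def normalize_field_type(raw: str) -> str:
    """Map a field type name, in any letter case, to its listed spelling."""
    key = raw.strip().lower()  # stripping whitespace is an assumption
    if key in NUMBER_ALIASES:
        return "number"
    if key in _BY_LOWER:
        return _BY_LOWER[key]
    # How the real reader reports unknown types is not documented.
    raise ValueError(f"Unsupported field type: {raw!r}")
```

Under this sketch, "INT" and "Float" both come back as "number", and "DATETIME" comes back as "dateTime".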

fields is an array of field specifications. It provides a way to specify exactly which fields to read from the file. The data file must always contain a row of field names; the fields array only tells the data reader which of those fields to use and which to ignore. It can also be used to specify field types.

Each field entry should be a JSON object with the following properties:

name (required) string
type (optional) string. See Supported field types above. If specified, this overrides the type given in the field types row, if one exists.
label (optional) string. Not implemented.

{
    "type": "file",
    "relativeFileName": "something.csv",
    "options": {
        "fields": [
            { "name": "uid", "type": "string" },
            { "name": "product_id", "type": "int" },
            { "name": "product_label", "type": "string", "label": "Product Description" },
            { "name": "price", "type": "number" }
        ]
    }
}
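To make the interaction between hasFieldTypesRow and the fields array concrete, here is a minimal Python sketch using the standard csv module. It is not the reader's implementation: read_csv_fields is a hypothetical function, values are left as strings rather than converted, and raising an error for a field missing from the file is an assumption.

```python
import csv
import io


def read_csv_fields(text, fields, has_field_types_row=False):
    """Return (types, records) for the fields named in `fields` only."""
    rows = list(csv.reader(io.StringIO(text, newline="")))
    # The optional types row comes before the field names row.
    types_row = rows.pop(0) if has_field_types_row else None
    header = rows.pop(0)
    row_types = dict(zip(header, types_row)) if types_row else {}

    selected = []
    for spec in fields:
        name = spec["name"]
        index = header.index(name)  # ValueError if the file lacks the field
        # A type on the field entry overrides the one from the types row.
        selected.append((name, index, spec.get("type", row_types.get(name))))

    types = {name: field_type for name, _, field_type in selected}
    records = [{name: row[index] for name, index, _ in selected} for row in rows]
    return types, records
```

Given a file whose first two rows are string,int,string,number and uid,product_id,product_label,price, asking for uid and for price with "type": "string" would keep only those two columns and report price as string rather than number.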

Reading Excel files

Files of type .xlsx (Office Open XML Excel) are supported.

Set "type" equal to "file".

Provide a "relativeFileName", which is either a full filename with path or a partial filename whose path is relative to the location of the project file. Watch the case of relativeFileName: it may be treated as case sensitive, so match the file's actual name exactly.

By default the starting cell is A1 of the first sheet, but this can be changed using options. Table columns are determined by seeking to the right of the starting cell until a blank cell is found. Table rows are determined by seeking down until a row is found which contains empty cells for every table column. If filtering is enabled in Excel, that filter will be ignored.

{
    "type": "file",
    "relativeFileName": "something.xlsx",
    "options": { 
        // see Excel options below
    }
}
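The table detection rule described above (seek right for the columns, then seek down for the rows) can be sketched on a plain grid of cell values. This is an illustration under stated assumptions, not the reader's code: find_table is a hypothetical function, the sheet is modelled as a list of lists, blank cells are None or an empty string, and the table always starts in column A. The start_row parameter mirrors the 1-based startRow option.

```python
def find_table(grid, start_row=1):
    """Return the table rows found from column A of `start_row` (1-based)."""

    def cell(r, c):
        # Cells outside the stored grid count as blank.
        if r < len(grid) and c < len(grid[r]):
            return grid[r][c]
        return None

    def blank(value):
        return value is None or value == ""

    first = start_row - 1  # convert to a 0-based index

    # Seek right from the starting cell until a blank cell is found.
    width = 0
    while not blank(cell(first, width)):
        width += 1

    # Seek down until a row is blank in every table column.
    rows = []
    r = first
    while width and any(not blank(cell(r, c)) for c in range(width)):
        rows.append([cell(r, c) for c in range(width)])
        r += 1
    return rows
```

Note that a row with some blank cells still belongs to the table; only a row that is blank in every table column ends it. Values to the right of the first blank column are ignored.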

Excel Options

Selecting a sheet

Excel files may contain multiple sheets. By default, the first sheet will be used and all other sheets will be ignored. To specify exactly which sheet to use, add one of the following options:

sheetName (NOT IMPLEMENTED) specifies the name of the sheet from which to read data. This takes priority over sheetIndex if both are provided.

{
    "type": "file",
    "relativeFileName": "something.xlsx",
    "options": {
        "sheetName": "Table 1"
    }
}

sheetIndex (NOT IMPLEMENTED) is the 1-based index of the sheet from which to read data.

{
    "type": "file",
    "relativeFileName": "something.xlsx",
    "options": {
        "sheetIndex": 2 // uses the 2nd sheet
    }
}

Selecting a starting row

startRow is the first row from which to start reading data. This is a 1-based index, so it matches the row numbering shown in Excel. If startRow is not specified, the first row will be used by default.

{
    "type": "file",
    "relativeFileName": "something.xlsx",
    "options": {
        "startRow": 4
    }
}

note

todo: Add startColumn and startCell options alongside startRow. The Excel reader should also support the fields array that exists for reading CSV files.

Reading from SQL Server

Not documented

Reading from MongoDB

Not documented

Reading from an API

Not documented
