Mapping

<< Click to Display Table of Contents >>

ETL > 14.0 > Implementation Guide > Tutorial and implementation guide > Steps 

Mapping

Mapping is a process which covers one of the following use cases:

Having a single input table, an output table is created. The new table has the exact number of rows as the first table, but the columns may be different.

Having several input tables, a single output table is created. The new table has the exact number of rows as all selected tables together, and contains all unique columns from the selected tables. This means that the mapping process can concatenate tables, ensuring that all columns are taken over.

 

Mapping step requires that the user defines the list of columns to be written in the new table. There are three ways to do it:

By specifying all required columns

By skipping the specification of required columns but setting the attribute mapRemaing to true.

By using both together - specifying only columns which will be transformed, and using mapRemaining to infer the remaining, undefined columns and include them as well.

 
For more information about map inferring remaining columns to map, read the following chapter: Inferring remaining columns.

 

A column may be taken-over or transformed. The following mappings are available:

oSimple mapping
A column may be simply taken over without any additional processing (simple mapping). The name may be taken as-is or changed to an arbitrary name.

oFixed values
A new column may be created, containing fixed values.

oAuto values
A new column may be created, containing values inserted dynamically (random numbers, date and time, GUIDs, placeholders).

oTransformed values
A new column may be created by transforming the existing column using a set of transform options (uppercasing, lowercasing, switch-case statements etc.)

oAggregated values
A new column may be created by aggregating two or more other columns (max/min value, average, concatenated string, first not-null value etc.)

oCustom values
A custom value may be calculated using SQL syntax.

 

Mapping uses the following syntax:

   {

    "id": 1, // unique ID

    "type": "map",

    "name": "Short description of the step",

    "description": "Long description of what this step does",

    "source": "Name of the source table",

    "columns": {

      // a dictionary of columns

      "TargetName1": {}, // definition of source1

      "TargetName2": {}, // definition of source2
      [...]

     },

    "target": "The name of the output table"

   }

 

At minimum the following properties are required:

ID (must be unique)

Type (must be set to "map")

Source (must be one of the following):

oA string representing the table name.

oA string representing a wildcard to look for table(s).

oAn integer representing the source as another step.

oAn object with property table set to the name of the source table.

oAn object with property step set to the ID of the source step.

oAn array of tables or steps to perform union select.

Either a non-empty list of columns, or the attribute mapRemaining set to true.
 

papercliper

Note:

Note: The target is optional. If omitted, the output table is saved temporarily and - unless not used by any other step - will be discarded once the transformation is finished.

 

Source is by default optional, meaning that the table does not have to be existing. If the table is missing, the step will not be executed and its target table will not be written. There is a way to define that the source is required, in which case in case of a missing source table the step will fail and report an error. More information about setting up required steps can be found in the following chapter: Optional and required tables.

Value types

The value type for each column is inferred from the context. To enforce an arbitrary column type (for example VARCHAR(128)) use the property type. More information about arbitrary types can be found in chapter Arbitrary column types