Splitting

<< Click to Display Table of Contents >>

ETL > 12.6 > Implementation Guide > Tutorial and implementation guide > Steps 

Splitting

Splitting is a process of taking an input table with N row, and producing N + X row (X >= 0) by splitting a value from a specific column.

Basic syntax for splitting:

    {

      "id": 1,

      "name""<name>",

    "description": "<description>",

      "type""split",

      "source""<source>",

      "target""<target_if_persisted>",

      "column""<the-column-to-split>"

    },

 

Source is by default optional, meaning that the table does not have to be existing. If the table is missing, the step will not be executed and its target table will not be written. There is a way to define that the source is required, in which case in case of a missing source table the step will fail and report an error. More information about setting up required steps can be found in the following chapter: Optional and required tables.

By default, a comma (,) is used as a separator. You can define your own separator by overriding the split property. By default, split is set to vertical (meaning that a row will be split into some more rows, this is currently the only value supported). To change the separator, use object notation:

{

      "id": 1,

    "name": "Short description of the step",

    "description": "Turns cars table from a pivoted table to an unpivoted one.",

      "type""split",

      "source""Cars_Pivot",

      "target""Cars_Split",

      "column""cars",

      "split": { "type""vertical""separator"";" }

    }

 

The separator can have more than one character. Bear in mind, that white-spaces are not trimmed, so it is generally up to you.

Example

Given the following input table Cars_Pivot:

Id

Name

Cars

1

Marcin

BMW,Renault

2

Juergen

Open,Peugeot

3

Ali

VW

4

Denis

Audi

And the following JSON definition:

    {

      "id": 1,

      "name""Unpivot cars table",

      "type""split",

      "source""Cars_Pivot",

      "target""Cars_Split",

      "column""cars"

    }

 

The following table Cars_Split is created:

Id

Name

Cars

1

Marcin

BMW

1

Marcin

Renault

2

Juergen

Open

2

Juergen

Peugeot

3

Ali

VW

4

Denis

Audi

 

papercliper

Note:

There is a special chapter in the Enrichment section (Split lookups) which shows a similar use-case of splitting the value in a column to make a look-up queries. It is recommended to use that approach if the table is used solely for the enrichment purpose.