Defining aggregations

<< Click to Display Table of Contents >>

ETL > 14.0 > Implementation Guide > Tutorial and implementation guide > Steps > Mapping > Multiple columns 

Defining aggregations

Aggregations are methods, that - given a sequence of multiple input columns - produce a single value.

In current implementations, aggregations are used:

by multiple column mappings to define how the values from two or more columns are to be transformed into a single column

by join mapping, where aggregation defines the way of solving conflicts between similar columns in two or more tables.

 

The library supports the following aggregation methods:

Avg
Selects the average value of one or more values

Max
Select the maximum of one or more values

Min
Select the minimum of one or more values

Sum
Select the sum of one or more values

Coalesce (alias FirstNonNull)
Select first not-null value from the list of one or more values (from left to right). Note: Empty strings are treated as valid values, and not omitted.

FirstNonEmpty
Select first not-empty value from the list of one or more values (from left to right). Note: Both NULL and "" (empty string) are omitted.

Concat
Join all given non-empty values using a specific separator (from left to right). For this operation, a separator may be defined (default is a pipe character | ).
Bear in mind that this operation omits NULL values.

 

In places where the aggregation is expected, a simple string (one of the values from the table above) is sufficient to define the aggregation, the values are case insensitive. For example, this is how column mapping defines that the maximum of three values should be taken:

    "target-column-name": {

      "columns": ["column1", "column2", "column3"],

      "aggregation": "max"

    }

 

The string syntax is a shorthand form of the full object notation, which would have the following syntax:

    "target-column-name": {

      "columns": ["column1", "column2", "column3"],

      "aggregation": {

        "type""max"

      }

    }

 

Both forms are functionally equal. You can use either of these for all operations. However, to use the Concat method with a custom separator, the object notation must be used. The separator is defined by the separator property:

 

    "target-column-name": {

      "columns": ["column1", "column2", "column3"],

      "aggregation": {

        "type""concat",

        "separator"";"

      }

    }