<< Click to Display Table of Contents >> ETL > 14.0 > Implementation Guide > Tutorial and implementation guide > Steps > Enriching Split look-up |
Split look-up describes the following scenario:
•There is an input table to be enriched with data from another table(s).
•There is another table which has the information to be enriched with.
•However, the second table does not have a normalized column that could be used for matching, because its matching key candidates are aggregated - for example separated by an arbitrary separator.
Consider the following table:
Table EComputer
Computer |
IpAddress |
Domain |
---|---|---|
HOST1 |
192.168.170.1 |
DomainA |
HOST2 |
192.168.170.2 |
<null> |
HOST3 |
192.168.170.6 |
<null> |
Table EAddressDomain
IpAddress |
Domain |
---|---|
192.168.170.1; 192.168.170.2 |
DomainB |
192.168.170.3; 192.168.170.4 |
DomainC |
192.168.170.5 |
DomainD |
We want to enrich the first table, so that its <null> values in the Domain column are set to matching values from EAddressDomain (column Domain). To identify and lookup a proper device, we use its IP address. However, there is no straightforward way to do this, because the table EAddressDomain contains aggregated lookup keys, in form of the column IpAddress, where the values are separated with "; " (semicolon followed by space).
A workaround would be to use another Split step to convert the second table to a normalized version, and then continue with the enrichment. The same can be achieved much quicker and with a less verbose way, by using an automatic cell separator.
{
"id": 1,
"name": "Enrich EComputer with EAddressDomain",
"type": "enrich",
"source": "EComputer",
"target": "EComputer_enriched",
"column": "Domain",
"lookups": [
{
"in": "EAddressDomain",
"key": "IpAddress",
"separator": "; ",
"take": "Domain"
},
{
"take": { "value": "n/a" }
}
]
}S
After adding a separator and executing the JSON, the following table will be created:
Table EComputer_enriched
Computer |
IpAddress |
Domain |
---|---|---|
HOST1 |
192.168.170.1 |
DomainA |
HOST2 |
192.168.170.2 |
DomainB |
HOST3 |
192.168.170.6 |
n/a |
Remarks and points of interest:
•HOST1 remained unchanged, because it already had a value in the Domain column.
•HOST2 received a value of DomainB, because its IP address 192.168.170.2 has been found in the list of IP addresses in the second table.
•HOST3 has received a value of n/a, because its IP address was not found in any of the list of IP addresses in the second table. More details about fallbacks can be found in this section.