Configuration

Create Legacy Tables

When enabled, the connector creates additional legacy-format output tables alongside the standard deduplication result tables. Enable this option only when integration with older downstream systems requires it.

 

Technical Name: createLegacyTables
Category: Functional
Type: Boolean
Default Value: false
Example Values: true

 

Type

Selects the entity type to deduplicate. Each task instance handles one entity type. To deduplicate devices, software, and users, create a separate connector task for each.

 

Technical Name: type
Category: Functional
Type: Select
Options:
  devices - Device Deduplication
  software - Software Deduplication
  users - User Deduplication
Required: Yes

 

Deduplicate Detection Settings

Defines how closely records must match to be considered duplicates. Select a preset profile or use Custom to define your own thresholds.

 

Broad: Flags record pairs at a lower similarity level. Captures more potential duplicates but may include false positives.

Standard: Balances coverage and precision. Recommended for most deployments.

Precise: Flags only highly similar records. Reduces false positives but may miss some duplicates.

Custom: Lets you define your own matching thresholds and criteria. Exposes additional settings for field weights, matching thresholds, detection options, source priorities, and merge strategies.

 

Field Weight Settings (Custom Mode)

Available when Custom detection is selected. Defines the relative importance of each field when calculating the overall similarity score between two records. Fields with higher weights have greater influence on the matching decision.

 

The following fields are configurable per entity type:

 

Devices: DeviceId, SerialNumber, Uuid, Bios, HostName, Fqdn, Domain, MacAddress, IpAddress

Software: Product, Publisher, ProductVersion, Edition, LanguageCode

Users: Email, UserPrincipalName, Login, UserKey, ImportId, FirstName, LastName

Applies to: Custom detection mode only
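To illustrate how field weights can shape an overall score, here is a minimal sketch that computes a weighted average of per-field similarities. The connector's actual scoring formula is not documented here, so the weighted average, the string comparison via `difflib.SequenceMatcher`, the weight values, and the names `DEVICE_WEIGHTS`, `field_similarity`, and `overall_similarity` are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical weights for three of the device fields listed above.
DEVICE_WEIGHTS = {"SerialNumber": 5.0, "HostName": 3.0, "MacAddress": 2.0}


def field_similarity(a, b):
    """Return a similarity in [0, 1], or None when either value is missing."""
    if not a or not b:
        return None
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def overall_similarity(rec_a, rec_b, weights):
    """Weighted average of per-field similarities (assumed formula)."""
    total = 0.0
    weight_sum = 0.0
    for field, weight in weights.items():
        sim = field_similarity(rec_a.get(field), rec_b.get(field))
        if sim is None:
            continue  # a field missing from either record does not contribute
        total += weight * sim
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


a = {"SerialNumber": "ABC123", "HostName": "pc-01", "MacAddress": "00:11:22:33:44:55"}
b = {"SerialNumber": "ABC123", "HostName": "pc-02", "MacAddress": "00:11:22:33:44:55"}
score = overall_similarity(a, b, DEVICE_WEIGHTS)  # approximately 0.94
```

Because SerialNumber carries the largest weight in this sketch, the small difference in HostName lowers the score only slightly. Raising the HostName weight would make the same difference count for more.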

 

Detection Settings (Custom Mode)

Available when Custom detection is selected. Controls fine-grained duplicate identification behavior.

 

Global Similarity Threshold: The minimum overall similarity score required for two records to be flagged as duplicates. Lower values increase recall; higher values increase precision.

ML Matching: When enabled, the connector uses machine learning to identify duplicates based on patterns in the data, in addition to the configured field weights.

Strong Identifier Match: When enabled, at least one critical identifier field (such as Serial Number or UUID) must match exactly before any other similarity factors are considered. This prevents false positives caused by coincidental similarity on less reliable fields.
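The interaction between the strong identifier gate and the global threshold can be sketched as follows. This is an illustration only: the identifier set `STRONG_IDENTIFIERS`, the 0 to 1 score scale, and the function names are assumptions, and ML matching is not modeled.

```python
# Assumed identifier set; the text names Serial Number and UUID as examples.
STRONG_IDENTIFIERS = ("SerialNumber", "Uuid")


def has_strong_match(rec_a, rec_b, identifiers=STRONG_IDENTIFIERS):
    """True when at least one identifier is present in both records and equal."""
    for field in identifiers:
        value_a, value_b = rec_a.get(field), rec_b.get(field)
        if value_a and value_b and value_a == value_b:
            return True
    return False


def is_duplicate(rec_a, rec_b, similarity, threshold, require_strong_match):
    """Apply the identifier gate first, then the similarity threshold."""
    if require_strong_match and not has_strong_match(rec_a, rec_b):
        return False  # no exact identifier match, so the score is not considered
    return similarity >= threshold


x = {"SerialNumber": "ABC123", "HostName": "pc-01"}
y = {"SerialNumber": "XYZ999", "HostName": "pc-01"}
flagged_without_gate = is_duplicate(x, y, 0.9, 0.8, require_strong_match=False)
flagged_with_gate = is_duplicate(x, y, 0.9, 0.8, require_strong_match=True)
```

In this sketch the pair is flagged without the gate because its score exceeds the threshold, and rejected with the gate because the serial numbers differ.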

 

Source Priority Settings

Determines which data source is preferred when merging conflicting field values from duplicate records. The source with the highest priority provides its field values to the merged record.

 

Two priority modes are available:

Standard — Uses the default source priority order defined for each entity type.

Custom — Allows you to define a custom global source priority order, and optionally override the priority for specific fields individually.
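A minimal sketch of how a custom global priority order with per-field overrides might resolve a conflicting value. The source names, the data layout, and the function `pick_by_priority` are hypothetical. The sketch also assumes that the highest-priority source present in the duplicate group supplies the value even when that value is empty.

```python
def pick_by_priority(values_by_source, field, global_order, field_overrides=None):
    """Return the value from the highest-priority source that has a record."""
    # A per-field override, when defined, replaces the global order for that field.
    order = (field_overrides or {}).get(field, global_order)
    for source in order:
        if source in values_by_source:
            return values_by_source[source]
    return None


# Hypothetical sources and values for the HostName field of one duplicate group.
hostnames = {"source_b": "pc-01.example.local", "source_c": "PC-01"}
global_order = ["source_a", "source_b", "source_c"]
overrides = {"HostName": ["source_c", "source_b", "source_a"]}

by_global = pick_by_priority(hostnames, "HostName", global_order)
by_override = pick_by_priority(hostnames, "HostName", global_order, overrides)
```

Here `source_a` ranks first globally but has no record in the group, so the global order falls through to `source_b`, while the field-level override prefers `source_c` for HostName.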

 

Field Merge Strategies

Controls how field values are merged when duplicate records are combined. Two modes are available:

Standard — Uses the default merge behavior, which respects the configured source priority.

Customize Default Fields — Allows you to define per-field merge strategies. The following strategies are available:

 

default: Uses the source priority order to select the field value.

min: Selects the minimum value across all duplicate records.

max: Selects the maximum value across all duplicate records.

concatenate: Combines all distinct values from duplicate records into a single concatenated string.

coalesce: Uses the first non-empty value found across duplicate records, in source priority order.
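The five strategies can be sketched as a single function over the values of one field, already ordered by source priority. The function name `merge_field`, the separator, and the treatment of empty values (ignored by every strategy except default) are assumptions; the connector's actual behavior for these edge cases is not specified here.

```python
def merge_field(values, strategy, separator=", "):
    """Merge one field's values; `values` is ordered by source priority."""
    present = [v for v in values if v not in (None, "")]
    if strategy == "default":
        # Assumed: take the highest-priority value as is, even if it is empty.
        return values[0] if values else None
    if strategy == "coalesce":
        return present[0] if present else None
    if strategy == "min":
        return min(present) if present else None
    if strategy == "max":
        return max(present) if present else None
    if strategy == "concatenate":
        # dict.fromkeys removes repeats while keeping first-seen order.
        distinct = list(dict.fromkeys(str(v) for v in present))
        return separator.join(distinct)
    raise ValueError(f"unknown merge strategy: {strategy}")


editions = ["", "Professional", "Enterprise", "Professional"]
merged = {s: merge_field(editions, s)
          for s in ("default", "coalesce", "min", "max", "concatenate")}
```

With these inputs, default keeps the empty highest-priority value, coalesce skips it and returns "Professional", and concatenate yields "Professional, Enterprise".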

 

Write Field Scores

When enabled, the connector writes per-field similarity scores for each detected duplicate pair to an additional output table. This is useful for debugging matching behavior or tuning detection thresholds. Note that enabling this option produces a large table and significantly increases processing time.

 

Technical Name: write_field_scores
Category: Functional
Type: Boolean
Default Value: false
Example Values: true

 

SQL Query Timeout

The maximum time in seconds that the connector waits for a SQL query to complete. Increase this value for large datasets or slow network connections.

 

Technical Name: sql_timeout
Category: Functional
Type: Integer
Default Value: 600
Allowed Range: 30 – 3600 seconds
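As a sketch of how the parameters on this page fit together, the following helper builds one parameter set per entity type and checks the documented timeout range. The technical names, defaults, and range come from this page; the dictionary shape and the helper `build_task_parameters` are assumptions, since the way parameters are submitted to the connector is not described here.

```python
ENTITY_TYPES = ("devices", "software", "users")
SQL_TIMEOUT_RANGE = (30, 3600)  # seconds, per the allowed range documented above


def build_task_parameters(entity_type, sql_timeout=600,
                          create_legacy_tables=False, write_field_scores=False):
    """Return a parameter dictionary for one task (hypothetical shape)."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"type must be one of {ENTITY_TYPES}")
    low, high = SQL_TIMEOUT_RANGE
    is_int = isinstance(sql_timeout, int) and not isinstance(sql_timeout, bool)
    if not is_int or not low <= sql_timeout <= high:
        raise ValueError(f"sql_timeout must be an integer from {low} to {high}")
    return {
        "type": entity_type,
        "createLegacyTables": create_legacy_tables,
        "write_field_scores": write_field_scores,
        "sql_timeout": sql_timeout,
    }


# One task per entity type, each with a longer timeout for a large dataset.
tasks = [build_task_parameters(t, sql_timeout=1200) for t in ENTITY_TYPES]
```

Validating the range before a task is created surfaces an out-of-range timeout early rather than at run time.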