Deduplicator Connector Parameters

Create Legacy Tables

When enabled, the connector creates additional legacy-format output tables alongside the standard deduplication result tables. Enable this option only when integration with older downstream systems requires it.

Technical Name	createLegacyTables
Category	Functional
Type	Boolean
Default Value	false
Example Values	true

Type

Selects the entity type to deduplicate. Each task instance handles one entity type. To deduplicate devices, software, and users, create a separate connector task for each.

Technical Name	type
Category	Functional
Type	Select
Options	devices (Device Deduplication), software (Software Deduplication), users (User Deduplication)
Required	Yes
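
Because each task handles a single entity type, covering all three types means three parameter sets. The following is a minimal sketch of that idea, not the connector's actual task format: the helper `build_task_parameters` and the dictionary layout are hypothetical, and only the parameter names `type` and `createLegacyTables` and the three option values come from this page.

```python
# Option values of the `type` parameter, as listed in the table above.
ENTITY_TYPES = ("devices", "software", "users")


def build_task_parameters(entity_type):
    """Return a hypothetical parameter set for one Deduplicator task."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unsupported entity type: {entity_type!r}")
    return {
        "type": entity_type,          # required, one entity type per task
        "createLegacyTables": False,  # documented default
    }


# One task per entity type.
tasks = [build_task_parameters(entity_type) for entity_type in ENTITY_TYPES]
```

How these parameter sets are submitted depends on how tasks are created in your environment.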

Deduplicate Detection Settings

Defines how closely records must match to be considered duplicates. Select a preset profile or use Custom to define your own thresholds.

Broad	Flags records with lower similarity. Captures more potential duplicates but may include false positives.
Standard	Balances coverage and precision. Recommended for most deployments.
Precise	Only flags highly similar records. Reduces false positives but may miss some duplicates.
Custom	Define your own matching thresholds and criteria. Exposes additional settings for field weights, matching thresholds, detection options, source priorities, and merge strategies.

Field Weight Settings (Custom Mode)

Available when Custom detection is selected. Defines the relative importance of each field when calculating the overall similarity score between two records. Fields with higher weights have greater influence on the matching decision.

The following fields are configurable per entity type:

Devices	DeviceId, SerialNumber, Uuid, Bios, HostName, Fqdn, Domain, MacAddress, IpAddress
Software	Product, Publisher, ProductVersion, Edition, LanguageCode
Users	Email, UserPrincipalName, Login, UserKey, ImportId, FirstName, LastName
Applies to	Custom detection mode only
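
To illustrate how field weights can influence an overall similarity score, here is a minimal sketch that computes a weighted average of per-field similarities. It is an assumption for illustration only: the connector's real scoring formula is not documented here, the weight values are invented, and the string comparison uses Python's `difflib.SequenceMatcher` as a stand-in for whatever similarity measure the connector applies.

```python
from difflib import SequenceMatcher

# Hypothetical weights for three of the device fields listed above.
DEVICE_WEIGHTS = {"SerialNumber": 5.0, "HostName": 2.0, "IpAddress": 1.0}


def field_similarity(a, b):
    """Return a similarity in [0, 1] for two field values, ignoring case."""
    return SequenceMatcher(None, str(a).lower(), str(b).lower()).ratio()


def weighted_similarity(rec_a, rec_b, weights):
    """Weighted average of field similarities over fields set in both records."""
    total = 0.0
    weight_sum = 0.0
    for field, weight in weights.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if not a or not b:
            continue  # skip fields that are missing or empty on either side
        total += weight * field_similarity(a, b)
        weight_sum += weight
    return total / weight_sum if weight_sum else 0.0
```

With these example weights, two records that share a serial number but differ in host name score higher than two records that share a host name but differ in serial number, which is the effect a higher weight is meant to have.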

Detection Settings (Custom Mode)

Available when Custom detection is selected. Controls fine-grained duplicate identification behavior.

Global Similarity Threshold	The minimum overall similarity score required for two records to be flagged as duplicates. Lower values increase recall; higher values increase precision.
ML Matching	When enabled, the connector uses machine learning to identify duplicates based on patterns in the data, in addition to the configured field weights.
Strong Identifier Match	When enabled, at least one critical identifier field (such as Serial Number or UUID) must match exactly before any other similarity factors are considered. This prevents false positives caused by coincidental similarity on less reliable fields.
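
The interaction between the Global Similarity Threshold and Strong Identifier Match can be sketched as a gate followed by a threshold check. This is a simplified illustration under stated assumptions: the function names, the default threshold of 0.8, and the identifier set are hypothetical, and ML Matching is not modeled.

```python
# Assumed set of critical identifier fields; the actual set is not listed here.
STRONG_IDENTIFIERS = ("SerialNumber", "Uuid")


def has_strong_match(rec_a, rec_b, identifiers=STRONG_IDENTIFIERS):
    """True if at least one identifier is set in both records and matches exactly."""
    for field in identifiers:
        a, b = rec_a.get(field), rec_b.get(field)
        if a and b and a == b:
            return True
    return False


def is_duplicate(rec_a, rec_b, overall_score, threshold=0.8, require_strong_match=True):
    """Apply the strong-identifier gate first, then the similarity threshold."""
    if require_strong_match and not has_strong_match(rec_a, rec_b):
        return False  # no exact identifier match, other similarity is ignored
    return overall_score >= threshold
```

In this sketch, a pair with a high overall score but no exact identifier match is rejected when the gate is on, and accepted when it is off.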

Source Priority Settings

Determines which data source is preferred when merging conflicting field values from duplicate records. The source with the highest priority provides its field values to the merged record.

Two priority modes are available:

•Standard — Uses the default source priority order defined for each entity type.

•Custom — Allows you to define a custom global source priority order, and optionally override the priority for specific fields individually.
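
A minimal sketch of the Custom mode, assuming a global priority list with optional per-field overrides. The source names (`inventory_agent`, `directory`, `cmdb`) and the function `pick_by_priority` are hypothetical, and the sketch assumes that the highest-priority source present in the duplicate group supplies the value even if that value is empty.

```python
def pick_by_priority(values_by_source, global_priority, field=None, field_overrides=None):
    """Return the value from the highest-priority source present in the group.

    values_by_source maps a source name to that source's value for one field.
    field_overrides maps a field name to a priority list that replaces the
    global order for that field only.
    """
    order = (field_overrides or {}).get(field, global_priority)
    for source in order:
        if source in values_by_source:
            return values_by_source[source]
    return None  # none of the prioritized sources contributed a record


# Hypothetical sources, listed from highest to lowest priority.
GLOBAL_PRIORITY = ["inventory_agent", "directory", "cmdb"]
host_values = {"cmdb": "host-a", "directory": "host-b"}

default_pick = pick_by_priority(host_values, GLOBAL_PRIORITY, field="HostName")
override_pick = pick_by_priority(
    host_values,
    GLOBAL_PRIORITY,
    field="HostName",
    field_overrides={"HostName": ["cmdb", "directory"]},
)
```

Here `default_pick` comes from `directory`, the highest-priority source present, while the per-field override makes `cmdb` win for `HostName` only.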

Field Merge Strategies

Controls how field values are merged when duplicate records are combined. Two modes are available:

•Standard — Uses the default merge behavior, which respects the configured source priority.

•Customize Default Fields — Allows you to define per-field merge strategies. The following strategies are available:

default	Uses the source priority order to select the field value.
min	Selects the minimum value across all duplicate records.
max	Selects the maximum value across all duplicate records.
concatenate	Combines all distinct values from duplicate records into a single string.
coalesce	Uses the first non-empty value found across duplicate records, in source priority order.
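
The five strategies can be sketched as follows. This is an illustrative approximation, not the connector's implementation: it assumes the input values are already ordered by source priority (highest first), that empty values are ignored by every strategy except `default`, and that `concatenate` joins values with a comma and a space.

```python
def merge_field(values, strategy="default"):
    """Merge one field's values from a group of duplicate records.

    values must be ordered by source priority, highest first.
    """
    if not values:
        return None
    if strategy == "default":
        return values[0]  # value of the highest-priority source, even if empty
    non_empty = [v for v in values if v not in (None, "")]
    if strategy == "coalesce":
        return non_empty[0] if non_empty else None
    if strategy == "min":
        return min(non_empty) if non_empty else None
    if strategy == "max":
        return max(non_empty) if non_empty else None
    if strategy == "concatenate":
        # dict.fromkeys removes repeats while keeping first-seen order
        distinct = list(dict.fromkeys(str(v) for v in non_empty))
        return ", ".join(distinct)
    raise ValueError(f"unknown merge strategy: {strategy!r}")
```

The difference between `default` and `coalesce` shows up when the highest-priority source has an empty value: `default` keeps it, while `coalesce` falls through to the next source that has one.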

Write Field Scores

When enabled, the connector writes per-field similarity scores for each detected duplicate pair to an additional output table. Use this option to debug matching behavior or tune detection thresholds. Enabling it produces a large table and significantly increases processing time.

Technical Name	write_field_scores
Category	Functional
Type	Boolean
Default Value	false
Example Values	true

SQL Query Timeout

The maximum time in seconds that the connector waits for a SQL query to complete. Increase this value for large datasets or slow network connections.

Technical Name	sql_timeout
Category	Functional
Type	Integer
Default Value	600
Allowed Range	30 – 3600 seconds
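
A value can be checked against the documented default and range before it is applied. The sketch below is a hypothetical pre-flight check; how the connector itself reacts to an out-of-range value is not stated here, and the function name `validate_sql_timeout` is an assumption.

```python
SQL_TIMEOUT_DEFAULT = 600   # documented default, in seconds
SQL_TIMEOUT_MIN = 30        # documented lower bound, in seconds
SQL_TIMEOUT_MAX = 3600      # documented upper bound, in seconds


def validate_sql_timeout(value=None):
    """Return a usable sql_timeout, falling back to the default when unset."""
    if value is None:
        return SQL_TIMEOUT_DEFAULT
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("sql_timeout must be an integer number of seconds")
    if not SQL_TIMEOUT_MIN <= value <= SQL_TIMEOUT_MAX:
        raise ValueError(
            f"sql_timeout must be between {SQL_TIMEOUT_MIN} and {SQL_TIMEOUT_MAX} seconds"
        )
    return value
```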

Configuration