One of the most powerful capabilities of Informatica MDM is its ability to identify duplicate records and merge them into a single golden record. This process is driven by Match and Merge rules.
If you’re preparing for real-time projects or interviews, mastering Match & Merge in Informatica MDM is essential. In this tutorial, we’ll walk through the concepts, configuration steps, and practical examples.
What are Match and Merge Rules?
- Match Rules: Define how MDM determines whether two records represent the same entity (e.g., same customer).
- Merge Rules: Define how duplicate records are consolidated into one trusted record (golden record).
Example:
- Two records:
  - John Smith, Email: john.smith@gmail.com
  - J. Smith, Email: john.smith@gmail.com
- Match Rule: Same email + similar name.
- Merge Rule: Pick the most trusted data from both and create a golden record.
Key Concepts to Understand
1. Tokenization
MDM breaks down data into tokens for comparison (e.g., “John Smith” → “John”, “Smith”).
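To make the idea concrete, here is a minimal sketch of word-level tokenization in Python. It only illustrates the concept; the `tokenize` and `shared_tokens` helpers are hypothetical, and the token generation that Informatica MDM performs internally is more involved than a simple split.

```python
import re


def tokenize(value: str) -> list[str]:
    """Split a value into lowercase alphanumeric tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", value.lower())


def shared_tokens(a: str, b: str) -> set[str]:
    """Return the tokens that two values have in common."""
    return set(tokenize(a)) & set(tokenize(b))


print(tokenize("John Smith"))                 # ['john', 'smith']
print(shared_tokens("John Smith", "J. Smith"))  # {'smith'}
```

Comparing tokens rather than raw strings means that differences in case, punctuation, or spacing do not block a match on the shared parts of a value.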
2. Trust and Survivorship
- Trust = a score that ranks how reliable each source system is (e.g., CRM > ERP > Excel).
- Survivorship = how the final golden record values are selected (highest trust wins, most recent update wins, etc.).
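The following sketch shows one way survivorship could be expressed for a single field: the highest trust wins, and the most recent update breaks a tie. The `CellValue` class, the `survive` function, and the dates are assumptions made for illustration, not Informatica objects.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CellValue:
    value: str
    source: str
    trust: int          # higher means more trusted
    last_updated: date


def survive(candidates: list[CellValue]) -> CellValue:
    """Pick the surviving value: highest trust first, then most recent update."""
    return max(candidates, key=lambda c: (c.trust, c.last_updated))


candidates = [
    CellValue("John Smith", "CRM", 90, date(2023, 1, 10)),
    CellValue("J. Smith", "ERP", 80, date(2024, 3, 5)),
]
print(survive(candidates).value)  # John Smith
```

Because the sort key is a tuple, the update date only matters when two candidates carry the same trust score.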
3. Match Types
- Exact Match – Example: Email ID must be the same.
- Fuzzy Match – Example: Names like “Jon” and “John” are considered similar.
- Probabilistic Match – Scores record pairs based on similarity and treats pairs above a threshold as matches.
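The three match types can be sketched in a few lines of Python. This is an illustration only: it uses `difflib.SequenceMatcher` from the standard library as a stand-in for a fuzzy comparison, and a simple weighted score as a stand-in for probabilistic matching. The weights and function names are hypothetical.

```python
from difflib import SequenceMatcher


def exact_match(a: str, b: str) -> bool:
    """Exact match after trimming whitespace and ignoring case."""
    return a.strip().lower() == b.strip().lower()


def fuzzy_score(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0 based on matching character blocks."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Hypothetical weights: email agreement counts more than name similarity.
WEIGHTS = {"email": 0.6, "name": 0.4}


def weighted_score(rec_a: dict, rec_b: dict) -> float:
    """Combine an exact email check and a fuzzy name score into one number."""
    email_part = 1.0 if exact_match(rec_a["email"], rec_b["email"]) else 0.0
    name_part = fuzzy_score(rec_a["name"], rec_b["name"])
    return WEIGHTS["email"] * email_part + WEIGHTS["name"] * name_part


a = {"name": "John Smith", "email": "john.smith@gmail.com"}
b = {"name": "J. Smith", "email": "john.smith@gmail.com"}
print(round(fuzzy_score("Jon", "John"), 2))
print(round(weighted_score(a, b), 2))
```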
Step-by-Step: Configuring Match and Merge in Informatica MDM
Step 1: Create Match Columns
- Open Hub Console → Schema → Match Columns.
- Select the columns (e.g., Email, FirstName, LastName, Phone).
- Define match key types: Exact / Fuzzy / Phonetic.
-- Example: Match Column Setup
Column: EMAIL_ADDRESS    Match Type: Exact
Column: LAST_NAME        Match Type: Fuzzy (Edit Distance: 2)
Step 2: Define Match Rules
- Navigate to Match Rule Set in Hub Console.
- Create a new Match Rule with logical conditions.
Example Rule:
- IF (EMAIL = Exact Match) OR (FIRST_NAME + LAST_NAME = Fuzzy Match)
- THEN consider records as duplicates.
Match Rule Example: (MatchColumn.EMAIL = Exact) OR (MatchColumn.FIRST_NAME = Fuzzy AND MatchColumn.LAST_NAME = Fuzzy)
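To see how such a rule behaves on real values, the logic can be sketched outside the Hub Console. The Python below is an approximation, not Informatica's matching engine: `is_duplicate` is a hypothetical helper, the fuzzy comparison uses `difflib.SequenceMatcher`, and the 0.75 cut-off is an assumed value.

```python
from difflib import SequenceMatcher

FUZZY_THRESHOLD = 0.75  # assumed cut-off for this sketch


def fuzzy(a: str, b: str) -> bool:
    """True when two strings are similar enough to count as a fuzzy match."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= FUZZY_THRESHOLD


def is_duplicate(rec_a: dict, rec_b: dict) -> bool:
    """(EMAIL = Exact) OR (FIRST_NAME = Fuzzy AND LAST_NAME = Fuzzy)."""
    same_email = rec_a["EMAIL"].lower() == rec_b["EMAIL"].lower()
    similar_name = (fuzzy(rec_a["FIRST_NAME"], rec_b["FIRST_NAME"])
                    and fuzzy(rec_a["LAST_NAME"], rec_b["LAST_NAME"]))
    return same_email or similar_name


crm = {"FIRST_NAME": "John", "LAST_NAME": "Smith", "EMAIL": "john.smith@gmail.com"}
erp = {"FIRST_NAME": "J.", "LAST_NAME": "Smith", "EMAIL": "john.smith@gmail.com"}
xls = {"FIRST_NAME": "Jon", "LAST_NAME": "Smyth", "EMAIL": "john.smith@outlook.com"}

print(is_duplicate(crm, erp))  # True: same email
print(is_duplicate(crm, xls))  # True: different email, but both name parts are similar
```

Note that the OR branch is what lets the Excel record match despite its different email address, which is also where false merges tend to come from.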
Step 3: Configure Merge Settings
- Go to Merge Settings.
- Choose how records will merge:
  - Auto Merge: System automatically merges based on rules.
  - Manual Merge: Requires Data Steward approval.
- Define Trust Levels per source system.
-- Example Trust Settings
CRM: 90
ERP: 80
Excel Import: 50
Result: If the same field is present in CRM and ERP, MDM picks the CRM value (higher trust).
Step 4: Run Match and Merge Job
- Execute Match Job from Hub Console.
- Review Matched Sets in the Match Results.
- Execute Merge Job to create Golden Records.
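When reviewing matched sets, it helps to remember that a set is built from pairwise matches. The sketch below groups matched pairs into sets with a small union-find, assuming matches are linked transitively (if A matches B and B matches C, all three land in one set). That assumption, the `build_match_sets` helper, and the extra record `CUST404` are all hypothetical; the product's own grouping may differ.

```python
def build_match_sets(record_ids: list[str],
                     match_pairs: list[tuple[str, str]]) -> list[list[str]]:
    """Group record IDs into matched sets from a list of pairwise matches."""
    parent = {rid: rid for rid in record_ids}

    def find(x: str) -> str:
        # Walk up to the root, shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in match_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    sets: dict[str, list[str]] = {}
    for rid in record_ids:
        sets.setdefault(find(rid), []).append(rid)
    return [sorted(members) for members in sets.values()]


ids = ["CUST101", "CUST202", "CUST303", "CUST404"]
pairs = [("CUST101", "CUST202"), ("CUST202", "CUST303")]
print(build_match_sets(ids, pairs))
# [['CUST101', 'CUST202', 'CUST303'], ['CUST404']]
```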
Example: Customer Data Deduplication
| Customer ID | Name | Email | Source |
|---|---|---|---|
| CUST101 | John Smith | john.smith@gmail.com | CRM |
| CUST202 | J. Smith | john.smith@gmail.com | ERP |
| CUST303 | Jon Smyth | john.smith@outlook.com | Excel |
After Match & Merge
- Golden Record:
  - Name: John Smith (from CRM – highest trust)
  - Email: john.smith@gmail.com
  - Source Systems Linked: CRM, ERP, Excel
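The golden record above can be reproduced with a short Python sketch that applies the example trust settings (CRM 90, ERP 80, Excel 50) to the three rows of the table. The `merge` function and the dictionary layout are assumptions for illustration, and the sketch ranks whole sources rather than modelling per-column trust.

```python
TRUST = {"CRM": 90, "ERP": 80, "Excel": 50}

records = [
    {"id": "CUST101", "name": "John Smith", "email": "john.smith@gmail.com", "source": "CRM"},
    {"id": "CUST202", "name": "J. Smith", "email": "john.smith@gmail.com", "source": "ERP"},
    {"id": "CUST303", "name": "Jon Smyth", "email": "john.smith@outlook.com", "source": "Excel"},
]


def merge(matched: list[dict], trust: dict[str, int]) -> dict:
    """Build a golden record by taking each field from the most trusted source."""
    ranked = sorted(matched, key=lambda r: trust[r["source"]], reverse=True)
    golden = {}
    for field in ("name", "email"):
        # First non-empty value, starting from the most trusted source.
        golden[field] = next((r[field] for r in ranked if r[field]), None)
    golden["sources"] = [r["source"] for r in ranked]
    return golden


print(merge(records, TRUST))
# {'name': 'John Smith', 'email': 'john.smith@gmail.com', 'sources': ['CRM', 'ERP', 'Excel']}
```

Falling through to the next source when a field is empty means a lower-trust system can still contribute values that the higher-trust system lacks.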
Best Practices for Match & Merge
- Always profile data quality before configuring match rules.
- Use standardization (via IDQ) for names, addresses, phone numbers.
- Start with conservative rules and loosen them gradually to avoid false merges.
- Monitor match scores and thresholds to fine-tune accuracy.
- Keep audit logs for merge history (important for compliance).
Common Issues Developers Face
- Over-merging: Match rules that are too loose cause unrelated records to merge.
- Under-merging: Rules that are too strict leave duplicates unmerged.
- Performance Lag: Poor indexing or complex fuzzy logic slows down jobs.
- Trust Conflicts: Misconfigured trust levels cause the wrong values to survive.
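The trade-off between over-merging and under-merging is easy to see by sweeping a similarity threshold over a handful of names. The sketch below is an illustration in Python using `difflib.SequenceMatcher`; the names, the `pairs_at` helper, and the thresholds are made up for the example and do not correspond to Informatica settings.

```python
from difflib import SequenceMatcher
from itertools import combinations

names = ["John Smith", "Jon Smyth", "Jane Smith", "Mary Jones"]


def similarity(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def pairs_at(threshold: float) -> list[tuple[str, str]]:
    """Return every pair of names whose similarity reaches the threshold."""
    return [(a, b) for a, b in combinations(names, 2)
            if similarity(a, b) >= threshold]


for threshold in (0.75, 0.85, 0.95):
    print(threshold, pairs_at(threshold))
```

With a loose threshold, "Jane Smith" is paired with "John Smith" even though they are likely different people (over-merging). With a very strict threshold, "Jon Smyth" no longer pairs with "John Smith" and the duplicate survives (under-merging).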
FAQs
Q1. What happens if two records match but trust scores are equal?
Informatica applies survivorship tie-breakers (e.g., most recent update, highest source priority).
Q2. Can we undo a merge in Informatica MDM?
Yes, Data Stewards can unmerge records if needed, restoring original records.
Q3. What is a typical threshold for fuzzy matches?
Usually between 0.7 and 0.9, depending on data sensitivity.
Q4. How do I test match rules before applying them to the full dataset?
Use the Sample Match option in the Hub Console with a small test dataset.
Q5. Is Match & Merge different in Customer 360 vs Product 360?
The underlying process is the same, but attributes and survivorship logic differ based on domain.





