Match and Merge Rules in Informatica MDM: A Complete Hands-On Guide

One of the most powerful capabilities of Informatica MDM is its ability to identify duplicate records and merge them into a single golden record. This process is driven by Match and Merge rules.

If you’re preparing for real-world projects or interviews, mastering Match & Merge in Informatica MDM is essential. In this tutorial, we’ll walk through the concepts, configuration steps, and practical examples.

What are Match and Merge Rules?

  • Match Rules: Define how MDM determines whether two records represent the same entity (e.g., same customer).
  • Merge Rules: Define how duplicate records are consolidated into one trusted record (golden record).

Example:

  • Two records:
    • John Smith, Email: john.smith@gmail.com
    • J. Smith, Email: john.smith@gmail.com
  • Match Rule: Same email + similar name.
  • Merge Rule: Pick the most trusted data from both and create a golden record.

Key Concepts to Understand

1. Tokenization

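MDM breaks data down into tokens (match keys) before comparing records (e.g., “John Smith” → “John”, “Smith”). The tokens are used to narrow down which records are worth comparing, so that every record does not have to be compared with every other record.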
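
To make the idea concrete, here is a minimal Python sketch of token-based candidate selection. It is an illustration only: `tokenize` and `candidate_pairs` are hypothetical helper names, and the key generation inside Informatica MDM is more sophisticated than splitting on non-alphanumeric characters.

```python
import re
from collections import defaultdict
from itertools import combinations


def tokenize(value: str) -> set[str]:
    """Lowercase a value and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", value.lower()))


def candidate_pairs(records: dict[str, str]) -> set[tuple[str, str]]:
    """Return pairs of record IDs that share at least one token."""
    index = defaultdict(set)  # token -> IDs of the records containing it
    for rec_id, name in records.items():
        for token in tokenize(name):
            index[token].add(rec_id)
    pairs = set()
    for ids in index.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Sample data made up for this sketch.
records = {"A": "John Smith", "B": "J. Smith", "C": "Mary Jones"}
print(tokenize("John Smith"))
print(candidate_pairs(records))
```

Only records A and B share a token ("smith"), so they are the only pair passed on for detailed comparison; record C is never compared with either of them.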

2. Trust and Survivorship

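  • Trust = how much confidence MDM places in a value coming from a given source system (e.g., CRM > ERP > Excel). Trust is configured per source system and column.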
  • Survivorship = how final golden record values are selected (highest trust wins, most recent update wins, etc.).
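
The two survivorship strategies mentioned above can give different answers for the same data. The sketch below compares them in plain Python; the trust scores, phone numbers, and function names are assumptions made for illustration, not Informatica settings.

```python
from datetime import date

# Illustrative trust scores per source system (higher = more trusted).
TRUST = {"CRM": 90, "ERP": 80, "Excel": 50}


def highest_trust_wins(candidates):
    """Return the value that comes from the most trusted source."""
    return max(candidates, key=lambda c: TRUST[c["source"]])["value"]


def most_recent_wins(candidates):
    """Return the value with the latest update date."""
    return max(candidates, key=lambda c: c["updated"])["value"]


# Made-up phone values for one customer, one per source system.
phones = [
    {"source": "ERP", "value": "555-0100", "updated": date(2024, 5, 1)},
    {"source": "CRM", "value": "555-0199", "updated": date(2023, 1, 15)},
    {"source": "Excel", "value": "555-0142", "updated": date(2024, 6, 30)},
]

print(highest_trust_wins(phones))  # the CRM value
print(most_recent_wins(phones))    # the Excel value
```

The same three values yield two different golden values, which is why the survivorship strategy should be agreed on before merges are run.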

3. Match Types

  • Exact Match – Example: Email ID must be the same.
  • Fuzzy Match – Example: Names like “Jon” and “John” are considered similar.
  • Probabilistic Match – Scores records based on similarity.
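
The difference between the match types can be sketched in a few lines of Python. This uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, which is an assumption for illustration and not the algorithm Informatica MDM uses; the 0.8 threshold and the case-insensitive comparison are likewise arbitrary choices.

```python
from difflib import SequenceMatcher


def exact_match(a: str, b: str) -> bool:
    """Exact match after trimming whitespace and ignoring case."""
    return a.strip().lower() == b.strip().lower()


def similarity(a: str, b: str) -> float:
    """Similarity score between 0.0 (nothing in common) and 1.0 (identical)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_match(a: str, b: str, threshold: float = 0.8) -> bool:
    """Fuzzy match: the score must reach the chosen threshold."""
    return similarity(a, b) >= threshold


print(exact_match("john.smith@gmail.com", "John.Smith@gmail.com"))
print(round(similarity("Jon", "John"), 2))
print(fuzzy_match("Jon", "John"))
```

An exact match is a yes/no answer, while a score-based match produces a number that has to be compared against a threshold you choose.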

Step-by-Step: Configuring Match and Merge in Informatica MDM

Step 1: Create Match Columns

  1. Open Hub Console → Schema → Match Columns.
  2. Select the columns (e.g., Email, FirstName, LastName, Phone).
  3. Define match key types: Exact / Fuzzy / Phonetic.

-- Example: Match Column Setup
Column: EMAIL_ADDRESS    Match Type: Exact
Column: LAST_NAME        Match Type: Fuzzy (Edit Distance: 2)
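
To show what an edit distance of 2 means in practice, here is a small, self-contained Levenshtein distance function in Python. It is a sketch of the general technique; `edit_distance` and `last_names_match` are hypothetical names, and this is not a claim about how the Hub computes fuzzy matches internally.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling dynamic-programming row."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (or keep) a character
            ))
        previous = current
    return previous[-1]


def last_names_match(a: str, b: str, max_distance: int = 2) -> bool:
    """Treat two last names as similar if they are within max_distance edits."""
    return edit_distance(a.lower(), b.lower()) <= max_distance


print(edit_distance("Smith", "Smyth"))
print(last_names_match("Smith", "Smyth"))
print(last_names_match("Smith", "Johnson"))
```

"Smith" and "Smyth" differ by a single substitution, so they fall within the limit, while "Smith" and "Johnson" do not.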

Step 2: Define Match Rules

  1. Navigate to Match Rule Set in Hub Console.
  2. Create a new Match Rule with logical conditions.

Example Rule:

  • IF (EMAIL = Exact Match) OR (FIRST_NAME + LAST_NAME = Fuzzy Match)
  • THEN consider records as duplicates.

Match Rule Example:
(MatchColumn.EMAIL = Exact) OR (MatchColumn.FIRST_NAME = Fuzzy AND MatchColumn.LAST_NAME = Fuzzy)
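
The OR/AND structure of this rule can be expressed as data and evaluated generically. The Python sketch below does that under stated assumptions: the `RULE` layout, the comparator functions, and the 0.75 similarity threshold are all invented for illustration, and `difflib` stands in for the Hub's fuzzy matching.

```python
from difflib import SequenceMatcher


def exact(a: str, b: str) -> bool:
    return a.lower() == b.lower()


def fuzzy(a: str, b: str, threshold: float = 0.75) -> bool:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


# Each inner list is an AND-group; the rule fires if ANY group fully matches (OR).
RULE = [
    [("email", exact)],
    [("first_name", fuzzy), ("last_name", fuzzy)],
]


def is_duplicate(rec_a: dict, rec_b: dict, rule=RULE) -> bool:
    """Evaluate the rule on a pair of records."""
    return any(
        all(compare(rec_a[col], rec_b[col]) for col, compare in group)
        for group in rule
    )


crm = {"first_name": "John", "last_name": "Smith", "email": "john.smith@gmail.com"}
erp = {"first_name": "J.", "last_name": "Smith", "email": "john.smith@gmail.com"}
xls = {"first_name": "Jon", "last_name": "Smyth", "email": "john.smith@outlook.com"}

print(is_duplicate(crm, erp))  # same email
print(is_duplicate(crm, xls))  # different email, but both names are similar
print(is_duplicate(erp, xls))  # neither condition holds for this pair
```

Note that the ERP and Excel records do not match each other directly; they are connected only through the CRM record.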

Step 3: Configure Merge Settings

  1. Go to Merge Settings.
  2. Choose how records will merge:
    • Auto Merge: System automatically merges based on rules.
    • Manual Merge: Requires Data Steward approval.
  3. Define Trust Levels per source system.

-- Example Trust Settings
CRM: 90
ERP: 80
Excel Import: 50

Result: If the same field is present in CRM and ERP, MDM picks the CRM value (higher trust).
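
A field-by-field version of this logic might look like the following Python sketch. It assumes, for illustration, that an empty value never overrides a populated one regardless of trust; `build_golden_record` and the sample data are hypothetical.

```python
# Trust levels from the example settings above.
TRUST = {"CRM": 90, "ERP": 80, "Excel Import": 50}


def build_golden_record(records: list[dict]) -> dict:
    """For each field, keep the non-empty value from the most trusted source."""
    golden = {}
    fields = {field for rec in records for field in rec["data"]}
    for field in sorted(fields):
        # Only sources that actually have a value compete for this field.
        contenders = [r for r in records if r["data"].get(field)]
        if contenders:
            winner = max(contenders, key=lambda r: TRUST[r["source"]])
            golden[field] = winner["data"][field]
    return golden


# Made-up records: CRM has no phone number, ERP does.
records = [
    {"source": "CRM", "data": {"name": "John Smith", "phone": ""}},
    {"source": "ERP", "data": {"name": "J. Smith", "phone": "555-0100"}},
]
print(build_golden_record(records))
```

The name comes from CRM because of its higher trust, while the phone number comes from ERP because CRM has nothing to contribute for that field.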

Step 4: Run Match and Merge Job

  • Execute Match Job from Hub Console.
  • Review Matched Sets in the Match Results.
  • Execute Merge Job to create Golden Records.
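
Between the match and merge steps, matched pairs have to be grouped into sets. One common way to do this is transitive grouping with a union-find structure, sketched below in Python. Whether the Hub groups records exactly this way is an assumption; the record ID `CUST404` and the list of pairs are made up for the example.

```python
def group_matches(record_ids: list[str], matched_pairs: list[tuple[str, str]]) -> list[list[str]]:
    """Group records into match sets: records linked by any chain of matches end up together."""
    parent = {rid: rid for rid in record_ids}

    def find(rid: str) -> str:
        # Follow parent links to the root, shortening the path along the way.
        while parent[rid] != rid:
            parent[rid] = parent[parent[rid]]
            rid = parent[rid]
        return rid

    for a, b in matched_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for rid in record_ids:
        groups.setdefault(find(rid), []).append(rid)
    return sorted(sorted(group) for group in groups.values())


ids = ["CUST101", "CUST202", "CUST303", "CUST404"]
pairs = [("CUST101", "CUST202"), ("CUST101", "CUST303")]  # assumed match output
print(group_matches(ids, pairs))
```

Here three records collapse into one set even though only two pairs matched directly, and the unmatched record stays on its own. This transitive effect is worth keeping in mind when reviewing matched sets, because one loose pair can pull an unrelated record into a merge.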

Example: Customer Data Deduplication

Customer ID   Name         Email                    Source
CUST101       John Smith   john.smith@gmail.com     CRM
CUST202       J. Smith     john.smith@gmail.com     ERP
CUST303       Jon Smyth    john.smith@outlook.com   Excel

After Match & Merge

  • Golden Record:
    • Name: John Smith (from CRM – highest trust)
    • Email: john.smith@gmail.com (from CRM; the Excel record's outlook.com address has lower trust)
    • Source Systems Linked: CRM, ERP, Excel

Best Practices for Match & Merge

  • Always profile data quality before configuring match rules.
  • Use standardization (via IDQ, Informatica Data Quality) for names, addresses, and phone numbers.
  • Start with conservative rules and loosen them gradually to avoid false merges.
  • Monitor match scores and thresholds to fine-tune accuracy.
  • Keep audit logs for merge history (important for compliance).
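
To illustrate why standardization matters before matching, here is a minimal Python sketch. In a real project this work is done by a data quality tool such as IDQ; the helper functions below are hypothetical and only show the effect of cleaning values first.

```python
import re


def standardize_email(email: str) -> str:
    """Trim whitespace and lowercase the address."""
    return email.strip().lower()


def standardize_phone(phone: str) -> str:
    """Keep digits only, dropping spaces, dashes, and brackets."""
    return re.sub(r"\D", "", phone)


def standardize_name(name: str) -> str:
    """Collapse repeated whitespace and apply title case."""
    return " ".join(name.split()).title()


print(standardize_email("  John.Smith@GMAIL.com "))
print(standardize_phone("(555) 010-0199"))
print(standardize_name("  john   SMITH "))
```

After standardization, values that differ only in case, spacing, or punctuation become identical, so even a strict exact match rule can catch them.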

Common Issues Developers Face

  • Over-merging: Match rules that are too loose cause unrelated records to merge.
  • Under-merging: Rules that are too strict leave duplicates unmerged.
  • Performance Lag: Poor indexing or complex fuzzy logic slows jobs.
  • Trust Conflicts: Wrong survivorship when trust levels are misconfigured.

FAQs

Q1. What happens if two records match but trust scores are equal?
Informatica applies survivorship tie-breakers (e.g., most recent update, highest source priority).

Q2. Can we undo a merge in Informatica MDM?
Yes, Data Stewards can unmerge records if needed, restoring original records.

Q3. What is a typical threshold for fuzzy matches?
Usually between 0.7 and 0.9, depending on data sensitivity.
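
The effect of the threshold can be explored with a quick experiment before touching the real configuration. The Python sketch below scores a few made-up name pairs with `difflib.SequenceMatcher`, which is a stand-in similarity measure and not the Hub's scoring, and shows which pairs would match at different thresholds.

```python
from difflib import SequenceMatcher

# Made-up name pairs: a near-identical pair, a plausible variant, and an unrelated pair.
pairs = [
    ("John Smith", "John Smithe"),
    ("John Smith", "Jon Smyth"),
    ("John Smith", "Mary Jones"),
]


def score(a: str, b: str) -> float:
    """Similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def matches_at(threshold: float, pairs: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return the pairs whose score reaches the threshold."""
    return [pair for pair in pairs if score(*pair) >= threshold]


for threshold in (0.3, 0.8, 0.9):
    print(threshold, matches_at(threshold, pairs))
```

With a very low threshold even the unrelated pair matches (over-merging), while at 0.9 the "Jon Smyth" variant is missed (under-merging).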

Q4. How do I test match rules before applying them to the full dataset?
Use the Sample Match option in Hub Console with a small test dataset.

Q5. Is Match & Merge different in Customer 360 vs Product 360?
The underlying process is the same, but attributes and survivorship logic differ based on domain.