Configure a Matching Step Using QuickStart

Before you begin

You need:

About this task

This task configures the matching step, which is the first part of the split-step mastering process.

Procedure

  1. Navigate to the flow definition of the flow you want.

    1. In QuickStart's navigation bar, click Flows.
    2. In the Manage Flows table, search for the row containing the flow.
      Tip: To make your search easier, you can sort the table by one of the columns.
    3. Click the flow's name.
  2. In the flow sequence, click the summary box of the matching step that you want to configure.

    The step detail panel is displayed below the flow sequence panel.

  3. Configure the Match Options.
    • To add a new option, click the Add button.
    • To edit an option, click the vertical ellipsis (⋮) and choose Edit Settings.
    • To delete an option, click the vertical ellipsis (⋮) and choose Delete.
    Match Options fields

    Type
      The matching type.
      • Exact. Determines if the values of the specified property in two or more records are exactly the same.
      • Synonym. Determines if the values of the specified property in two or more records are synonyms, according to the specified thesaurus.
      • Double Metaphone. Determines if the values of the specified property in two or more records sound similar, based on the Double Metaphone algorithm. For example, "Smith" might sound like "Schmidt".
      • Zip. Determines if the zip/postal codes in two or more records match.
      • Reduce. Reduces the significance of certain matches. For example, even if the addresses and last names of two records match, the similarity does not necessarily indicate that the two records refer to the same person, because they might be two members of the same family.
      • Custom. Runs a function in your custom module to compare the values of a specified property in two or more records.

    The fields displayed depend on the selected type (Exact, Synonym, Double Metaphone, Zip, Reduce, or Custom).

    Property to Match
      The property whose values to compare.

    Properties to Match
      One or more properties whose values to compare.

    Weight
      A factor that signifies the relative importance of the rule.

    Weight (Reduce type)
      A positive integer that denotes how much to reduce the weight of a match.

    Thesaurus URI
      The location of the thesaurus that is stored in a MarkLogic Server database and used to determine synonyms. See also: Managing Thesaurus Documents

    Filter
      A node in the thesaurus to use as a filter. For example, <thsr:qualifier>birds</thsr:qualifier>. See the $filter parameter in thsr:expand.

    5-vs-9 Match Weight
      The weight to use if, given one 9-digit zip code and one 5-digit zip code, the first five digits of the zip codes match. Applicable only to US zip codes.

    9-vs-5 Match Weight
      The weight to use if, given two 9-digit zip codes, the first five digits match and the last four do not. Applicable only to US zip codes.

      Note: To add a weight when all nine digits match, use the Exact matching type to compare the zip codes as strings. If some of your data might have non-standard formats (e.g., using whitespace or em dash instead of the typical hyphen), remember to map your zip code fields to a standard format in your entity.

    Dictionary URI
      The location of the phonetic dictionary that is stored in a database and used when comparing words phonetically. See also: Custom Dictionaries

    Distance Threshold
      The threshold below which the phonetic difference (distance) between two strings is considered insignificant; i.e., the strings are similar to each other. See also: spell functions

    Collation
      The URI of the collation to use. A collation specifies the order for sorting strings. See also: Encodings and Collations

    URI
      The location of the custom module.

    Function
      The name of the custom function within the custom module.

    Namespace
      The namespace of the library module that contains the custom function. Blank, if the custom function is JavaScript code.
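To make the role of the weights concrete, the following Python sketch adds up a total weight for one pair of records from a small set of match options. It is only an illustration under stated assumptions, not the product's implementation: the option dictionaries, their keys, and the function names are hypothetical, and it assumes that a Reduce option subtracts its weight when all of its properties match. Only the Exact, Zip, and Reduce types are covered.

```python
import re


def zip_digits(value):
    """Keep only the digits so '12345-6789' and '12345 6789' compare alike."""
    return re.sub(r"\D", "", value or "")


def zip_weight(zip_a, zip_b, five_vs_nine, nine_vs_five):
    """Return the Zip weight for two US zip codes, or 0 if neither case applies."""
    a, b = zip_digits(zip_a), zip_digits(zip_b)
    if len(a) < 5 or a[:5] != b[:5]:
        return 0
    lengths = sorted((len(a), len(b)))
    if lengths == [5, 9]:
        return five_vs_nine  # one 5-digit code and one 9-digit code
    if lengths == [9, 9] and a[5:] != b[5:]:
        return nine_vs_five  # two 9-digit codes that differ in the last four
    return 0  # identical codes are left to an Exact option


def total_weight(rec_a, rec_b, options):
    """Sum the weights of the options that match (hypothetical option format)."""
    total = 0
    for opt in options:
        kind = opt["type"]
        if kind == "exact":
            value = rec_a.get(opt["property"])
            if value is not None and value == rec_b.get(opt["property"]):
                total += opt["weight"]
        elif kind == "zip":
            total += zip_weight(
                rec_a.get(opt["property"]),
                rec_b.get(opt["property"]),
                opt["five_vs_nine"],
                opt["nine_vs_five"],
            )
        elif kind == "reduce":
            # Assumption: the reduction applies only if every listed property matches.
            if all(
                rec_a.get(p) is not None and rec_a.get(p) == rec_b.get(p)
                for p in opt["properties"]
            ):
                total -= opt["weight"]
    return total


# Hypothetical options and records, for illustration only.
options = [
    {"type": "exact", "property": "lastName", "weight": 10},
    {"type": "exact", "property": "address", "weight": 5},
    {"type": "zip", "property": "zip", "five_vs_nine": 3, "nine_vs_five": 2},
    {"type": "reduce", "properties": ["lastName", "address"], "weight": 4},
]
record_a = {"lastName": "Smith", "address": "1 Main St", "zip": "12345-6789"}
record_b = {"lastName": "Smith", "address": "1 Main St", "zip": "12345"}
score = total_weight(record_a, record_b, options)
```

In this example the last name, address, and 5-vs-9 zip options all contribute, and the Reduce option lowers the result because the two records share both a last name and an address.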
  4. Configure the Match Thresholds.
    • To add a new threshold, click the Add button.
    • To edit a threshold, click the vertical ellipsis (⋮) and choose Edit Settings.
    • To delete a threshold, click the vertical ellipsis (⋮) and choose Delete.
    Match Thresholds fields

    Name
      The threshold rule name.

    Weight Threshold
      The threshold with which to compare the total weight of the matches.

    Action
      What to do if the total weight exceeds the Weight Threshold.
      • Merge. Automatically merges the candidate records, according to the merging rules.
      • Notify. Sends a notification for a human to review the match and decide on the action to take.
      • Custom. Performs actions defined in a custom module.

    URI
      (Displayed if the action is Custom.) The location of the custom module.

    Function
      (Displayed if the action is Custom.) The name of the custom function within the custom module.

    Namespace
      (Displayed if the action is Custom.) The namespace of the library module that contains the custom function. Blank, if the custom function is JavaScript code.
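The relationship between a total match weight and the thresholds can be sketched in Python as follows. This is a simplified illustration, not the product's code: the threshold dictionaries and the function name are hypothetical, and it assumes that when several thresholds are exceeded, the one with the highest Weight Threshold determines the action. How a total that exactly equals a threshold is treated is not stated in this topic; the sketch follows the wording "exceeds" and uses a strict comparison.

```python
def choose_action(total_weight, thresholds):
    """Return the action of the highest threshold that the total weight exceeds.

    Returns None if no threshold is exceeded.
    """
    exceeded = [t for t in thresholds if total_weight > t["weight_threshold"]]
    if not exceeded:
        return None
    # Assumption: the highest exceeded threshold wins.
    best = max(exceeded, key=lambda t: t["weight_threshold"])
    return best["action"]


# Hypothetical thresholds, for illustration only.
thresholds = [
    {"name": "definite-match", "weight_threshold": 20, "action": "merge"},
    {"name": "likely-match", "weight_threshold": 10, "action": "notify"},
]

action_for_14 = choose_action(14, thresholds)
action_for_25 = choose_action(25, thresholds)
action_for_5 = choose_action(5, thresholds)
```

With these sample values, a total weight of 14 exceeds only the lower threshold, 25 exceeds both, and 5 exceeds neither.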

What to do next

Important: When running split-step mastering, allow the matching step to complete before starting the merging step.