Configure a Mastering Step Using QuickStart

Before you begin

You need:

About this task

This task configures the mastering step, which combines matching and merging.

Procedure

  1. Navigate to the settings of the flow you want.


    1. In QuickStart's navigation bar, click Flows.
    2. In the Manage Flows table, search for the row containing the flow.
      Tip: To make your search easier, you can sort the table by one of the columns.
    3. Click the flow's name.
  2. In the flow sequence, click the summary box of the mastering step that you want to configure.


    The step detail panel is displayed below the flow sequence panel.
Matching
  1. In the step detail panel at the bottom, click the Matching tab.


  2. Configure the Match Options.
    • To add a new option, click the Add button.
    • To edit an option, click the vertical ellipsis (⋮) and choose Edit Settings.
    • To delete an option, click the vertical ellipsis (⋮) and choose Delete.
    Match Options properties
    Name Description
    Type The matching type: Exact, Synonym, Double Metaphone, Zip, Reduce, or Custom.
    • Exact. Determines if the values of the specified entity property in two or more records are exactly the same.
    • Synonym. Determines if the values of the specified entity property in two or more records are synonyms, according to the specified thesaurus.
    • Double Metaphone. Determines if the values of the specified entity property in two or more records sound similar, based on the Double Metaphone algorithm. For example, "Smith" might sound like "Schmidt".
    • Zip. Determines if the zip/postal code in two or more records match.
    • Reduce. Reduces the significance of certain matches. For example, even if the addresses and last names of two records match, the similarity might not necessarily indicate that the two records refer to the same person, because they might be two members of the same family.
    • Custom. Runs a function in your custom module to compare the values of a specified entity property in two or more records.
    Additional settings for the Exact match type

    Exact match properties

    Property to Match The property whose values to compare.
    Weight A factor that signifies the relative importance of the rule.
    Additional settings for the Synonym match type

    Synonym properties

    Property to Match The property whose values to compare.
    Weight A factor that signifies the relative importance of the rule.
    Thesaurus URI The location of the thesaurus that is stored in a MarkLogic Server database and used to determine synonyms. Learn more: Managing Thesaurus Documents
    Filter A node in the thesaurus to use as a filter. For example, <thsr:qualifier>birds</thsr:qualifier>.

    Learn more: the $filter parameter in thsr:expand.

    Additional settings for the Double Metaphone match type

    Double Metaphone properties

    Property to Match The property whose values to compare.
    Weight A factor that signifies the relative importance of the rule.
    Dictionary URI The location of the phonetic dictionary that is stored in a database and used when comparing words phonetically. Learn more: Custom Dictionaries
    Distance Threshold The threshold below which the phonetic difference (distance) between two strings is considered insignificant; i.e., the strings are similar to each other. Learn more: spell functions
    Collation The URI of the collation to use. A collation specifies the order for sorting strings. Learn more: Encodings and Collations
    Additional settings for the Zip match type

    Zip properties

    Property to Match The property whose values to compare.
    5-vs-9 Match Weight The weight to use if, given one 9-digit zip code and one 5-digit zip code, the first five digits of the zip codes match. Applicable only to US zip codes.
    9-vs-5 Match Weight The weight to use if, given two 9-digit zip codes, the first five digits match and the last four do not. Applicable only to US zip codes.
    Note: To add a weight when all nine digits match, use the Exact matching type to compare the zip codes as strings. If some of your data might have non-standard formats (e.g., using whitespace or em dash instead of the typical hyphen), remember to map your zip code fields to a standard format in your entity.
    Additional settings for the Reduce match type

    Reduce properties

    Properties to Match One or more properties whose values to compare.
    Weight A positive integer that denotes how much to reduce the weight of a match.
    Additional settings for the Custom match type

    Custom match properties

    Property to Match The property whose values to compare.
    Weight A factor that signifies the relative importance of the rule.
    URI The location of the custom module.
    Function The name of the custom function within the custom module.
    Namespace The namespace of the library module that contains the custom function. Leave blank if the custom function is JavaScript code.
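The following Python sketch illustrates how the Exact, Zip, and Reduce options described above might contribute to a total match weight for two candidate records. It is not the product's implementation: the additive scoring model, the option dictionary layout, and the names `normalize_zip`, `zip_weight`, `same_value`, and `total_weight` are all assumptions made for illustration.

```python
import re

def normalize_zip(raw):
    """Keep only the digits, so '12345 6789' and '12345-6789' compare equal."""
    return re.sub(r"\D", "", raw)

def zip_weight(zip_a, zip_b, weight_5_vs_9, weight_9_vs_5):
    """Weight contributed by a Zip option for two US zip codes (0 if neither rule applies)."""
    a, b = normalize_zip(zip_a), normalize_zip(zip_b)
    if len(a) < 5 or a[:5] != b[:5]:
        return 0
    lengths = sorted((len(a), len(b)))
    if lengths == [5, 9]:
        return weight_5_vs_9  # one 5-digit and one 9-digit code, first five digits match
    if lengths == [9, 9] and a[5:] != b[5:]:
        return weight_9_vs_5  # two 9-digit codes, last four digits differ
    return 0

def same_value(rec_a, rec_b, prop):
    """True if both records have the property and the values are identical."""
    value = rec_a.get(prop)
    return value is not None and value == rec_b.get(prop)

def total_weight(rec_a, rec_b, options):
    """Assumed model: matching options add their weight, Reduce options subtract theirs."""
    total = 0
    for opt in options:
        if opt["type"] == "exact":
            if same_value(rec_a, rec_b, opt["property"]):
                total += opt["weight"]
        elif opt["type"] == "zip":
            total += zip_weight(rec_a.get(opt["property"], ""),
                                rec_b.get(opt["property"], ""),
                                opt["weight_5_vs_9"], opt["weight_9_vs_5"])
        elif opt["type"] == "reduce":
            if all(same_value(rec_a, rec_b, p) for p in opt["properties"]):
                total -= opt["weight"]
    return total

# Hypothetical records and options, for illustration only.
rec_a = {"lastName": "Smith", "address": "1 Main St", "zip": "12345-6789"}
rec_b = {"lastName": "Smith", "address": "1 Main St", "zip": "12345"}
options = [
    {"type": "exact", "property": "lastName", "weight": 10},
    {"type": "exact", "property": "address", "weight": 5},
    {"type": "zip", "property": "zip", "weight_5_vs_9": 3, "weight_9_vs_5": 2},
    {"type": "reduce", "properties": ["lastName", "address"], "weight": 4},
]
print(total_weight(rec_a, rec_b, options))  # 10 + 5 + 3 - 4 = 14
```

In this sketch the Reduce option lowers the score when both last name and address match, mirroring the same-family example given for the Reduce type.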
  3. Configure the Match Thresholds.
    • To add a new threshold, click the Add button.
    • To edit a threshold, click the vertical ellipsis (⋮) and choose Edit Settings.
    • To delete a threshold, click the vertical ellipsis (⋮) and choose Delete.
    Match Thresholds properties


    Name Description
    Name The threshold rule name.
    Weight Threshold The threshold with which to compare the total weight of the matches.
    Action What to do if the total weight exceeds the Weight Threshold.
    • Merge. Automatically merges the candidate records, according to the merging rules.
    • Notify. Sends a notification for a human to review the match and decide on the action to take.
    • Custom. Performs actions defined in a custom module.
    Additional settings for the Custom action
    URI The location of the custom module.
    Function The name of the custom function within the custom module.
    Namespace The namespace of the library module that contains the custom function. Leave blank if the custom function is JavaScript code.
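As a rough illustration of how match thresholds relate a total weight to an action, the Python sketch below picks the highest threshold that the total weight exceeds. The threshold names, the dictionary keys, the function `choose_action`, and the rule that the highest exceeded threshold wins are assumptions for this example, not documented product behavior.

```python
def choose_action(total_weight, thresholds):
    """Return (name, action) of the highest threshold exceeded, or None if none is exceeded."""
    exceeded = [t for t in thresholds if total_weight > t["weight_threshold"]]
    if not exceeded:
        return None
    best = max(exceeded, key=lambda t: t["weight_threshold"])
    return best["name"], best["action"]

# Hypothetical threshold rules.
thresholds = [
    {"name": "possible-match", "weight_threshold": 10, "action": "notify"},
    {"name": "definite-match", "weight_threshold": 20, "action": "merge"},
]

print(choose_action(14, thresholds))  # ('possible-match', 'notify')
print(choose_action(25, thresholds))  # ('definite-match', 'merge')
print(choose_action(10, thresholds))  # None, because 10 does not exceed 10
```

The strict comparison follows the wording "exceeds"; whether a total weight equal to the threshold triggers the action in the product is not stated here.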
Merging
  1. In the step detail panel at the bottom, click the Merging tab.


  2. Configure the Merge Options.
    • To add a new option, click the Add button.
    • To edit an option, click the vertical ellipsis (⋮) and choose Edit Settings.
    • To delete an option, click the vertical ellipsis (⋮) and choose Delete.
    Merge Options properties
    Name Description
    Merge Type How the merging is done.
    • Standard. Merging is done as specified in the merge rule.
    • Strategy. Merging is done according to a predefined strategy.
    • Custom. Merging is done using a custom function in a custom module.
    Property to Merge The name of the property to merge.
    Additional settings for the Standard merge type

    Settings for the Standard merge type

    Max Values The maximum number of values to allow in the merged property. The default is 99.
    Max Sources The maximum number of data sources from which to get values to merge. For example, to copy values from a single source, set Max Sources to 1.
    Source Weights The list of data sources and the weights assigned to them.
    Length Weight The weight assigned to the length of a string.
    Additional settings for the Strategy merge type

    Settings for the Strategy merge type

    Strategy Name The predefined strategy or set of settings to use in merging.
    Additional settings for the Custom merge type

    Settings for the Custom merge type

    URI The location of the custom module.
    Function The name of the custom function within the custom module.
    Namespace The namespace of the library module that contains the custom function. Leave blank if the custom function is JavaScript code.
  3. Configure the Merge Strategies.
    • To add a new strategy, click the Add button.
    • To edit a strategy, click the vertical ellipsis (⋮) and choose Edit Settings.
    • To delete a strategy, click the vertical ellipsis (⋮) and choose Delete.
    Merge Strategies properties


    Name Description
    Default? Yes to make this strategy the default.
    Name The name for the strategy.
    Max Values The maximum number of values to allow in the merged property. The default is 99.
    Max Sources The maximum number of data sources from which to get values to merge. For example, to copy values from a single source, set Max Sources to 1.
    Source Weights The list of data sources and the weights assigned to them.
    Length Weight The weight assigned to the length of a string.
  4. (Optional) Set Timestamp Path to the path to a timestamp field within the record.

    This field is used to determine which values to include in the merged property, based on their recency, up to the maximum number specified in the Max Values field in Merge Options (Standard) or in Merge Strategies.

    For example, if Max Values is set to 3 and five records meet the matching criteria, only the three most recent values are included in the resulting record, and the two oldest values are ignored.

    Note: Namespaces used in the path must be defined within the record.
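The recency rule in the Max Values example above can be sketched in Python as follows. This is an illustration only: the function name `most_recent_values`, the (timestamp, value) pair layout, and the ISO 8601 timestamp format are assumptions, not part of the product.

```python
from datetime import datetime

def most_recent_values(candidates, max_values):
    """candidates: list of (iso_timestamp, value) pairs, one per matching record.

    Returns the values ordered newest first, truncated to max_values.
    """
    ordered = sorted(candidates,
                     key=lambda pair: datetime.fromisoformat(pair[0]),
                     reverse=True)
    return [value for _, value in ordered[:max_values]]

# Five hypothetical matching records, each with a timestamp and a phone number.
candidates = [
    ("2020-01-03T00:00:00", "555-0103"),
    ("2020-01-01T00:00:00", "555-0101"),
    ("2020-01-05T00:00:00", "555-0105"),
    ("2020-01-02T00:00:00", "555-0102"),
    ("2020-01-04T00:00:00", "555-0104"),
]

print(most_recent_values(candidates, 3))  # ['555-0105', '555-0104', '555-0103']
```

With Max Values set to 3, the two oldest values (from January 1 and January 2) are dropped.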
  5. Configure the Merge Collections.
    • To add a new collection tag, click the Add button.
    • To edit a collection tag, click the vertical ellipsis (⋮) and choose Edit Settings.
    • To delete a collection tag, click the vertical ellipsis (⋮) and choose Delete.
    Merge Collections properties


    Column Description
    Event The event that triggers some action on a collection.
    • onMerge. When a merge occurs (automatically or manually) because of a match.
    • onNoMatch. If no match is found in the entire source database/file.
    • onArchive. When a record is archived.
    • onNotification. When a notification is sent or logged.
    Default Collections Collection tags that are added to the resulting records by default.
    Additional Collections Collection tags that you specify to be added to the resulting records. To edit this list, click the vertical ellipsis (⋮) under the Action column and choose Edit.
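One way to picture the Merge Collections settings is as a lookup from event to collection tags, where the tags applied to a resulting record are the default tags followed by any additional ones. The Python sketch below assumes that model; the collection tag names and the function `collections_for_event` are hypothetical and do not reflect the product's actual defaults.

```python
def collections_for_event(event, config):
    """Return default plus additional collection tags for an event, without duplicates."""
    entry = config[event]
    # dict.fromkeys keeps first-seen order while dropping repeated tags.
    return list(dict.fromkeys(entry["default"] + entry["additional"]))

# Hypothetical configuration; real default tags may differ.
merge_collections = {
    "onMerge": {"default": ["merged"], "additional": ["customer-golden"]},
    "onNoMatch": {"default": ["unmatched"], "additional": []},
    "onArchive": {"default": ["archived"], "additional": ["audit"]},
    "onNotification": {"default": ["notification"], "additional": ["review-queue"]},
}

print(collections_for_event("onMerge", merge_collections))    # ['merged', 'customer-golden']
print(collections_for_event("onNoMatch", merge_collections))  # ['unmatched']
```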