Step configurations are in their own separate files in the your-project-root/steps/step-type directory, and the flow configuration structure includes references to them. |
Step configurations are embedded in the flow configuration structure. |
The mapping configuration is embedded in the mapping step configuration file under $.properties. |
The mapping configuration is in a separate file in the your-project-root/mappings/flow-name-step-name directories. |
The hierarchy in the step definition is flatter without the $.options and $.fileLocations objects. |
Some properties are embedded inside the $.options and $.fileLocations objects. |
Different property names:
- sourceFormat
- targetFormat
|
Different property names:
- inputFileFormat
- outputFormat
|
Additional properties:
- stepId
- selectedSource
- lastUpdated
|
|
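The flattening and renaming described in the rows above can be sketched in plain JavaScript. This is only an illustration, not a Data Hub utility: `flattenStep` is a hypothetical helper, and the pairing of `inputFileFormat` with `sourceFormat` and of `outputFormat` with `targetFormat` is an assumption inferred from the order in which the names are listed.

```javascript
// Assumed pairing of embedded-format names to separate-file-format names.
// The source lists the names but does not state which maps to which.
const RENAMES = new Map([
  ["inputFileFormat", "sourceFormat"],
  ["outputFormat", "targetFormat"],
]);

// Hypothetical helper: hoist the contents of $.options and $.fileLocations
// to the top level of the step and apply the assumed renames.
function flattenStep(step) {
  const { options = {}, fileLocations = {}, ...rest } = step;
  const merged = { ...rest, ...options, ...fileLocations };
  const flat = {};
  for (const [key, value] of Object.entries(merged)) {
    flat[RENAMES.get(key) ?? key] = value;
  }
  return flat;
}
```

If two nested objects define the same key, the later spread wins, so a real migration would need to check for such collisions first.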
In the Matching step configuration file, the match rulesets are stored in matchRulesets. The match type of each rule is set in matchRulesets » matchRules » "matchType": "MATCH-TYPE".
- Exact. The matchType key has the value exact. The match type does not use any options.
Example
"matchRulesets": [
{
"name": "name - Exact",
"weight": 3.5,
"matchRules": [
{
"entityPropertyPath": "name",
"matchType": "exact",
"options": {}
}
]
}
],
- Synonym. The matchType key has the value synonym. The match type uses the thesaurusURI and filter options.
Example
"matchRulesets": [
{
"name": "name - Synonym",
"weight": 3.5,
"matchRules": [
{
"entityPropertyPath": "name",
"matchType": "synonym",
"options": {
"thesaurusURI": "/thesauri/name-synonyms.xml",
"filter": "<qualifier>english</qualifier>"
}
}
]
}
],
- Double Metaphone. The matchType key has the value doubleMetaphone. The match type uses the dictionaryURI and distanceThreshold options.
Example
"matchRulesets": [
{
"name": "name - Double Metaphone",
"weight": 3.5,
"matchRules": [
{
"entityPropertyPath": "name",
"matchType": "doubleMetaphone",
"options": {
"dictionaryURI": "/nameDictionary.json",
"distanceThreshold": 100
}
}
]
}
],
- Zip. The matchType key has the value zip. The match type does not use any options.
Example
"matchRulesets": [
{
"name": "name - Zip",
"weight": 1.5,
"matchRules": [
{
"entityPropertyPath": "name",
"matchType": "zip",
"options": {}
}
]
}
],
- Reduce. The matchType key has the value exact. The ruleset's weight is treated as negative when the ruleset's reduce key has the value true. The match type does not use any options.
Example
"matchRulesets": [
{
"name": "name - Reduce",
"weight": 1.5,
"reduce": true,
"matchRules": [
{
"entityPropertyPath": "address",
"matchType": "exact",
"options": {}
},
{
"entityPropertyPath": "lastName",
"matchType": "exact",
"options": {}
}
]
}
],
- Custom. The matchType key has the value custom. The custom module is defined using algorithmModuleNamespace, algorithmModulePath, and algorithmFunction. The match type does not use any options.
Example
"matchRulesets": [
{
"name": "name - Custom",
"weight": 1.5,
"matchRules": [
{
"entityPropertyPath": "name",
"matchType": "custom",
"algorithmModuleNamespace": "",
"algorithmModulePath": "/custom-modules/matching/nameMatch.sjs",
"algorithmFunction": "nameMatch",
"options": {}
}
]
}
],
|
In the Matching step configuration file, the match rulesets are stored in scoring.
- Exact. Stored in scoring » add. The match type does not use any options.
Example
"scoring": {
"add": [
{
"propertyName": "lastName",
"weight": "3.5"
}
]
},
- Synonym. Stored in scoring » expand. The algorithmRef key has the value thesaurus. The match type uses the thesaurus and filter options.
Example
"scoring": {
"expand": [
{
"propertyName": "name",
"algorithmRef": "thesaurus",
"weight": "2.5",
"thesaurus": "/thesauri/name-synonyms.xml",
"filter": "<qualifier>english</qualifier>"
}
]
},
- Double Metaphone. Stored in scoring » expand. The algorithmRef key has the value double-metaphone. The match type uses the dictionary and distanceThreshold options.
Example
"scoring": {
"expand": [
{
"propertyName": "name",
"algorithmRef": "double-metaphone",
"weight": "2.5",
"dictionary": "/nameDictionary.json",
"distanceThreshold": "100"
}
]
},
- Zip. Stored in scoring » expand. The algorithmRef key has the value zip-match. The match type uses the zip array, whose entries contain the origin and weight properties.
Example
"scoring": {
"expand": [
{
"propertyName": "name",
"algorithmRef": "zip-match",
"zip": [
{"origin": "5", "weight": "1.5"},
{"origin": "9", "weight": "1"}
]
}
]
},
- Reduce. Stored in scoring » reduce. The algorithmRef key has the value standard-reduction. The property array contains the entity properties that must exactly match to reduce the score. The match type does not use any options.
Example
"scoring": {
"reduce": [
{
"allMatch": {
"property" : [ "address", "lastName" ]
}
},
{
"algorithmRef": "standard-reduction",
"weight": "3.5"
}
]
},
- Custom. Stored in scoring » expand. The algorithmRef key holds the name of the custom algorithm (custom-name-match in this example). The custom functions are defined in algorithms » algorithm.
Example
"algorithms": {
"algorithm": [
{
"name": "custom-name-match",
"function": "nameMatch",
"at": "/custom-modules/matching/nameMatch.sjs",
"namespace": ""
}
]
},
"scoring": {
"expand": [
{
"propertyName": "name",
"algorithmRef": "custom-name-match",
"weight": "2.5"
}
]
},
|
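The correspondence between the two match ruleset layouts above can be sketched as a small conversion function in plain JavaScript. This is an illustrative sketch, not a Data Hub tool: `scoringToMatchRulesets` and its `skipped` result are hypothetical, the ruleset names it generates are arbitrary, and only the field names that appear in the examples above are used. Zip and reduce entries are set aside for manual migration because their weights are structured differently in the two layouts.

```javascript
// Hypothetical helper: convert a "scoring" object (plus the optional
// "algorithms" object) into the "matchRulesets" layout.
function scoringToMatchRulesets(scoring = {}, algorithms = {}) {
  // Index custom algorithm definitions by name for algorithmRef lookups.
  const custom = new Map((algorithms.algorithm || []).map((a) => [a.name, a]));
  const matchRulesets = [];
  const skipped = [];

  // Build one ruleset containing a single rule; weights become numbers.
  const ruleset = (label, entry, rule) => ({
    name: `${entry.propertyName} - ${label}`,
    weight: Number(entry.weight),
    matchRules: [{ entityPropertyPath: entry.propertyName, ...rule }],
  });

  // scoring » add entries correspond to exact matches.
  for (const entry of scoring.add || []) {
    matchRulesets.push(ruleset("Exact", entry, { matchType: "exact", options: {} }));
  }

  for (const entry of scoring.expand || []) {
    if (entry.algorithmRef === "thesaurus") {
      matchRulesets.push(ruleset("Synonym", entry, {
        matchType: "synonym",
        options: { thesaurusURI: entry.thesaurus, filter: entry.filter },
      }));
    } else if (entry.algorithmRef === "double-metaphone") {
      matchRulesets.push(ruleset("Double Metaphone", entry, {
        matchType: "doubleMetaphone",
        options: {
          dictionaryURI: entry.dictionary,
          distanceThreshold: Number(entry.distanceThreshold),
        },
      }));
    } else if (custom.has(entry.algorithmRef)) {
      const algorithm = custom.get(entry.algorithmRef);
      matchRulesets.push(ruleset("Custom", entry, {
        matchType: "custom",
        algorithmModuleNamespace: algorithm.namespace,
        algorithmModulePath: algorithm.at,
        algorithmFunction: algorithm.function,
        options: {},
      }));
    } else {
      // For example zip-match: left for manual migration.
      skipped.push(entry);
    }
  }

  // scoring » reduce entries are also left for manual migration.
  skipped.push(...(scoring.reduce || []));
  return { matchRulesets, skipped };
}
```

Returning the skipped entries alongside the converted rulesets makes it easier to see what still needs attention by hand.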
In the Matching step configuration file, all actions, including custom actions, are stored in thresholds. Custom actions are defined using actionModuleNamespace, actionModulePath, and actionModuleFunction.
Example
"thresholds": [
{
"thresholdName": "similarThreshold",
"action": "notify",
"score": 6.5
},
{
"thresholdName": "household",
"action": "custom",
"score": 8.5,
"actionModulePath": "/custom-modules/matching/householdAction.xqy",
"actionModuleNamespace": "http://marklogic.com/smart-mastering/action",
"actionModuleFunction": "household-action"
}
],
|
In the Matching step configuration file, custom actions are stored in actions » action. The action type is stored in thresholds » threshold » action.
Example
"actions": {
"action": [
{
"name": "household-action",
"function": "household-action",
"namespace": "http://marklogic.com/smart-mastering/action",
"at": "/custom-modules/matching/custom-action.xqy"
}
]
},
"thresholds": {
"threshold": [
{
"above": "6.5",
"label": "similarThreshold",
"action": "notify"
},
{
"above": "8.5",
"label": "household",
"action": "household-action"
}
]
},
|
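The mapping between the two threshold layouts above can be sketched as follows. `convertThresholds` is a hypothetical helper written for illustration. It assumes that any threshold whose action matches a name in actions » action is a custom action, and that every other action name can be carried over unchanged.

```javascript
// Hypothetical helper: convert "thresholds" » "threshold" entries, together
// with the custom definitions in "actions" » "action", into a flat array.
function convertThresholds(thresholds = {}, actions = {}) {
  // Index custom action definitions by name.
  const customActions = new Map((actions.action || []).map((a) => [a.name, a]));

  return (thresholds.threshold || []).map((threshold) => {
    const converted = {
      thresholdName: threshold.label,
      action: threshold.action,
      score: Number(threshold.above), // string in one layout, number in the other
    };
    const custom = customActions.get(threshold.action);
    if (custom) {
      // Inline the custom action definition into the threshold itself.
      converted.action = "custom";
      converted.actionModulePath = custom.at;
      converted.actionModuleNamespace = custom.namespace;
      converted.actionModuleFunction = custom.function;
    }
    return converted;
  });
}
```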
Different custom function configuration for accessing values from the Matching step configuration file:
XQuery Example
declare function algorithm:match-via-tde-row(
$values as item()*,
$match-rule as object-node(),
$match-step as object-node()
) as cts:query*
{
let $property-name := $match-rule/entityPropertyPath
let $entity-type := $match-step/targetEntityType
let $property-column := sem:iri("http://marklogic.com/column/"
|| fn:replace($entity-type, "^.*/([^/]+/[^/]+)$", "$1") || "/" || $property-name)
return
cts:triple-query((), $property-column, $values)
};
JavaScript Example
function matchViaTdeRow(values, matchRule, matchStep)
{
  // Read the property path from the match rule and the entity type from the step.
  let propertyName = matchRule.entityPropertyPath;
  let entityType = matchStep.targetEntityType;
  let propertyColumn = sem.iri("http://marklogic.com/column/"
    + fn.replace(entityType, "^.*/([^/]+/[^/]+)$", "$1") + "/" + propertyName);
  return cts.tripleQuery(null, propertyColumn, values);
}
|
Different custom function configuration for accessing values from the Matching step configuration file:
XQuery Example
declare function algorithm:match-via-tde-row(
  $values as item()*,
  $expand-rule as element(match:expand),
  $match-configuration as element(match:options)
) as cts:query*
{
  let $property-name := $expand-rule/@property-name
  let $entity-type := $match-configuration/match:target-entity
  let $property-column := sem:iri("http://marklogic.com/column/"
    || fn:replace($entity-type, "^.*/([^/]+/[^/]+)$", "$1") || "/" || $property-name)
  return
    cts:triple-query((), $property-column, $values)
};
JavaScript Example
function matchViaTdeRow(values, expandRule, matchConfiguration)
{
let propertyName = expandRule.propertyName;
let entityType = matchConfiguration.targetEntity;
let propertyColumn = sem.iri("http://marklogic.com/column/"
+ fn.replace(entityType, "^.*/([^/]+/[^/]+)$", "$1") + "/" + propertyName)
return cts.tripleQuery(null, propertyColumn, values);
};
|
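The difference between the two JavaScript signatures above comes down to where the property name and entity type are read from. One way to picture it is a thin adapter that lets a function written against `expandRule.propertyName` and `matchConfiguration.targetEntity` be called with the `matchRule` and `matchStep` arguments. This is a sketch only: `adaptLegacyMatchFunction` is a hypothetical name, and it assumes both arguments arrive as plain JavaScript objects, which may not hold in every server-side context.

```javascript
// Hypothetical adapter: wrap a function that expects
// (values, expandRule, matchConfiguration) so it can be invoked as
// (values, matchRule, matchStep).
function adaptLegacyMatchFunction(legacyFn) {
  return function (values, matchRule, matchStep) {
    // Expose the property path under the older propertyName key.
    const expandRule = { ...matchRule, propertyName: matchRule.entityPropertyPath };
    // Expose the entity type under the older targetEntity key.
    const matchConfiguration = { ...matchStep, targetEntity: matchStep.targetEntityType };
    return legacyFn(values, expandRule, matchConfiguration);
  };
}
```

Rewriting the function body against the new keys is usually cleaner; an adapter like this is mainly useful while both formats are still in play.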
In the Merging step configuration file, the last updated date and time is stored in lastUpdatedLocation » documentXPath.
Example
{
"mergeStrategies": [],
"mergeRules": [],
"lastUpdatedLocation": {
"namespaces": {
"es": "http://marklogic.com/entity-services",
"sm": "http://marklogic.com/smart-mastering"
},
"documentXPath": "/es:envelope/es:headers/sm:sources/sm:source/sm:dateTime"
}
}
|
In the Merging step configuration file, the last updated date and time is stored in algorithms » stdAlgorithm » timestamp » path.
Example
{
"algorithms": {
"stdAlgorithm": {
"namespaces": {
"sm": "http://marklogic.com/smart-mastering",
"es": "http://marklogic.com/entity-services"
},
"timestamp": {
"path": "/es:envelope/es:headers/sm:sources/sm:source/sm:dateTime"
}
}
}
}
|
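The relocation of the timestamp path shown above can be sketched in a few lines of plain JavaScript. `toLastUpdatedLocation` is a hypothetical helper that uses only the keys from the two examples.

```javascript
// Hypothetical helper: build a lastUpdatedLocation object from the
// algorithms » stdAlgorithm section.
function toLastUpdatedLocation(algorithms = {}) {
  const std = algorithms.stdAlgorithm || {};
  if (!std.timestamp || !std.timestamp.path) {
    return undefined; // nothing to migrate
  }
  return {
    namespaces: { ...(std.namespaces || {}) },
    documentXPath: std.timestamp.path,
  };
}
```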
In the Merging step configuration file, merge rulesets are stored in mergeRules.
Example
{
"mergeRules": [
{
"entityPropertyPath": "name",
"maxSources": 1,
"priorityOrder": {
"lengthWeight": 2,
"sources": [
{
"sourceName": "favoriteSource",
"weight": 12
},
{
"sourceName": "lessFavoriteSource",
"weight": 10
}
]
}
}
]
}
|
In the Merging step configuration file, merge rulesets are stored in merging.
Example
{
"merging": [
{
"propertyName": "name",
"algorithmRef": "standard",
"length": {
"weight": "2"
},
"name": "myFavoriteSource",
"maxSources": 1,
"sourceWeights": [
{
"source": {
"name": "favoriteSource",
"weight": "12"
}
},
{
"source": {
"name": "lessFavoriteSource",
"weight": "10"
}
}
]
}
],
"propertyDefs": {
"properties": [
{
"localname": "name",
"name": "name"
}
]
}
}
|
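The two merge ruleset layouts above line up field by field, which the following sketch tries to make explicit. `convertStandardMergeRules` is a hypothetical helper. It handles only entries whose algorithmRef is standard, resolves each propertyName through propertyDefs » properties to its localname, and drops the entry's own name key because the mergeRules example has no counterpart for it.

```javascript
// Hypothetical helper: convert "merging" entries that use the standard
// algorithm into the "mergeRules" layout.
function convertStandardMergeRules(config) {
  // Map property names to local names via propertyDefs » properties.
  const properties = (config.propertyDefs || {}).properties || [];
  const localNames = new Map(properties.map((p) => [p.name, p.localname]));

  return (config.merging || [])
    .filter((entry) => entry.algorithmRef === "standard")
    .map((entry) => {
      const rule = {
        entityPropertyPath: localNames.get(entry.propertyName) || entry.propertyName,
      };
      if (entry.maxSources !== undefined) {
        rule.maxSources = Number(entry.maxSources);
      }
      const priorityOrder = {};
      if (entry.length && entry.length.weight !== undefined) {
        priorityOrder.lengthWeight = Number(entry.length.weight);
      }
      if (entry.sourceWeights) {
        priorityOrder.sources = entry.sourceWeights.map(({ source }) => ({
          sourceName: source.name,
          weight: Number(source.weight),
        }));
      }
      if (Object.keys(priorityOrder).length > 0) {
        rule.priorityOrder = priorityOrder;
      }
      return rule;
    });
}
```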
In the Merging step configuration file, custom merge functions are stored in mergeRules. The custom merge function is defined using entityPropertyPath, mergeModulePath, and mergeModuleFunction.
Example
{
"mergeRules": [
{
"entityPropertyPath": "addressLocalName",
"mergeModulePath": "/custom/merge/strategy.sjs",
"mergeModuleFunction": "mergeAddress",
"options": {}
}
]
}
|
In the Merging step configuration file, custom merge functions are stored in algorithms » custom. The custom merge function is defined using name, function, and at.
Example
{
"propertyDefs": {
"properties": [
{
"localname": "addressLocalName",
"name": "addressName"
}
]
},
"algorithms": {
"custom": [
{
"name": "addressAlgorithm",
"function": "mergeAddress",
"at": "/custom/merge/strategy.sjs"
}
]
},
"merging": [
{
"propertyName": "addressName",
"algorithmRef": "addressAlgorithm"
}
]
}
|
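For custom merge functions, the indirection through algorithms » custom and propertyDefs in one layout collapses into a single mergeRules entry in the other. A minimal sketch of that lookup, using only the keys from the examples above, might look like this (`convertCustomMergeRules` is a hypothetical name):

```javascript
// Hypothetical helper: convert "merging" entries that reference a custom
// algorithm into "mergeRules" entries with the module details inlined.
function convertCustomMergeRules(config) {
  const properties = (config.propertyDefs || {}).properties || [];
  const localNames = new Map(properties.map((p) => [p.name, p.localname]));
  const customAlgorithms = new Map(
    ((config.algorithms || {}).custom || []).map((a) => [a.name, a])
  );

  return (config.merging || [])
    .filter((entry) => customAlgorithms.has(entry.algorithmRef))
    .map((entry) => {
      const algorithm = customAlgorithms.get(entry.algorithmRef);
      return {
        // Resolve the property name to its local name where one is defined.
        entityPropertyPath: localNames.get(entry.propertyName) || entry.propertyName,
        mergeModulePath: algorithm.at,
        mergeModuleFunction: algorithm.function,
        options: {},
      };
    });
}
```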
Different custom function configuration for accessing values from the Merging step configuration file:
XQuery Example
declare function algorithm:custom-merge-limit(
  $property-name as xs:QName,
  $properties as map:map*,
  $merge-rule as object-node()
) as map:map*
{
  let $default-limit := if ($merge-rule/entityPropertyPath = "Phone") then 5 else 10
  (: Keep at most the configured number of values, starting from the first. :)
  return fn:subsequence(
    $properties,
    1,
    fn:head(($merge-rule/maxValues, $default-limit))
  )
};
JavaScript Example
function customMergeLimit(propertyName, properties, mergeRule)
{
  let defaultLimit = mergeRule.entityPropertyPath === "Phone" ? 5 : 10;
  // Keep at most the configured number of values, starting from the first.
  return fn.subsequence(
    properties,
    1,
    mergeRule.maxValues || defaultLimit
  );
}
|
Different custom function configuration for accessing values from the Merging step configuration file:
XQuery Example
declare function algorithm:custom-merge-limit(
  $property-name as xs:QName,
  $properties as map:map*,
  $merge-rule as element(merging:merge)
) as map:map*
{
  let $default-limit := if ($merge-rule/@property-name = "Phone") then 5 else 10
  (: Keep at most the configured number of values, starting from the first. :)
  return fn:subsequence(
    $properties,
    1,
    fn:head(($merge-rule/@max-values, $default-limit))
  )
};
JavaScript Example
function customMergeLimit(propertyName, properties, mergeRule)
{
  let defaultLimit = mergeRule.propertyName === "Phone" ? 5 : 10;
  // Keep at most the configured number of values, starting from the first.
  return fn.subsequence(
    properties,
    1,
    mergeRule.maxValues || defaultLimit
  );
}
|
To create a flow, run the Gradle task hubCreateFlow without -PwithInlineSteps. It creates multiple files: one for the flow configuration and one for each of the example step configurations.
Learn more: Create Flow Using Gradle
|
To create a flow, run the Gradle task hubCreateFlow with -PwithInlineSteps=true. It creates a single file containing the flow configuration with embedded example steps.
Learn more: Create Flow Using Gradle
|
To create a step and add it to a flow, run the Gradle tasks hubCreateStep and hubAddStepToFlow.
Learn more: Create Steps Using Gradle - HC Format
|
To create a step and add it to a flow, run the Gradle task hubCreateStepDefinition and manually copy the step configuration structure from the new step definition file to the appropriate location in the flow configuration structure.
|