MarkLogic Data Hub 5.1 - Release Notes
Data Hub 5.1.0
Data Hub 5.1.0 includes the following new features and changes:
Mapping
Mapping with XPath Expressions
In the mapping step, entity properties can be assigned values derived from XPath expressions, which can include predefined or custom functions.
This feature is supported only with MarkLogic Server 9.0-11 or 10.0-2 up to the latest 10.x release.
See About Mapping, Data Hub Mapping Functions, Create Custom Mapping Functions.
Mapping Nested Entities in QuickStart
QuickStart now allows mapping to nested entities.
This feature is supported only with MarkLogic Server 9.0-11 or 10.0-2 up to the latest 10.x release.
See Complex Entities, Configure a Mapping Step Using QuickStart.
Validation of Mapped Entity Instance
In the mapping step, you can validate the resulting mapped entity instance against the schema document based on the entity model.
Mastering
Split Mastering: Matching Step and Merging Step
You can run your mastering process in two separate steps to improve performance: the matching step and the merging step. The split reduces the likelihood that a record is locked when another process needs access to it.
The classic combined mastering (matching and merging in a single step) can still be used for small datasets, but the thread count must be set to 1 to avoid locking issues. In most cases, split mastering with multiple threads is ideal.
See About Mastering - Combined-Step versus Split-Step Mastering.
Manual Merging and Unmerging
New Gradle tasks:
- hubMergeEntities
- hubUnmergeEntities
Additional Provenance Information for Mastering
Additional provenance information is stored for mastering to increase transparency in the process.
New REST APIs for Mastering
- mlSmMatch (POST)
- mlSmMerge (POST)
- mlSmMerge (DELETE)
- mlSmNotifications (GET)
- mlSmHistoryDocument (GET)
- mlSmHistoryProperties (GET)
See Data Hub Extensions to the REST Client API - Record Management.
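As a rough illustration of how these extensions might be addressed, the sketch below builds request URLs using the MarkLogic REST Client API convention of exposing resource service extensions under /v1/resources/{name}, with extension parameters carrying the rs: prefix. The host, port, and the uri parameter name are placeholders chosen for illustration, not values taken from these release notes; consult the reference above for the parameters each endpoint actually accepts.

```python
from urllib.parse import urlencode

# HTTP method for each mastering extension listed above.
MASTERING_ENDPOINTS = [
    ("mlSmMatch", "POST"),
    ("mlSmMerge", "POST"),
    ("mlSmMerge", "DELETE"),
    ("mlSmNotifications", "GET"),
    ("mlSmHistoryDocument", "GET"),
    ("mlSmHistoryProperties", "GET"),
]


def resource_url(name, params=None, host="localhost", port=8011):
    """Build the URL for a REST resource service extension.

    host and port are placeholders (assumptions). Parameters are sent
    with the rs: prefix used for extension parameters.
    """
    url = f"http://{host}:{port}/v1/resources/{name}"
    if params:
        prefixed = {f"rs:{key}": value for key, value in params.items()}
        url += "?" + urlencode(prefixed)
    return url


if __name__ == "__main__":
    for name, method in MASTERING_ENDPOINTS:
        print(method, resource_url(name))
    # "uri" is a hypothetical parameter name used only for illustration.
    print(resource_url("mlSmHistoryDocument", {"uri": "/customers/1.json"}))
```

The sketch only constructs URLs; sending the request, authentication, and the request body are left to whichever HTTP client you use.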
Custom Steps
Expanded Custom Step Types in QuickStart
In QuickStart, you can set the Custom Step Type of a custom step to Ingestion, Mapping, Mastering, or Other. The type determines which step template is provided, so that the template more closely matches the default step that the custom step would replace. This functionality was available programmatically in previous releases and is now also available in QuickStart.
New Custom Step Templates
New templates are generated by the Gradle task hubCreateStepDefinition for each of the step types (ingestion, mapping, mastering, and custom). Each template includes extensive comments to help you customize it for your needs. You can find the templates in the folder your-project-root/step-definitions.
Other Changes
API Prefix Changed from ml: to ml
The API prefix ml: is now prepended to the API name as ml (without the colon). Example: ml:runFlow is now mlRunFlow.
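If you have scripts that still reference the old names, a small helper along the following lines could translate them. This is a minimal sketch that generalizes from the single ml:runFlow example above: it assumes the part after the prefix is already camelCase and only needs its first letter capitalized. The function name rename_api is hypothetical.

```python
def rename_api(old_name):
    """Convert an old 'ml:'-prefixed API name to the new 'ml' form.

    Assumption: the remainder of the name is camelCase, as in the
    ml:runFlow example; names that lack the prefix are returned unchanged.
    """
    prefix = "ml:"
    if not old_name.startswith(prefix):
        return old_name
    rest = old_name[len(prefix):]
    # Capitalize only the first character so the existing camelCase survives.
    return "ml" + rest[:1].upper() + rest[1:]


if __name__ == "__main__":
    print(rename_api("ml:runFlow"))  # mlRunFlow
```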