Data Hub Extensions to the REST Client API
This page provides the list of Data Hub REST Client APIs that extend the MarkLogic REST Client API.
Administration
Returns the version of the Data Hub installed in your MarkLogic Server instance.
Returns true if debugging is currently enabled for Data Hub in the MarkLogic Server instance; otherwise, false.
- rs:enable
- (Required) If true, enables debugging for Data Hub in the MarkLogic Server instance; otherwise, debugging is disabled. Default is false.
Flow Management
- rs:flow-name
- (Required) The name of the flow.
- rs:step
- (Required) The sequence number of the step that specifies the Source Collection or the Source Query.
- rs:database
- The name of the database to search. Default is the source database specified in the step.
- rs:options
- A JSON object containing additional options.
Runs a step within the flow to process the specified records.
- rs:job-id
- A unique job ID to associate with the flow run. This option can be used if the flow run is part of a larger process (e.g., a process orchestrated by NiFi with its own job/process ID). Must not be the same as an existing Data Hub job ID. If not provided, a unique Data Hub job ID will be assigned.
- rs:flow-name
- The name of the flow.
- rs:step
- The sequence number of the step to execute. To run multiple specific steps, use your orchestration tool to send one mlRunFlow request for each step.
- rs:options
- A JSON object containing additional options to pass to the flow.
- To specify the list of records to process, add the key uris whose value is an array of the URIs of the records to process.
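The parameters above can be assembled as in the following minimal sketch. It assumes the extension is named mlRunFlow (the name used in the rs:step description) and is reached through the generic MarkLogic /v1/resources/{name} path; the flow name and record URI are hypothetical. It builds only the query string and does not send a request.

```python
import json
from urllib.parse import urlencode

# Assumption: the extension is named mlRunFlow and is served under the
# generic MarkLogic path for resource extensions.
RUN_FLOW_PATH = "/v1/resources/mlRunFlow"


def run_flow_params(flow_name, step, uris, job_id=None):
    """Build the rs:* parameters for running one step on specific records."""
    params = [("rs:flow-name", flow_name), ("rs:step", str(step))]
    if job_id is not None:
        # Only sent when an outer process supplies its own job ID.
        params.append(("rs:job-id", job_id))
    # The records to process go under the "uris" key of the options object.
    params.append(("rs:options", json.dumps({"uris": list(uris)})))
    return params


if __name__ == "__main__":
    # Hypothetical flow name and record URI, for illustration only.
    query = urlencode(run_flow_params("CustomerFlow", 2, ["/customers/1.json"]))
    print(RUN_FLOW_PATH + "?" + query)
```

To run several specific steps, an orchestration tool would call run_flow_params once per step and send one request for each result.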
Record Management
MarkLogic Data Hub provides a REST Client API extension that lets you match, merge, and unmerge records programmatically without running a flow.
Compares the specified record with other records and returns the list of possible matches.
- rs:uri
- (Required) The URI of the record to compare with other records.
- rs:flowName
- (Required) The name of a flow that includes a mastering step.
- rs:step
- The step number of the mastering step in the specified flow. This task uses the settings in the mastering step. Default is 1, which assumes that the first step in the flow is a mastering step.
- rs:includeMatchDetails
- If true, additional information about each positive match is provided. Default is false.
- rs:start
- The index of the first match to return. Default is 1.
- rs:pageLength
- The number of matches to return. Default is 20.
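As an illustration, the match parameters and their documented defaults could be collected as follows. This is a sketch only: the helper name is hypothetical, and sending booleans as the lowercase strings "true" and "false" is an assumption based on how the values are written on this page.

```python
from urllib.parse import urlencode


def match_params(uri, flow_name, step=1, include_match_details=False,
                 start=1, page_length=20):
    """Build the rs:* parameters for a match request with the documented defaults."""
    return [
        ("rs:uri", uri),
        ("rs:flowName", flow_name),
        ("rs:step", str(step)),
        # Assumption: booleans are sent as lowercase "true" / "false".
        ("rs:includeMatchDetails", "true" if include_match_details else "false"),
        ("rs:start", str(start)),
        ("rs:pageLength", str(page_length)),
    ]


if __name__ == "__main__":
    # Hypothetical record URI and flow name, for illustration only.
    print(urlencode(match_params("/customers/1.json", "CustomerFlow")))
```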
Merges the specified records according to the settings of the specified mastering step.
- rs:uri
- (Required) The URI of one of the records to merge. You must specify at least two URIs.
- rs:flowName
- (Required) The name of a flow that includes a mastering step.
- rs:step
- The step number of the mastering step in the specified flow. This task uses the settings in the mastering step. Default is 1, which assumes that the first step in the flow is a mastering step.
- rs:preview
- If true, no changes are made to the database and a simulated merged record is returned; otherwise, the merged record is saved to the database. Default is false.
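Because a merge needs at least two record URIs, rs:uri has to appear more than once in the request. The sketch below shows one way to repeat the parameter and to check the minimum count before sending anything. The helper name and the sample values are hypothetical, and the lowercase boolean encoding is an assumption.

```python
from urllib.parse import urlencode


def merge_params(uris, flow_name, step=1, preview=False):
    """Build the rs:* parameters for a merge request, repeating rs:uri per record."""
    uris = list(uris)
    if len(uris) < 2:
        raise ValueError("at least two record URIs are required to merge")
    # One rs:uri entry per record; urlencode emits each pair separately.
    params = [("rs:uri", u) for u in uris]
    params.append(("rs:flowName", flow_name))
    params.append(("rs:step", str(step)))
    # Assumption: booleans are sent as lowercase "true" / "false".
    params.append(("rs:preview", "true" if preview else "false"))
    return params


if __name__ == "__main__":
    # preview=True asks for a simulated merge, so nothing is written.
    print(urlencode(merge_params(["/a.json", "/b.json"], "CustomerFlow", preview=True)))
```

Starting with preview set to true is a cautious way to inspect the simulated merged record before committing the merge.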
Reverses the set of merges that created the specified merged record.
- rs:mergeURI
- (Required) The URI of the record to unmerge.
- rs:retainAuditTrail
- If true, the merged record will be moved to an archive collection; otherwise, it will be deleted. Default is true.
- rs:blockFutureMerges
- If true, the component records will be blocked from being merged together again. Default is true.
Returns the list of notifications about matches that came close to, but did not exceed, the merging threshold.
- rs:start
- The index of the first notification to return. Default is 1.
- rs:pageLength
- The number of notifications to return. Default is 10.
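Paging through notifications comes down to computing rs:start for each page. The following sketch assumes rs:start is a 1-based item index (consistent with its default of 1); the helper name is hypothetical.

```python
def notification_page_params(page, page_length=10):
    """Map a 1-based page number to rs:start and rs:pageLength."""
    if page < 1 or page_length < 1:
        raise ValueError("page and page_length must be positive")
    # Assumption: rs:start is a 1-based index of the first item on the page.
    start = (page - 1) * page_length + 1
    return [("rs:start", str(start)), ("rs:pageLength", str(page_length))]


if __name__ == "__main__":
    for page in (1, 2, 3):
        print(notification_page_params(page))
```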
Returns the document-level history of the specified merged record.
- rs:uri
- (Required) The URI of a merged record.
Returns the history of the specified property or all properties of a merged record.
- rs:uri
- (Required) The URI of a merged record.
- rs:property
- The name of the specific property. Default is all properties.
Property history is available only if provenance granularity is set to fine: "provenanceGranularityLevel" : "fine". See Set Provenance Granularity Manually.
Job Management
Returns job information based on the specified parameters.
- rs:job-id
- A unique job ID. Used to return the job document associated with the specified job ID. You can specify either job-id or status, but not both.
- rs:status
- The status of the job: started, finished, finished_with_errors, running, failed, stop-on-error, or canceled. Used to return the list of job documents associated with all jobs with the specified status. You can specify either job-id or status, but not both.
- rs:flowNames
- The name of the flow. Used to return the job ID and job information of the latest run that includes the specified flow name. To specify additional flow names, repeat the parameter.
- rs:flow-name
- The name of the flow. Used to return the list of job documents associated with all the runs that include the specified flow name.
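The constraints on the job query parameters (job ID and status are mutually exclusive, status must be one of the listed values, and rs:flowNames may be repeated) can be checked on the client before a request is sent. The sketch below is one possible approach with a hypothetical helper name; it does not reflect how the server itself validates input.

```python
# Status values as listed in the rs:status description.
JOB_STATUSES = {
    "started", "finished", "finished_with_errors", "running",
    "failed", "stop-on-error", "canceled",
}


def jobs_params(job_id=None, status=None, flow_names=()):
    """Build the rs:* parameters for a job query, enforcing the documented rules."""
    if job_id is not None and status is not None:
        raise ValueError("specify either job_id or status, not both")
    if status is not None and status not in JOB_STATUSES:
        raise ValueError("unknown job status: " + repr(status))
    params = []
    if job_id is not None:
        params.append(("rs:job-id", job_id))
    if status is not None:
        params.append(("rs:status", status))
    # rs:flowNames is repeated once per flow name.
    params.extend(("rs:flowNames", name) for name in flow_names)
    return params


if __name__ == "__main__":
    print(jobs_params(status="failed"))
    print(jobs_params(flow_names=["CustomerFlow", "OrderFlow"]))
```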
Returns the batch documents for the specified step or batch within the specified job.
- rs:jobid
- (Required) The unique job ID of the flow run whose batch documents to return.
- rs:step
- (Required) The sequence number of the step whose batch documents to return. You must specify either step or batchId, but not both.
- rs:batchid
- (Required) The ID of the batch whose documents to return. You must specify either step or batchId, but not both.
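Since exactly one of the step and the batch ID must accompany the job ID, a client-side check can catch an invalid combination early. This is a minimal sketch with a hypothetical helper name and sample values.

```python
def batches_params(job_id, step=None, batch_id=None):
    """Build the rs:* parameters for a batch query: job ID plus step or batch ID."""
    # Exactly one of step and batch_id must be given.
    if (step is None) == (batch_id is None):
        raise ValueError("specify exactly one of step or batch_id")
    params = [("rs:jobid", job_id)]
    if step is not None:
        params.append(("rs:step", str(step)))
    else:
        params.append(("rs:batchid", batch_id))
    return params


if __name__ == "__main__":
    # Hypothetical job and batch IDs, for illustration only.
    print(batches_params("job-123", step=2))
    print(batches_params("job-123", batch_id="batch-9"))
```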