Set Provenance Granularity Manually

Data Hub provides three levels of granularity for provenance information: coarse (default), fine, and off.

Provenance tracking applies only to mapping, mastering, custom-mapping, and custom-mastering steps.

"provenanceGranularityLevel" : "coarse"

    Document-level provenance information is always provided by default.

    The set of document-level provenance information to track is not customizable.

    In a mastering step, the merged document includes provenance information from all its source records, in addition to provenance information about the mastering step run.

"provenanceGranularityLevel" : "fine"

    Both document-level and property-level provenance information are provided.

    The set of property-level provenance information to track can be customized for custom steps only. A default non-customizable set is used for mapping and mastering steps.

    In a mapping step, provenance information includes a field's XPath in its original document and its associated property in the entity model instance.

    In a mastering step, provenance information includes paths to the original records that provided the resulting value(s) of a property in the merged document.

    In a mastering step, the merged document includes provenance information from all its source records, but not provenance information about the mastering step run.

"provenanceGranularityLevel" : "off"

    Provenance information for the current flow or step is not stored. However, previously collected provenance information is retained; database administrator permissions are required to delete existing provenance information.

    CAUTION: Do not turn off provenance unless you are certain the project will never make use of provenance information.

Note: All provenance documents are stored in the data-hub-JOBS database and are added to the protected collection http://marklogic.com/provenance-services/record.

About this task

This task adds the provenance granularity setting to the flow definition file. If you want the default document-level provenance information only, no action is required.

Procedure

  1. Edit the flow definition file.
  2. Locate the mapping, mastering, or custom step for which you want more granular provenance tracking.

    You can also add the setting once to the flow settings to apply it to all mapping, mastering, and custom steps in the flow. If a step's settings also specify a value, that value overrides the flow-level value for that step.

  3. Under the options node, set provenanceGranularityLevel to coarse, fine, or off.
    Example: "provenanceGranularityLevel" : "fine"
  4. Save.
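
The procedure above might produce a flow definition file along the following lines. This is a minimal sketch, not a complete flow file: the flow name, step names, and step definition values are hypothetical, and the overall layout (a flow-level options node plus an options node under each numbered step) is an assumption about the flow file structure, so compare it against your own flow definition file before copying anything. Only the provenanceGranularityLevel setting and its three values come from this topic.

```json
{
  "name": "ExampleFlow",
  "options": {
    "provenanceGranularityLevel": "coarse"
  },
  "steps": {
    "1": {
      "name": "example-mapping-step",
      "stepDefinitionName": "entity-services-mapping",
      "stepDefinitionType": "MAPPING",
      "options": {
        "provenanceGranularityLevel": "fine"
      }
    },
    "2": {
      "name": "example-mastering-step",
      "stepDefinitionName": "default-mastering",
      "stepDefinitionType": "MASTERING",
      "options": {}
    }
  }
}
```

In this sketch, step 1 tracks both document-level and property-level provenance because its own setting overrides the flow setting. Step 2 specifies no value, so it uses the flow-level value, coarse.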

What to do next

If the step is a custom step, you can also specify the set of property-level provenance information that is tracked. See Provenance in a Custom Step.