Data Hub Glossary
custom hook
A custom module that performs additional processes in its own transaction before or after the core step transaction. Results are saved within each transaction. Learn more: About Interceptors and Custom Hooks
custom hook module
Custom code that can perform tasks outside the scope of Data Hub, either immediately before or immediately after a step's main processes. Learn more: About Custom Modules
custom step module
Custom code that can override default processes or perform additional tasks as the main component of a custom step in a flow. Learn more: About Custom Modules
entity
An abstraction of a logical business object that can be stored and manipulated by applications. For example, a sales model might include entities such as a customer, order, or inventory item. Learn more: Entities
entity instance
A concrete instance of an entity type: either a populated data structure that represents an individual entity, or a document containing such a data structure.
entity relationship
A logical relationship between entity types. For example, an order entity type might include relationships with the customer and inventory item entity types. In Entity Services, an entity relationship is expressed as an entity property whose type is an entity type (rather than a scalar or array type). Learn more: Relationship Type and Defining Entity Relationships.
Entity Services
An out-of-the-box API and a set of conventions you can use within MarkLogic to quickly set up an application based on entity modeling.
entity type
A custom data type that defines the characteristics of an entity instance, including its properties and relationships to other entities. Learn more: Entities
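The relationship between an entity type, its properties, and an entity instance can be illustrated with a minimal Python sketch. This is an analogy only, not the Entity Services descriptor format; the class and property names (Customer, InventoryItem, Order, and their fields) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical entity type with scalar properties only.
    customer_id: str
    name: str


@dataclass
class InventoryItem:
    # Hypothetical entity type with scalar properties only.
    sku: str
    description: str


@dataclass
class Order:
    # Scalar property.
    order_id: str
    # Relationships: each property's type is another entity type,
    # not a scalar or array type.
    customer: Customer
    item: InventoryItem


# An entity instance is a populated data structure of an entity type.
order = Order(
    order_id="o-1001",
    customer=Customer(customer_id="c-42", name="Ada Park"),
    item=InventoryItem(sku="sku-7", description="Desk lamp"),
)
```

Here Order plays the role of an entity type, order is an entity instance, and the customer and item properties stand in for entity relationships.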
flow configuration
A JSON structure that contains the settings for a specific flow. The structure includes references to other files that contain the step configurations. Learn more: About Flow and Step Configuration Structures
flow tracing
The process that logs information about flows as they run. Inputs to and outputs from every plugin of every flow are recorded in the JOBS database.
interceptor
A custom module that performs additional processes after the core step processes and before the results are saved. Learn more: About Interceptors and Custom Hooks
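The ordering described in this glossary (a custom hook before or after the core step transaction, and an interceptor after the core step processes but before the results are saved) can be sketched as follows. This is a simplified Python illustration, not Data Hub code: run_step and its parameters are hypothetical, and real transactions are not modeled, only the sequence of events.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Doc = Dict[str, object]


def run_step(
    docs: List[Doc],
    main: Callable[[Doc], Doc],
    interceptors: Sequence[Callable[[Doc], Doc]] = (),
    before_hook: Optional[Callable[[List[Doc]], None]] = None,
    after_hook: Optional[Callable[[List[Doc]], None]] = None,
) -> Tuple[List[Doc], List[str]]:
    """Run a hypothetical step and return (saved documents, event log)."""
    log: List[str] = []
    saved: List[Doc] = []

    if before_hook is not None:
        before_hook(docs)  # would run in its own transaction, before the core step
        log.append("before-hook")

    results = [main(doc) for doc in docs]  # core step processes
    log.append("main")

    for interceptor in interceptors:
        # Interceptors see the results after the core processes, before saving.
        results = [interceptor(result) for result in results]
        log.append("interceptor")

    saved.extend(results)  # results are saved only at this point
    log.append("save")

    if after_hook is not None:
        after_hook(results)  # would run in its own transaction, after the core step
        log.append("after-hook")

    return saved, log


saved, log = run_step(
    [{"id": 1}],
    main=lambda doc: {**doc, "mapped": True},
    interceptors=[lambda result: {**result, "checked": True}],
    before_hook=lambda docs: None,
    after_hook=lambda results: None,
)
```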
mapping
The Data Hub step that associates the fields in the entity model with the corresponding fields in your source data. Related term: step
mapping expression
A valid XPath expression that can include functions and can use values from one or more source fields to assign a calculated value to an entity property during the mapping process.
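The idea of computing an entity property from one or more source fields can be sketched in plain Python. This is an analogue rather than a real mapping: the expressions are Python functions instead of XPath, and the source document, field names, and property names are hypothetical. Comparable XPath expressions are noted in the comments.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document.
source = ET.fromstring(
    "<customer><first>Ada</first><last>Park</last><spend>120.50</spend></customer>"
)

# Hypothetical mapping: entity property name -> function of the source document.
mapping = {
    # Comparable XPath: concat(first, ' ', last)
    "fullName": lambda doc: f"{doc.findtext('first')} {doc.findtext('last')}",
    # Comparable XPath: number(spend)
    "totalSpend": lambda doc: float(doc.findtext("spend")),
}

# Evaluate every expression against the source to build the entity instance.
instance = {prop: expr(source) for prop, expr in mapping.items()}
```

Each entry uses values from the source fields to assign a calculated value to one entity property, which is the role a mapping expression plays during the mapping process.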
mastering
The Data Hub step that uses MarkLogic Smart Mastering technology to check for possible duplicates in your data and manage them according to specified criteria. The classic mastering step consists of matching (to determine whether two or more records refer to the same entity) and merging (to determine how to combine the records that refer to the same entity). If you have a very large dataset that could produce many merges, you can improve performance by mastering in two separate steps: a matching step and a merging step. Related term: step
matching
The first process in a Data Hub mastering step, which checks for possible duplicates in your data based on your criteria, defined as rules and thresholds. Related term: merging
merging
The second process in a Data Hub mastering step, which determines how matching records are combined. Related term: matching
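The two phases of mastering can be illustrated with a small Python sketch: a matching phase that scores record pairs against weighted rules and a threshold, and a merging phase that combines records judged to be the same entity. This is not the Smart Mastering algorithm; the rule weights, threshold, property names, and merge strategy are all assumptions made for illustration.

```python
from typing import Dict

Record = Dict[str, str]

# Hypothetical match rules: property name -> weight added when values agree.
RULES = {"email": 10, "last_name": 4, "postcode": 3}
# Hypothetical threshold: pairs scoring at or above this are treated as matches.
THRESHOLD = 12


def match_score(a: Record, b: Record) -> int:
    """Sum the weights of all rules whose property is non-empty and equal."""
    return sum(
        weight
        for prop, weight in RULES.items()
        if a.get(prop) and a.get(prop) == b.get(prop)
    )


def is_match(a: Record, b: Record) -> bool:
    """Matching: decide whether two records refer to the same entity."""
    return match_score(a, b) >= THRESHOLD


def merge(a: Record, b: Record) -> Record:
    """Merging: prefer non-empty values from a, falling back to b."""
    merged = dict(b)
    merged.update({key: value for key, value in a.items() if value})
    return merged


rec1 = {"email": "ada@example.com", "last_name": "Park", "postcode": "", "phone": "555-0100"}
rec2 = {"email": "ada@example.com", "last_name": "Park", "postcode": "90210", "phone": ""}
rec3 = {"email": "bo@example.com", "last_name": "Park", "postcode": "90210", "phone": ""}

merged = merge(rec1, rec2) if is_match(rec1, rec2) else None
```

Keeping is_match and merge as separate functions mirrors the option of running matching and merging as two separate steps.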
provenance and lineage
The automated process that ensures that the data can be traced back to its origin and that the source data is preserved. Related term: flow tracing
relationship type
A link to another entity type that can be used as the data type of an entity property.
step
Code that processes or enhances the data. A step can be an ingestion step, a mapping step, a matching step, a merging step, a mastering step, or a custom step. Learn more: About Steps
step configuration
A JSON structure that contains the settings for a specific step instance. The structure is in its own file, which the flow configuration refers to. Learn more: About Flow and Step Configuration Structures
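The arrangement in which a flow configuration refers to step configurations stored in their own files can be sketched as below. The file names and JSON keys (name, steps, stepFile, type) are hypothetical and do not reproduce the actual Data Hub schema; the sketch only shows how such references might be resolved.

```python
import json
import tempfile
from pathlib import Path


def load_flow(flow_path: Path) -> dict:
    """Read a flow file and replace each step reference with the step's settings."""
    flow = json.loads(flow_path.read_text())
    base = flow_path.parent
    steps = [
        json.loads((base / ref["stepFile"]).read_text())  # hypothetical key
        for ref in flow["steps"]
    ]
    return {"name": flow["name"], "steps": steps}


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # The step configuration lives in its own file.
    (root / "map-customers.step.json").write_text(
        json.dumps({"name": "map-customers", "type": "mapping"})
    )
    # The flow configuration only refers to the step file.
    (root / "customers.flow.json").write_text(
        json.dumps({"name": "customers", "steps": [{"stepFile": "map-customers.step.json"}]})
    )
    resolved = load_flow(root / "customers.flow.json")
```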
step definition
A JSON structure that serves as a template for a step configuration of a specific type (ingestion/loading, mapping, matching, merging, mastering, or custom). Step definitions are stored in files separate from the flow configuration. Learn more: About Steps
structured type
A custom structure that can be used as the data type of an entity property. A structured type is essentially an entity type nested within another entity type, and it can be reused in other entity types.
user artifact
Project files or records containing the configuration settings for Data Hub, including entity models, flows, step definitions, and steps. Related term: user data
user data
Documents that have been ingested or processed by flows and steps, as well as match summaries produced by the matching step. User data does not include user artifacts. Related term: user artifact