Modeling with Hub Central

Overview

A typical Data Hub data flow involves the following operations:
  1. Load (ingest) your raw data into MarkLogic Server.
  2. Create an entity model to standardize your data fields.
  3. Map the fields in your raw data to the fields of the entity model.
  4. (Optional) Match and merge duplicates.
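
The four operations above can be sketched conceptually in plain Python. This is only an illustration of the idea, not Data Hub code: the record layouts, the `Customer` class, the `FIELD_MAPS` table, and the helper functions are all hypothetical names invented for this example, and the merge rule (keep the first record per ID) is a deliberately simple stand-in for real match-and-merge logic.

```python
from dataclasses import dataclass

# Step 1 (load): hypothetical raw records from two sources that use
# different field names for the same information.
RAW_RECORDS = [
    {"source": "crm", "cust_id": "42", "full_name": "Ada Lovelace"},
    {"source": "billing", "CustomerID": "42", "Name": "Ada Lovelace"},
    {"source": "billing", "CustomerID": "7", "Name": "Alan Turing"},
]


# Step 2 (model): a hypothetical entity type with standardized fields.
@dataclass(frozen=True)
class Customer:
    customer_id: int
    name: str


# Step 3 (map): for each source, which raw field feeds which entity property.
FIELD_MAPS = {
    "crm": {"customer_id": "cust_id", "name": "full_name"},
    "billing": {"customer_id": "CustomerID", "name": "Name"},
}


def map_record(raw):
    """Convert one raw record into a Customer using its source's field map."""
    field_map = FIELD_MAPS[raw["source"]]
    return Customer(
        customer_id=int(raw[field_map["customer_id"]]),
        name=raw[field_map["name"]],
    )


# Step 4 (optional match and merge): naive rule that keeps the first
# record seen for each customer_id.
def merge_duplicates(customers):
    merged = {}
    for customer in customers:
        merged.setdefault(customer.customer_id, customer)
    return list(merged.values())


curated = merge_duplicates(map_record(raw) for raw in RAW_RECORDS)
```

After mapping, both sources yield the same `Customer` shape, which is what allows the duplicate record for customer 42 to be detected and merged.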

The entity model is a core component of data integration in MarkLogic Data Hub. It defines the standard structures (entity types) that are populated with values from your raw data, so that your data can be accessed uniformly regardless of the format and structure of the source.

Note: You must create one or more entity types before you can define the mappings between your source data and the curated data.

An entity type consists of entity properties, each of which can be one of the following types:

  • A basic data type, including integer, string, dateTime, boolean, and other less common string, number, and date types (under More string types, More number types, and More date types).
  • A structured type, which consists of its own properties; those properties can themselves be of other structured types. Use a structured type if the values of its properties differ for every entity instance. For example, the property FullName of the entity Employee could be a structured type, because each employee has a different full name.

    The depth of nested structured types is not limited.

  • A relationship type, which links to an entity of the selected type. Use a relationship type if the properties of the target entity must be the same for all entities that point to it. For example, the entity Order could point to the entity Product as a relationship type, because the SKU and product description will be the same for all customer orders.
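
The difference between the three kinds of property can be sketched with Python dataclasses. This is a conceptual analogy only, not the format Hub Central uses to store entity models; the class and field names are assumptions that follow the Employee/FullName and Order/Product examples above.

```python
from dataclasses import dataclass


@dataclass
class FullName:
    """Structured type: its values belong to one entity instance only."""
    first: str
    last: str


@dataclass
class Product:
    """A separate entity type that other entities can point to."""
    sku: str
    description: str


@dataclass
class Employee:
    employee_id: int      # basic data type (integer)
    full_name: FullName   # structured type, embedded in each employee


@dataclass
class Order:
    order_id: int         # basic data type (integer)
    product: Product      # relationship type, shared by many orders


# Each employee carries its own FullName value.
employees = [
    Employee(1, FullName("Ada", "Lovelace")),
    Employee(2, FullName("Alan", "Turing")),
]

# Both orders point to the same Product entity, so the SKU and
# description are identical for every order of that product.
widget = Product(sku="W-1", description="Widget")
orders = [Order(100, widget), Order(101, widget)]
```

In this sketch, changing `widget.description` is visible from every order that points to it, whereas changing one employee's `full_name` affects only that employee.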

Hub Central - expanded Customer entity type with structured types all the way down

Learn more: Entities

Security

You must be assigned the following security role, or any role that inherits it:

  • To view, create, edit, or delete an entity model: Hub Central Modeler

See Users and Roles.

Modeling Process

To integrate your data:

  1. Create an entity type. See Create an Entity Type.
  2. Use the entity type to curate your data.

Managing Entity Types


Hub Central - Entity Type list

To edit an entity type, see Edit Entity Type.

To manage the properties of an entity type, see Manage Entity Properties.

To save changes:

  1. Go to the Model area of Hub Central.
    1. Go to your Hub Central endpoint.
    2. In the icon bar, click the Model icon.
      Hub Central - icon bar - Model

  2. Do one of the following:
    • To save your changes to a single entity type, click the Save icon for that entity type.
    • To save all your changes to all entity types, click the Save All button.

To undo changes and revert to the last saved version:

  1. Go to the Model area of Hub Central.
    1. Go to your Hub Central endpoint.
    2. In the icon bar, click the Model icon.
      Hub Central - icon bar - Model

  2. Do one of the following:
    • To undo your changes and revert to the last saved version of a single entity type, click the Revert icon for that entity type.
    • To undo all your changes and revert to the last saved versions of all entity types, click the Revert All button.

To delete an entity type:

  1. Go to the Model area of Hub Central.
    1. Go to your Hub Central endpoint.
    2. In the icon bar, click the Model icon.
      Hub Central - icon bar - Model

  2. In the list, click the Delete icon for the entity type to delete.

Important:

If an entity type has already been used to map or match and merge data, then modifying or deleting it might trigger a reindexing of all your curated data and might affect the results of mapping and mastering processes that are occurring concurrently.

MarkLogic recommends that you modify or delete entity types that are already in use only at a scheduled time when the impact would be minimal.