Entities

Overview

Entities are the high-level business objects in your enterprise, such as employee, product, purchase order, or department.

In MarkLogic Data Hub, you can use MarkLogic's Entity Services to create models of your business entities and to generate code scaffolding, database configurations, index settings, and validations based on those models. Entity Services handles the model definition and entity instance envelope documents through API calls. If you use your own abstract entities, you must provide this framework.

Note: MarkLogic strongly recommends that you use Entity Services unless you have specific needs that it cannot address.

MarkLogic Data Hub groups all the data about an entity into one consolidated record (with context and history), providing a 360° view of the data across silos.

Note: An entity model is not required to ingest your raw data. However, it is required to configure your mapping.
You can create an entity:

Complex Entities

You can define your entity model to include another entity or to link to another.

  • A local entity reference refers to an entity that is nested within the host entity as the value of a property. Use a nested entity if the values of its properties will change for every host entity. For example, an entity FullName can be nested in the entity Employee, because each employee would have a different full name.
  • An external entity reference refers to an entity that is stored separately, and the host entity contains only a link to it. Link to an entity if the values of its properties must remain the same for all host entities. For example, an entity Product can be linked in the entity Order, because the SKU and product description will be the same for all customer orders.
  • When mapping a nested entity (local entity reference), the type of the source data element must match one of the following:
    • an object, if Cardinality is set to 1..1, or
    • an array of objects, if Cardinality is set to 1..∞.
  • When mapping a linked entity (external entity reference), the type of the source data element must be a string which holds the URI of the linked entity.
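
The type rules for mapped source data can be sketched as a small check. This is an illustrative Python sketch, not part of Data Hub. The function name source_matches_reference and the cardinality labels "1..1" and "1..n" (standing in for 1..∞) are assumptions made for this example.

```python
def source_matches_reference(value, reference_kind, cardinality="1..1"):
    """Return True if a parsed JSON source value has the type the mapping expects.

    reference_kind: "local" (nested entity) or "external" (linked entity).
    cardinality: "1..1" or "1..n" (hypothetical label standing in for 1..∞).
    """
    if reference_kind == "external":
        # A linked entity is mapped from a string holding the entity's URI.
        return isinstance(value, str)
    if reference_kind == "local":
        if cardinality == "1..1":
            # A single nested entity is mapped from one object.
            return isinstance(value, dict)
        if cardinality == "1..n":
            # Multiple nested entities are mapped from an array of objects.
            return isinstance(value, list) and all(isinstance(v, dict) for v in value)
    raise ValueError(f"unsupported combination: {reference_kind!r}, {cardinality!r}")
```

Running a check like this on a sample source document before configuring the mapping can surface a mismatch early, for example a single object supplied where an array of objects is expected.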

Each nested entity object includes the name of the entity model as well as its properties. In the following example, the entity Person contains the properties name and address. The property name holds a nested entity based on the entity model FullName.

  "instance": {
    ...,
    "Person": {
      "name": {
        "FullName": {
          "title": "Mr.",
          "first": "John",
          "middle": "Doe",
          "last": "Smith"
        }
      },
      "address": "..."
    }
  }
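
To illustrate how the model name wraps the nested properties, the following Python sketch reads the nested FullName values out of an instance shaped like the example above. The elided parts of the example (the "..." entries) are left out here, and the helper name nested_entity is hypothetical.

```python
import json

# Instance fragment modeled on the example; elided parts are omitted.
instance_json = """
{
  "instance": {
    "Person": {
      "name": {
        "FullName": {
          "title": "Mr.",
          "first": "John",
          "middle": "Doe",
          "last": "Smith"
        }
      }
    }
  }
}
"""

def nested_entity(doc, host, prop, model):
    """Return the properties of the nested entity `model` stored under host.prop."""
    return doc["instance"][host][prop][model]

doc = json.loads(instance_json)
full_name = nested_entity(doc, "Person", "name", "FullName")
print(full_name["first"], full_name["last"])
```

Note the extra level of nesting: the property name does not hold the name fields directly, but an object keyed by the entity model name FullName.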

Entity Model Validation

Only valid entity models are loaded into the MarkLogic Server instance. A valid entity model meets the following requirements:

  • info/baseUri in the model must have a value that is a valid URI.
  • The model must pass the Entity Services validation function es:model-validate.
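
As a rough pre-check before loading a model, the first requirement can be approximated locally. The Python sketch below is not what Data Hub runs. It assumes that "valid URI" means an absolute URI with a scheme and a host, which may be stricter or looser than the actual check. The second requirement (es:model-validate) runs inside MarkLogic and is not reproduced here. The sample model and the function name are hypothetical.

```python
from urllib.parse import urlparse

def has_valid_base_uri(model):
    """Approximate the info/baseUri requirement.

    Assumption: "valid" means an absolute URI with a scheme and a host.
    """
    base_uri = model.get("info", {}).get("baseUri")
    if not isinstance(base_uri, str) or not base_uri.strip():
        return False
    try:
        parsed = urlparse(base_uri)
    except ValueError:
        # urlparse rejects some malformed inputs, such as unbalanced brackets.
        return False
    return bool(parsed.scheme) and bool(parsed.netloc)

# Hypothetical minimal model descriptor, for illustration only.
model = {
    "info": {"title": "Person", "version": "0.0.1", "baseUri": "http://example.org/"},
    "definitions": {"Person": {"properties": {"address": {"datatype": "string"}}}},
}

print(has_valid_base_uri(model))
```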

If the entity model is valid and loaded, Data Hub creates the following in the STAGING and FINAL databases:

  • schema files:
    • /entities/YourEntityName.entity.json. The validated entity model returned by es:model-validate.
    • /entities/YourEntityName.entity.schema.json. The entity model as a JSON schema for use in XDMP validation (xdmp.jsonValidate).
    • /entities/YourEntityName.entity.xsd. The entity model as an XSD schema for use in XDMP validation (xdmp.validate).
  • a TDE template (/tde/YourEntityName-YourEntityVersion.tdex) with the following permissions:
    • read permissions for the data-hub-operator
    • update permissions for the data-hub-developer
    • the default permissions of the user who loaded the entity model into MarkLogic Server
    Important: In MarkLogic 10 and later versions, a user who is assigned only the admin role must be explicitly granted read access to view the TDE template.

    If a TDE template with the same URI already exists but is not in the ml-data-hub-tde collection, a new TDE template will not be created.
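
The artifact URIs listed above follow a fixed pattern based on the entity name and version. The following Python sketch builds that pattern, which can help when checking which documents to expect after a model is loaded. The function name and the sample entity name and version are hypothetical.

```python
def expected_artifact_uris(entity_name, entity_version):
    """Build the schema and TDE URIs described above for one entity model."""
    return {
        "model": f"/entities/{entity_name}.entity.json",
        "json_schema": f"/entities/{entity_name}.entity.schema.json",
        "xsd": f"/entities/{entity_name}.entity.xsd",
        "tde": f"/tde/{entity_name}-{entity_version}.tdex",
    }

# Sample values; substitute your own entity name and version.
for kind, uri in expected_artifact_uris("Person", "0.0.1").items():
    print(f"{kind}: {uri}")
```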

Additional Artifacts

Additional artifacts associated with an entity model can be generated:

  • Database property files
  • PII protected paths and query rolesets
  • Search option files

Using QuickStart: When an entity model is saved in QuickStart, these artifacts are automatically generated and stored in the local project structure. If you click Yes to update the indexes after the entity model is saved, these artifacts are also deployed to the MarkLogic Server.

Using Gradle: You can also generate and deploy these artifacts using Gradle tasks.

Artifacts and Gradle Tasks

  • Database property files that include the indexes you selected for the properties of all entity models:
    • /your-project-root/src/main/entity-config/databases/staging-database.json
    • /your-project-root/src/main/entity-config/databases/final-database.json
  • Protected paths and query rolesets for properties marked as Personally Identifiable Information (PII). See Enable PII Manually for complete instructions.
  • Search option files:
    • /your-project-root/src/main/entity-config/staging-entity-options.xml
    • /your-project-root/src/main/entity-config/final-entity-options.xml
    To generate and deploy: mlLoadModules