Entities
Overview
Entities are the high-level business objects in your enterprise, such as employee, product, purchase order, or department.
In MarkLogic Data Hub, you can use MarkLogic's Entity Services to create models of your business entities and to generate code scaffolding, database configurations, index settings, and validations based on those models. Entity Services handles the model definition and entity instance documents through API calls. If you use your own abstract entities, you must provide this framework.
An entity model consists of entity types.
- An entity type is a data type to which each record of your data is standardized, so that data from all sources can be viewed and used uniformly. An entity type consists of entity properties.
- An entity property is a data field in a record of your data.
- An entity instance is a copy of the entity type structure, which is added to your data record during mapping and then populated with values from the raw data.
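The relationship between model, type, property, and instance can be sketched in plain Python. This is only an illustration: the descriptor layout (info, definitions, properties, datatype, primaryKey) is assumed to follow the Entity Services model format, the values are made up, and empty_instance is a hypothetical helper, not a Data Hub function.

```python
# Illustrative entity model descriptor; layout assumed, values invented.
customer_model = {
    "info": {
        "title": "Customer",
        "version": "0.0.1",
        "baseUri": "http://example.org/",
    },
    "definitions": {
        # Entity type: the standardized shape for each record.
        "Customer": {
            "primaryKey": "customerId",
            # Entity properties: the data fields of the entity type.
            "properties": {
                "customerId": {"datatype": "integer"},
                "firstName": {"datatype": "string"},
                "lastName": {"datatype": "string"},
            },
        }
    },
}


def empty_instance(model, entity_type):
    """Hypothetical helper: an unpopulated copy of the entity type structure."""
    properties = model["definitions"][entity_type]["properties"]
    return {name: None for name in properties}


# An entity instance starts as a copy of the type structure...
instance = empty_instance(customer_model, "Customer")
# ...and mapping then populates it with values from the raw data.
instance.update({"customerId": 101, "firstName": "Ada", "lastName": "Lovelace"})
```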
All the data about an entity is consolidated in a single record, which contains the standardized entity instance, the original raw data, the provenance and lineage, and other metadata. The provenance and lineage information includes the changes that occurred from the raw data to the most recent entity instance.
Entity Model Validation
Only valid entity models are loaded into the MarkLogic Server instance. A valid entity model meets the following requirements:
- info/baseUri in the model must have a value that is a valid URI.
- The model must pass the Entity Services validation function es:model-validate.
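Before loading a model, the first requirement can be checked locally. The sketch below is an assumption-laden pre-check, not the validation Data Hub performs: has_valid_base_uri is a hypothetical helper, and it treats only an absolute URI with a scheme and an authority as valid, which may be stricter or looser than the actual rule. The es:model-validate check still runs only inside MarkLogic.

```python
from urllib.parse import urlparse


def has_valid_base_uri(model):
    """Hypothetical pre-check: info/baseUri must look like an absolute URI."""
    base_uri = model.get("info", {}).get("baseUri")
    if not isinstance(base_uri, str):
        return False
    try:
        parsed = urlparse(base_uri)
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. a broken IPv6 host.
        return False
    # Assumption: require both a scheme (http) and an authority (example.org).
    return bool(parsed.scheme) and bool(parsed.netloc)


ok = has_valid_base_uri({"info": {"baseUri": "http://example.org/"}})
bad = has_valid_base_uri({"info": {"baseUri": "not a uri"}})
```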
If the entity model is valid and loaded, Data Hub creates the following in the STAGING and FINAL databases:
- schema files:
  - /entities/YourEntityName.entity.json. The validated entity model returned by es:model-validate.
  - /entities/YourEntityName.entity.schema.json. The entity model as a JSON schema for use in XDMP validation (xdmp.jsonValidate).
  - /entities/YourEntityName.entity.xsd. The entity model as an XSD schema for use in XDMP validation (xdmp.validate).
- a TDE template (/tde/YourEntityName-YourEntityVersion.tdex) with the following permissions:
  - read permissions for the data-hub-operator
  - update permissions for the data-hub-developer
  - default permissions allowed to the user who loaded the entity model to MarkLogic Server
Important: In MarkLogic 10 and later versions, a user assigned only to the admin role must be explicitly given read access to view the TDE template.
If a TDE template with the same URI already exists but is not in the ml-data-hub-tde collection, a new TDE template will not be created.
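The URI patterns listed above can be expanded for a concrete entity, which is handy when checking that the artifacts exist after a load. This is a small sketch: expected_artifact_uris and its dictionary keys are hypothetical names, and only the URI patterns come from the list above.

```python
def expected_artifact_uris(entity_name, entity_version):
    """Hypothetical helper: fill in the URI patterns for one entity."""
    return {
        "model": f"/entities/{entity_name}.entity.json",
        "json_schema": f"/entities/{entity_name}.entity.schema.json",
        "xsd": f"/entities/{entity_name}.entity.xsd",
        "tde": f"/tde/{entity_name}-{entity_version}.tdex",
    }


# Example entity name and version are made up for illustration.
uris = expected_artifact_uris("Customer", "0.0.1")
```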
Guidance on Mapping Multiple Entities from a Single Source
The topics below illustrate how to create different types of entity relationships using the Hub Central one-to-many modeling and mapping tools.
Creating a 1:1 Relationship
You can create a 1:1 relationship between entities when an entity instance is related to a single other entity instance.
For example, the Customer entity is related to the BabyRegistry entity, and each customer maintains a single baby registry. The relationship is 1:1.
Here are the modeled entities before defining the 1:1 relationship:
Customer
customerId integer
firstName string
lastName string
BabyRegistry
babyRegId integer
arrivalDate date
To model the 1:1 relationship, define a property in one of the participating entities that points to the related entity. The other entity in the relationship is untouched.
Here, the BabyRegistry property is defined in the Customer entity:
Customer
customerId integer
firstName string
lastName string
created BabyRegistry
Alternatively, a Customer property can be defined in the BabyRegistry entity:
BabyRegistry
babyRegId integer
arrivalDate date
createdBy Customer
Which entity you choose to hold the property may depend on the lifecycle or complexity of the entities involved.
When defining the mapping, point to the related entity's primary key to establish the relationship. Here is how the mapping is defined when the reference property is in the Customer entity:
Customer
customerId /CustomerID
firstName /Name/FirstName
lastName /Name/LastName
created /BabyRegistry/BabyRegistryId
Here is how it is defined when the reference property is in the BabyRegistry entity:
BabyRegistry
babyRegId /BabyRegistry/BabyRegistryId
arrivalDate /BabyRegistry/Arrival_Date
createdBy /CustomerID
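To see what the two alternative mappings produce, they can be simulated in plain Python. This is a rough sketch only: the source document shape and its values are assumed, resolve and apply_mapping are hypothetical helpers, and the simple path lookup stands in for the mapping expressions that Data Hub itself evaluates.

```python
# Assumed shape of one source record; values are invented.
source = {
    "CustomerID": 101,
    "Name": {"FirstName": "Ada", "LastName": "Lovelace"},
    "BabyRegistry": {"BabyRegistryId": 5001, "Arrival_Date": "2024-06-01"},
}


def resolve(doc, path):
    """Follow a slash-separated path such as /Name/FirstName through nested dicts."""
    node = doc
    for step in path.strip("/").split("/"):
        node = node[step]
    return node


def apply_mapping(doc, mapping):
    """Build an entity instance by resolving each property's source path."""
    return {prop: resolve(doc, path) for prop, path in mapping.items()}


# Alternative 1: the reference property lives in Customer.
customer = apply_mapping(source, {
    "customerId": "/CustomerID",
    "firstName": "/Name/FirstName",
    "lastName": "/Name/LastName",
    "created": "/BabyRegistry/BabyRegistryId",  # BabyRegistry primary key
})

# Alternative 2: the reference property lives in BabyRegistry.
baby_registry = apply_mapping(source, {
    "babyRegId": "/BabyRegistry/BabyRegistryId",
    "arrivalDate": "/BabyRegistry/Arrival_Date",
    "createdBy": "/CustomerID",  # Customer primary key
})
```

In either alternative, the reference property holds the primary key value of the related entity rather than a nested copy of it.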
Creating an N:1 Relationship
You can create an N:1 relationship between entities when one or more entity instances are related to one instance of another entity.
In our example, the Order entity is related to the Customer entity, and each customer can create multiple orders. Each order is associated with a single customer. The relationship is N:1.
Here are the modeled entities before defining the N:1 relationship:
Customer
customerId integer
firstName string
lastName string
Order
orderId integer
timestamp dateTime
To model the N:1 relationship, create a property on the many (N) side of the relationship that points to the other entity. In our example, the Order is on the many side, so we create a property in Order that points to Customer:
Order
orderId integer
timestamp dateTime
orderedBy Customer
When defining the mapping, point the property on the many side to the other entity's primary key. In our example, the orderedBy property in Order points to the Customer primary key:
Order
orderId /Orders/OrderId
timestamp /Orders/DateAndTime
orderedBy /CustomerID
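The effect of this mapping is that every Order instance carries the key of the one Customer it belongs to. The sketch below illustrates that outcome under assumptions: the source record shape (one customer with a list under Orders) and its values are invented, and map_orders is a hypothetical helper rather than anything Data Hub provides.

```python
# Assumed source record: one customer with several orders; values are invented.
source = {
    "CustomerID": 101,
    "Orders": [
        {"OrderId": 1, "DateAndTime": "2024-01-05T10:00:00"},
        {"OrderId": 2, "DateAndTime": "2024-02-11T16:30:00"},
    ],
}


def map_orders(doc):
    """Hypothetical helper: one Order instance per source order."""
    return [
        {
            "orderId": order["OrderId"],          # /Orders/OrderId
            "timestamp": order["DateAndTime"],    # /Orders/DateAndTime
            "orderedBy": doc["CustomerID"],       # /CustomerID (Customer primary key)
        }
        for order in doc["Orders"]
    ]


orders = map_orders(source)
# Many orders, one customer: every orderedBy value is the same key.
customer_keys = {order["orderedBy"] for order in orders}
```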
Creating an N:M Relationship
You can create an N:M relationship between entities when an entity instance is related to one or more instances of another entity and vice-versa.
In our example, the Order and Product entities are related. An order can have one or more products and a product can be part of one or more orders. The relationship is N:M.
Here are the modeled entities before defining the relationship:
Order
orderId integer
timestamp dateTime
Product
productId integer
name string
To model the N:M relationship, you define a property in one of the participating entities that points to the related entity. Similar to the 1:1 case above, the relationship can be modeled by defining a property in either of the participating entities. (The other entity in the relationship is left untouched.)
Here, a Product property is defined in the Order entity:
Order
orderId integer
timestamp dateTime
includes Product yes
The cardinality of the property is also marked as multiple (the "yes" at the end of the row).
Alternatively, an Order property can be defined in the Product entity:
Product
productId integer
name string
isPartOf Order yes
Which entity you choose to hold the property may depend on the lifecycle or complexity of the entities involved.
When defining the mapping, point to the related entity's primary key to establish the relationship. Here is how the mapping is defined when the reference property is in the Order entity:
Order
orderId /Orders/OrderId
timestamp /Orders/DateAndTime
includes /Orders/Products/ProductId
Here is how it is defined when the reference property is in the Product entity:
Product
productId /Orders/Products/ProductId
name /Orders/Products/Name
isPartOf /Orders/OrderId
Because the relationship is N:M, the properties in the resulting entity instances will have arrays as their datatypes, and the arrays will hold one or more values based on the number of related entities.
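The array-valued result can be illustrated with a small Python sketch. Everything here is assumed for illustration: the shape and values of the source orders, and the helpers map_order and map_products, which are hypothetical. How Data Hub collects the values for each instance depends on your source documents and mapping.

```python
# Assumed source orders; values are invented.
source_orders = [
    {"OrderId": 1, "DateAndTime": "2024-01-05T10:00:00",
     "Products": [{"ProductId": 11, "Name": "Crib"},
                  {"ProductId": 12, "Name": "Stroller"}]},
    {"OrderId": 2, "DateAndTime": "2024-02-11T16:30:00",
     "Products": [{"ProductId": 12, "Name": "Stroller"}]},
]


def map_order(order):
    """Reference property in Order: includes holds an array of Product keys."""
    return {
        "orderId": order["OrderId"],
        "timestamp": order["DateAndTime"],
        "includes": [p["ProductId"] for p in order["Products"]],
    }


def map_products(orders):
    """Reference property in Product: isPartOf holds an array of Order keys."""
    products = {}
    for order in orders:
        for p in order["Products"]:
            entry = products.setdefault(
                p["ProductId"],
                {"productId": p["ProductId"], "name": p["Name"], "isPartOf": []},
            )
            entry["isPartOf"].append(order["OrderId"])
    return list(products.values())


order_instances = [map_order(o) for o in source_orders]
product_instances = map_products(source_orders)
```

An array with a single value, such as the includes property of the second order, is still an array, so consumers of the instances can treat the property uniformly.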