
MarkLogic 10 Product Documentation
Entity Services Developer's Guide
— Chapter 3

Creating and Managing Models

This chapter covers tasks for creating and managing entity model descriptors. A model descriptor defines entity types, their properties, and the relationships between entities.

Introduction

A fully constructed model consists of a set of facts about the modeled entity types, their properties, and the relationships between them. The facts are represented in MarkLogic as semantic triples.

The entity types, properties, and relationships are defined by an XML or JSON model descriptor. When you persist the descriptor in the database in the prescribed way, MarkLogic automatically creates the model by generating facts about the model, expressed as triples. You can also add your own facts (triples) to the model.

The following diagram depicts the building blocks of an entity model in MarkLogic:

Building a model involves the following steps:

  1. Define your entity types, entity type properties (attributes), and relationships in a model descriptor. For details, see Writing a Model Descriptor.
  2. Optionally, validate your descriptor. An invalid descriptor will produce an invalid model, so it is a good idea to validate the descriptor during development. For details, see Validating a Model Descriptor.
  3. Create a model by persisting the descriptor as a document in the special Entity Services collection. MarkLogic automatically generates facts about your entity types. For details, see Creating a Model from a Model Descriptor.
  4. Optionally, extend the model with additional facts. For details, see Extending a Model with Additional Facts.

The following diagram is a pictorial representation of this process.

Once you have a valid descriptor or model, you can use the Entity Services API to generate code and other artifacts that provide a foundation for creating an application based on your model. You can use the API to create the following:

  • A framework for transforming data from heterogeneous sources into canonical entity instances.
  • A Template Driven Extraction (TDE) template for interfacing with your instance data as rows or triples. The template facilitates querying your instances using SQL, SPARQL, or the Optic API.
  • A framework for converting instances from one version of your model to another as your model evolves and changes.
  • Index configuration and other database configuration properties that facilitate querying your model, based on characteristics you define.
  • Query options that facilitate full text search of your entity instances using the XQuery Search API or the REST, Java, and Node.js Client APIs.

For more details, see Generating Code and Other Artifacts.

Writing a Model Descriptor

This section describes how to define a model descriptor containing entity type definitions and model metadata.

Model Descriptor Basics

A model descriptor is an XML element or JSON object that defines one or more entity types, model metadata, and relationships between entity types. You can generate code templates and configuration artifacts from the descriptor in the form of either a JSON object-node or a json:object (a special kind of map:map).

A model descriptor has two parts: The info section contains model metadata, such as a title and version; the definitions section contains entity type definitions, including entity properties and relationships, plus type-specific metadata.

A descriptor must define at least one entity type and can define multiple entity types. Each type definition can include additional metadata to guide code and artifact generation. For details, see Entity Type Definition Overview.

The entity type property names in your model should be unique, even across entity types, to avoid name collisions in generated code and artifacts.
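
The uniqueness guideline above can be checked before a descriptor is persisted. The following Python sketch is illustrative only and is not part of the Entity Services API; the function name shared_property_names and the Company entity type are hypothetical, and the descriptor is assumed to be a JSON descriptor already loaded into a dictionary.

```python
from collections import defaultdict

def shared_property_names(descriptor):
    """Map each property name used by more than one entity type to those types."""
    owners = defaultdict(list)
    for type_name, type_def in descriptor.get("definitions", {}).items():
        for prop_name in type_def.get("properties", {}):
            owners[prop_name].append(type_name)
    return {name: types for name, types in owners.items() if len(types) > 1}

# Hypothetical descriptor in which two entity types both define "id".
descriptor = {
    "info": {"title": "Example", "version": "1.0.0"},
    "definitions": {
        "Person": {"properties": {"id": {"datatype": "int"},
                                  "name": {"datatype": "string"}}},
        "Company": {"properties": {"id": {"datatype": "int"},
                                   "ticker": {"datatype": "string"}}},
    },
}

print(shared_property_names(descriptor))  # {'id': ['Person', 'Company']}
```

A non-empty result only flags candidates for renaming; it does not predict which generated artifacts would be affected.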

The natural representation for a model descriptor is JSON because it matches the internal representation of the model. When you use an XML model descriptor, you must first translate it into a form usable with Entity Services functions that accept a model as input.

For more details, see Working With an XML Model Descriptor.

You might find it useful to generate test entity instances during model development so you can see a concrete example of the default entities produced by your model. For details, see Generating Test Entity Instances.

The following is an example of the simplest possible model descriptor. The descriptor must contain at least title and version metadata in the info section, and define at least one entity type with at least one property in the definitions section. In this example, the model named Example defines an entity type named Person. A Person entity has an id property.

Format Descriptor Example
JSON
{ "info": {
    "title": "Example",
    "version": "1.0.0"
  },
  "definitions": {
    "Person": {
      "properties": {
        "id": { "datatype": "int" }
      }
}}}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <id>
          <es:datatype>int</es:datatype>
        </id>
      </es:properties>
    </Person>
  </es:definitions>
</es:model>
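
As a rough pre-check of the minimum requirements described above, a JSON descriptor can be inspected with a few lines of Python. This is a sketch under stated assumptions, not the validation that Entity Services performs; the function name minimal_descriptor_problems is hypothetical.

```python
def minimal_descriptor_problems(descriptor):
    """List the ways a JSON descriptor falls short of the stated minimum."""
    problems = []
    info = descriptor.get("info", {})
    for key in ("title", "version"):
        if not info.get(key):
            problems.append(f"info.{key} is missing")
    definitions = descriptor.get("definitions", {})
    if not definitions:
        problems.append("no entity type is defined")
    elif not any(t.get("properties") for t in definitions.values()):
        problems.append("no entity type defines a property")
    return problems

minimal = {
    "info": {"title": "Example", "version": "1.0.0"},
    "definitions": {"Person": {"properties": {"id": {"datatype": "int"}}}},
}

print(minimal_descriptor_problems(minimal))                        # []
print(minimal_descriptor_problems({"info": {"title": "Example"}}))
# ['info.version is missing', 'no entity type is defined']
```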

Entity Type Definition Overview

An entity type definition usually includes one or more entity property definitions and can include type metadata such as a primary key specification. This section provides a brief overview of defining an entity type. For syntax details, see entity_type_definition.

All property definitions must include either a data type or a reference to another entity type. The data type of a property can be string, array, iri, or one of several XSD types. Depending on the data type, a property definition may require additional information. For details, see Writing a Model Descriptor and property_definition.

The data type of an entity property can be any of the following:

  • Any of the XSD types listed in property_type.
  • A reference to another entity type.
  • An IRI.
  • A homogeneous array of items of any of these types.

Depending on the type, the property definition can include additional information. For example, when the datatype is string, you can specify a collation. For syntax details, see property_definition.
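
The property kinds listed above can be told apart by inspecting the keys of a JSON property definition. The following Python sketch is illustrative; the function name property_kind and its return labels are hypothetical, and treating a definition that has both a datatype and a $ref as an error is an assumption, not a documented rule.

```python
def property_kind(prop_def):
    """Classify a JSON property definition as scalar, iri, reference, or array."""
    has_type = "datatype" in prop_def
    has_ref = "$ref" in prop_def
    if has_type == has_ref:
        # Assumption: exactly one of the two keys is expected.
        raise ValueError("a property needs either 'datatype' or '$ref'")
    if has_ref:
        return "reference"
    if prop_def["datatype"] == "array":
        if "items" not in prop_def:
            raise ValueError("an array property needs an 'items' definition")
        return "array of " + property_kind(prop_def["items"])
    if prop_def["datatype"] == "iri":
        return "iri"
    return "scalar"

print(property_kind({"datatype": "string"}))                               # scalar
print(property_kind({"$ref": "#/definitions/Name"}))                       # reference
print(property_kind({"datatype": "array", "items": {"datatype": "int"}}))  # array of scalar
```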

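An entity type definition can also include type-specific metadata that is used when generating code and configuration artifacts, such as a primary key, required properties, PII properties, a namespace, and range index or word lexicon specifications. These are described in the sections that follow.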

Property names should be unique across all the entity types in a model. Duplicate property names can lead to name collisions in generated code and artifacts, causing some code and configuration to be generated commented out.

For example, the following model descriptor defines a Person entity with two required entity properties (id and name) and two optional entity properties (address and rating). The id property is the primary key. In addition, the descriptor specifies that a path range index configuration and query options should be generated for the rating property.

Language Example
JSON
{ "info": { "title": "Example", "version": "1.0.0" },
  "definitions": {
    "Person": {
      "description": "Example person entity type",
      "properties": {
        "id": { "datatype": "int" },
        "name": { "datatype": "string" },
        "address": { "datatype": "string" },
        "rating": { "datatype": "float" }
      },
      "required": ["id", "name"],
      "primaryKey": "id",
      "pathRangeIndex": ["rating"]
    }
  }
}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:description>Example person entity type</es:description>
      <es:properties>
        <id><es:datatype>int</es:datatype></id>
        <name><es:datatype>string</es:datatype></name>
        <address><es:datatype>string</es:datatype></address>
        <rating><es:datatype>float</es:datatype></rating>
      </es:properties>
      <es:required>id</es:required>
      <es:required>name</es:required>
      <es:primary-key>id</es:primary-key>
      <es:path-range-index>rating</es:path-range-index>
    </Person>
  </es:definitions>
</es:model>
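
The JSON and XML forms of this example differ mainly in how names are spelled and how lists are expressed: camel-case JSON keys become hyphenated es: elements, and each item of a JSON array becomes its own element. The following Python sketch illustrates that correspondence for the constructs shown in this chapter. It is not the conversion Entity Services performs; the function names are hypothetical and the key mapping covers only the names that appear in this chapter.

```python
import xml.etree.ElementTree as ET

ES_NS = "http://marklogic.com/entity-services"

# JSON key -> XML local name, for keys whose spelling differs in this chapter.
RENAMED = {
    "primaryKey": "primary-key",
    "pathRangeIndex": "path-range-index",
    "elementRangeIndex": "element-range-index",
    "wordLexicon": "word-lexicon",
    "namespacePrefix": "namespace-prefix",
    "baseUri": "base-uri",
    "$ref": "ref",
}

def es(name):
    """Return the qualified es: element name for a JSON key."""
    return f"{{{ES_NS}}}{RENAMED.get(name, name)}"

def add_members(parent, obj):
    """Append one es: element per key; list items become repeated elements."""
    for key, value in obj.items():
        if key == "properties":
            props = ET.SubElement(parent, es(key))
            for prop_name, prop_def in value.items():
                # Property names are emitted as no-namespace elements.
                add_members(ET.SubElement(props, prop_name), prop_def)
        elif isinstance(value, dict):  # for example, "items"
            add_members(ET.SubElement(parent, es(key)), value)
        else:
            for item in (value if isinstance(value, list) else [value]):
                ET.SubElement(parent, es(key)).text = str(item)

def descriptor_to_xml(descriptor):
    """Build an es:model element tree from a JSON descriptor dictionary."""
    ET.register_namespace("es", ES_NS)
    model = ET.Element(es("model"))
    add_members(ET.SubElement(model, es("info")), descriptor["info"])
    definitions = ET.SubElement(model, es("definitions"))
    for type_name, type_def in descriptor["definitions"].items():
        add_members(ET.SubElement(definitions, type_name), type_def)
    return model

person = {
    "info": {"title": "Example", "version": "1.0.0"},
    "definitions": {"Person": {
        "description": "Example person entity type",
        "properties": {"id": {"datatype": "int"},
                       "name": {"datatype": "string"},
                       "address": {"datatype": "string"},
                       "rating": {"datatype": "float"}},
        "required": ["id", "name"],
        "primaryKey": "id",
        "pathRangeIndex": ["rating"],
    }},
}

model = descriptor_to_xml(person)
print(ET.tostring(model, encoding="unicode"))
```

The printed document is not indented, but its element structure follows the XML example above.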

Defining an Entity Property with a Simple Type

To define an entity type property with a simple type such as string, integer, or date, specify the type name as the value of the datatype JSON property or XML element. For a complete list of supported type names, see property_type.

Not all the supported data types are usable as range index or word lexicon types. If you specify an entity property with an incompatible type in the range index or word lexicon specification of an entity type definition, then the resulting model will not validate.

For example, the following entity type definition contains entity properties with four different simple types.

Format Example
JSON
{ "info": { "title": "Example", "version": "1.0.0"},
  "definitions": {
    "Person": {
      "properties": {
        "id": { "datatype": "positiveInteger" },
        "name": { "datatype": "string" },
        "birthdate": { "datatype": "date" },
        "rating": { "datatype": "float" }
      }
}}}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <id><es:datatype>positiveInteger</es:datatype></id>
        <name><es:datatype>string</es:datatype></name>
        <birthdate><es:datatype>date</es:datatype></birthdate>
        <rating><es:datatype>float</es:datatype></rating>
      </es:properties>
    </Person>
  </es:definitions>
</es:model>

If the type name is string, then you can optionally include a collation URI to be used when generating index, lexicon, and query option configuration artifacts from the model. If you omit the collation for a string-typed entity property, the collation defaults to http://marklogic.com/collation/en.

The following example demonstrates including a collation in an entity property definition.

Format Example
JSON
{ "info": { "title": "Example", "version": "1.0.0"},
  "definitions": {
    "Person": {
      "properties": {
        "name": { 
          "datatype": "string", 
          "collation": "http://marklogic.com/collation/"
        }
      }
}}}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <name>
          <es:datatype>string</es:datatype>
          <es:collation>http://marklogic.com/collation/</es:collation>
        </name>
      </es:properties>
    </Person>
  </es:definitions>
</es:model>

Defining an Entity Property with a Complex Type

To specify an entity property whose type is complex, such as an object type, define the complex type as an entity type and use an entity type reference in the property definition.

For example, suppose a Person entity type contains a name property, and that name should have entity properties first, middle, and last. You could model a name as an entity type and then reference it in the definition of Person similar to the following:

JSON: "name": { "$ref": "#/definitions/Name" }

XML: <name><es:ref>#/definitions/Name</es:ref></name>

You can reference entity types defined in the same model (a local reference) or externally. For more details and examples, see Defining Entity Relationships.
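
To make the local reference concrete, the following Python sketch builds a JSON descriptor in which Person refers to a Name entity type, and then looks up the referenced definition. It is illustrative only: the helper resolve_local_ref is hypothetical, handles only references of the form #/definitions/TypeName, and does not attempt external references.

```python
LOCAL_PREFIX = "#/definitions/"

def resolve_local_ref(descriptor, ref):
    """Return (type name, type definition) for a local entity type reference."""
    if not ref.startswith(LOCAL_PREFIX):
        raise ValueError(f"not a local reference: {ref}")
    type_name = ref[len(LOCAL_PREFIX):]
    try:
        return type_name, descriptor["definitions"][type_name]
    except KeyError:
        raise ValueError(f"{ref} does not name a type in this descriptor") from None

descriptor = {
    "info": {"title": "Example", "version": "1.0.0"},
    "definitions": {
        "Name": {"properties": {"first": {"datatype": "string"},
                                "middle": {"datatype": "string"},
                                "last": {"datatype": "string"}}},
        "Person": {"properties": {"id": {"datatype": "int"},
                                  "name": {"$ref": "#/definitions/Name"}}},
    },
}

ref = descriptor["definitions"]["Person"]["properties"]["name"]["$ref"]
type_name, type_def = resolve_local_ref(descriptor, ref)
print(type_name, sorted(type_def["properties"]))  # Name ['first', 'last', 'middle']
```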

Defining an Entity Property with Array Type

To specify an entity property whose type is a list of values, specify array in the datatype JSON property or XML element of the property definition, and then include an items type definition that specifies the data type of the list items. For a list of supported item type names, see property_type.

You cannot use an entity property with array type as a primary key or for generating database configuration artifacts such as range index or word lexicon configuration.

For example, the following entity type definition defines an entity property named orders whose value is an array of values of type integer.

Format Example
JSON
{ "info": { "title": "Example", "version": "1.0.0"},
  "definitions": {
    "Person": {
      "properties": {
        "orders": { 
          "datatype": "array", 
          "items": {
            "datatype": "integer"
          }
        }
      }
}}}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <orders>
          <es:datatype>array</es:datatype>
          <es:items>
             <es:datatype>integer</es:datatype>
           </es:items>
        </orders>
      </es:properties>
    </Person>
  </es:definitions>
</es:model>

For more details, see property_definition.

Defining an IRI Entity Property

To model the type of an entity property as an IRI (Internationalized Resource Identifier), specify iri in the datatype JSON property or XML element of the property definition. IRI-typed entity properties can be useful for working with entities using SPARQL.

The value of a property with IRI type must be a string that represents a sem:iri value. The value is opaque to the Entity Services API.

For example, the following entity type definition contains an entity property name with IRI data type.

Format Example
JSON
{ "info": { "title": "Example", "version": "1.0.0"},
  "definitions": {
    "Person": {
      "properties": {
        "name": { "datatype": "iri" }
      }
}}}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <name><es:datatype>iri</es:datatype></name>
      </es:properties>
    </Person>
  </es:definitions>
</es:model>

For more details about creating Semantic applications in MarkLogic, see the Semantics Developer's Guide.

Identifying the Primary Key Entity Property

An entity type definition can designate one entity property as a primary key that uniquely identifies each instance of that type.

The primary key is used in the following ways:

  • Primary and foreign key for SQL views of your instance data. If you generate a TDE template from the model, the primary key property is the primary key for a row view of instance data. It is also used as a foreign key for some supporting views. For details, see Generating a TDE Template.
  • Unique identifier for auto-generated instance facts (triples). If you generate a TDE template from the model, the template enables generation of triples about each instance of an entity type that defines a primary key. For details, see Generating a TDE Template.
  • Value constraint on the primary key. If you generate query options from the model, the options pre-define a value constraint on the primary key. For details, see Generating Query Options for Searching Instances.

An entity type definition can contain at most one primary key. If you generate a schema from the model, the primary key entity property has its cardinality set to exactly 1; for details, see Generating an Entity Instance Schema.

To specify a primary key, include a primaryKey JSON property or primary-key XML element in the entity type definition. The value must be the name of an entity property defined in this type definition. The primary key entity property cannot have array type.

For example, the following definition of a Person entity defines the id entity property as a primary key:

Format Example
JSON
{ "info": { "title": "Example", "version": "1.0.0"},
  "definitions": {
    "Person": {
      "properties": {
        "id": { "datatype": "positiveInteger" },
        "name": { "datatype": "string" }
      },
      "primaryKey": "id"
}}}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <id><es:datatype>positiveInteger</es:datatype></id>
        <name><es:datatype>string</es:datatype></name>
      </es:properties>
      <es:primary-key>id</es:primary-key>
    </Person>
  </es:definitions>
</es:model>
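
The primary key rules stated in this section (a single property name, defined in the same entity type, not of array type) can be checked on a JSON type definition as sketched below. This Python helper is hypothetical and illustrative; it is not how Entity Services validates a model.

```python
def primary_key_problems(type_name, type_def):
    """Check the primaryKey of one JSON entity type definition."""
    key = type_def.get("primaryKey")
    if key is None:
        return []  # a primary key is optional
    if not isinstance(key, str):
        return [f"{type_name}: primaryKey must name a single property"]
    prop_def = type_def.get("properties", {}).get(key)
    if prop_def is None:
        return [f"{type_name}: primaryKey '{key}' is not a defined property"]
    if prop_def.get("datatype") == "array":
        return [f"{type_name}: primaryKey '{key}' has array type"]
    return []

person = {
    "properties": {"id": {"datatype": "positiveInteger"},
                   "name": {"datatype": "string"}},
    "primaryKey": "id",
}

print(primary_key_problems("Person", person))  # []
print(primary_key_problems("Person", {**person, "primaryKey": "ssn"}))
# ["Person: primaryKey 'ssn' is not a defined property"]
```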

Identifying Personally Identifiable Information (PII)

Security policies often require strict access controls for Personally Identifiable Information (PII), such as a telephone number, address, or social security number. Entity Services enables you to tag entity properties as containing PII, and subsequently generate special security configuration to control access to PII data in your entity instances. For more details, see Generating a PII Security Configuration Artifact.

The following example entity type definition flags the name and address entity properties as PII.

Format Example
JSON
{ "info": { "title": "Example", "version": "1.0.0"},
  "definitions": {
    "Person": {
      "description": "Example person entity type",
      "properties": {
        "id": { "datatype": "int" },
        "name": { "datatype": "string" },
        "address": { "datatype": "string" }
      },
      "pii" : ["name", "address"],
      "required": ["id", "name"]
    }
  }
}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:description>Example person entity type</es:description>
      <es:properties>
        <id><es:datatype>int</es:datatype></id>
        <name><es:datatype>string</es:datatype></name>
        <address><es:datatype>string</es:datatype></address>
      </es:properties>
      <es:pii>name</es:pii>
      <es:pii>address</es:pii>
      <es:required>id</es:required>
      <es:required>name</es:required>
    </Person>
  </es:definitions>
</es:model>

Distinguishing Required and Optional Entity Properties

By default, all entity properties defined in an entity type are optional. You can identify required properties by including their names in the required section of your entity type definition. The entity properties named in the required section must be defined in the containing entity type.

An entity property specified as a primary key is implicitly required, so you should not also include it in the explicit list of required properties.

When you validate an instance against the schema generated for an instance type, validation fails if the instance does not include at least one occurrence of a required entity property. Similarly, when you generate a TDE template for an instance type, required entity properties are not considered nullable.

The following example entity type definition defines three entity properties: id, name, and address. The id and name properties are required. The address entity property is optional.

Format Example
JSON
{ "info": { "title": "Example", "version": "1.0.0"},
  "definitions": {
    "Person": {
      "description": "Example person entity type",
      "properties": {
        "id": { "datatype": "int" },
        "name": { "datatype": "string" },
        "address": { "datatype": "string" }
      },
      "required": ["id", "name"]
    }
  }
}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:description>Example person entity type</es:description>
      <es:properties>
        <id><es:datatype>int</es:datatype></id>
        <name><es:datatype>string</es:datatype></name>
        <address><es:datatype>string</es:datatype></address>
      </es:properties>
      <es:required>id</es:required>
      <es:required>name</es:required>
    </Person>
  </es:definitions>
</es:model>
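
The rules in this section can be summarized in code: listed names must be defined in the type, a primary key counts as required without being listed, and every other property is optional. The following Python sketch applies those rules to a JSON type definition. The function name required_summary is hypothetical, and reporting a listed primary key as a problem reflects the recommendation above rather than a documented validation error.

```python
def required_summary(type_def):
    """Return (required names, optional names, problems) for one entity type."""
    props = type_def.get("properties", {})
    listed = type_def.get("required", [])
    key = type_def.get("primaryKey")
    problems = [f"required property '{name}' is not defined"
                for name in listed if name not in props]
    if key is not None and key in listed:
        problems.append(f"primary key '{key}' is already implicitly required")
    required = {name for name in listed if name in props}
    if key in props:
        required.add(key)  # implicitly required
    optional = sorted(set(props) - required)
    return sorted(required), optional, problems

person = {
    "properties": {"id": {"datatype": "int"},
                   "name": {"datatype": "string"},
                   "address": {"datatype": "string"}},
    "primaryKey": "id",
    "required": ["name"],
}

print(required_summary(person))  # (['id', 'name'], ['address'], [])
```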

Defining a Namespace URI for an Entity Type

By default, the elements of an XML entity instance are in no namespace. If you include a namespace URI and prefix in your model, then the element names in your entity instances are qualified by the namespace, as long as you use an XML representation for your envelope documents.

Use of entity type namespaces is optional. If you choose to use a namespace, you must specify both a namespace URI and a prefix in your entity type definition.

In an XML model descriptor, use the following format to define a namespace URI and prefix:

<es:namespace>namespaceURI</es:namespace>
<es:namespace-prefix>prefix</es:namespace-prefix>

In a JSON model descriptor, use the following format to define a namespace URI and prefix:

"namespace": "namespaceURI",
"namespacePrefix": "prefix"

The following restrictions apply to defining a namespace prefix binding. Any model that violates these restrictions will fail validation.

  • No namespace prefix can begin with xml, in any case combination. See https://www.w3.org/TR/REC-xml-names/.
  • The following namespace prefixes are reserved and must not be used: xsi, xs, xsd, es, json. In general, you should not use namespace prefixes pre-defined by MarkLogic, such as xdmp.
  • The namespace XML element or JSON property value must be a valid absolute URI.
  • Entity type namespace prefixes must be unique across the model. You cannot define multiple entity types with the same namespace prefix.
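
The restrictions above lend themselves to a simple pre-check on a JSON descriptor. The following Python sketch is an approximation for illustration, not the Entity Services validator: the function name namespace_problems is hypothetical, and testing for a URI scheme is only a rough stand-in for "valid absolute URI".

```python
from urllib.parse import urlparse

RESERVED_PREFIXES = {"xsi", "xs", "xsd", "es", "json"}

def namespace_problems(descriptor):
    """Check namespace and namespacePrefix across all entity types."""
    problems = []
    seen = {}  # prefix -> first entity type that used it
    for type_name, type_def in descriptor.get("definitions", {}).items():
        uri = type_def.get("namespace")
        prefix = type_def.get("namespacePrefix")
        if uri is None and prefix is None:
            continue  # namespaces are optional
        if uri is None or prefix is None:
            problems.append(f"{type_name}: namespace and namespacePrefix must be given together")
            continue
        if prefix.lower().startswith("xml"):
            problems.append(f"{type_name}: prefix '{prefix}' begins with 'xml'")
        if prefix in RESERVED_PREFIXES:
            problems.append(f"{type_name}: prefix '{prefix}' is reserved")
        if not urlparse(uri).scheme:
            problems.append(f"{type_name}: namespace '{uri}' is not an absolute URI")
        if prefix in seen:
            problems.append(f"{type_name}: prefix '{prefix}' is already used by {seen[prefix]}")
        else:
            seen[prefix] = type_name
    return problems

ok = {"definitions": {"Person": {"namespace": "http://example.org/es/gs",
                                 "namespacePrefix": "esgs"}}}
print(namespace_problems(ok))  # []
```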

If you define a namespace for an entity type, the Entity Services API uses it when creating XML envelope documents, extracting instances from XML envelopes, and generating model artifacts such as schemas, query options, and TDE templates.

The namespace is discarded when generating JSON envelope documents or extracting an instance from an envelope document as JSON. This means that generated code, query options, and TDE templates based on the model will include XPath expressions that will not match your JSON envelopes or instances without modification.

For example, the following model descriptor specifies that Person entities should be in the namespace http://example.org/es/gs and binds that namespace URI to the prefix esgs:

<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Person</es:title>
    <es:version>1.1.0</es:version>
    <es:base-uri>http://example.org/example-person/</es:base-uri>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <id><es:datatype>string</es:datatype></id>
        <firstName><es:datatype>string</es:datatype></firstName>
        <lastName><es:datatype>string</es:datatype></lastName>
        <fullName><es:datatype>string</es:datatype></fullName>
        <friends>
          <es:datatype>array</es:datatype>
          <es:items><es:ref>#/definitions/Person</es:ref></es:items>
        </friends>
      </es:properties>
      <es:namespace>http://example.org/es/gs</es:namespace>
      <es:namespace-prefix>esgs</es:namespace-prefix>
      <es:primary-key>id</es:primary-key>
      <es:required>firstName</es:required>
      <es:required>lastName</es:required>
      <es:required>fullName</es:required>
    </Person>
  </es:definitions>
</es:model>

The following table illustrates how the envelope documents change, based on whether or not the model defines an entity type namespace.

Use Case Example Envelope
No namespace in Person entity type definition
<es:envelope
    xmlns:es="http://marklogic.com/entity-services">
  <es:instance>
    <es:info>
      ...
    </es:info>
    <Person>
      <id>1234</id>
      <firstName>George</firstName>
      <lastName>Washington</lastName>
      <fullName>George Washington</fullName>
    </Person>
  </es:instance>
  <es:attachments>
    ...
  </es:attachments>
</es:envelope>
Person entity type definition defines namespace URI "http://example.org/es/gs" with prefix "esgs"
<es:envelope
    xmlns:es="http://marklogic.com/entity-services">
  <es:instance>
    <es:info>
      ...
    </es:info>
    <esgs:Person
        xmlns:esgs="http://example.org/es/gs">
      <esgs:id>1234</esgs:id>
      <esgs:firstName>George</esgs:firstName>
      <esgs:lastName>Washington</esgs:lastName>
      <esgs:fullName>George Washington</esgs:fullName>
    </esgs:Person>
  </es:instance>
  <es:attachments>
    ...
  </es:attachments>
</es:envelope>

If you call es:instance-xml-from-document or es.instanceXmlFromDocument on an XML envelope document for an entity type that uses namespaces, the returned instance includes the namespace.

For example, the following instance is extracted from the envelope document shown in the table above. Notice that it uses the esgs namespace.

<esgs:Person xmlns:es="http://marklogic.com/entity-services" 
             xmlns:esgs="http://example.org/es/gs">
  <esgs:id>1234</esgs:id>
  <esgs:firstName>George</esgs:firstName>
  <esgs:lastName>Washington</esgs:lastName>
  <esgs:fullName>George Washington</esgs:fullName>
</esgs:Person>

The namespace is not preserved when you use JSON envelopes or when you generate a JSON instance from an XML or JSON envelope.
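
The effect of discarding the namespace can be illustrated outside MarkLogic. The following Python sketch parses the namespaced Person instance shown above and keeps only local names. It is not the Entity Services implementation, the function names are hypothetical, and the exact shape and value types of a JSON instance produced by Entity Services may differ from this dictionary.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip the '{namespace-uri}' part that ElementTree puts before a tag name."""
    return tag.rsplit("}", 1)[-1]

def instance_to_dict(xml_text):
    """Convert a flat XML instance to a dictionary keyed by local names."""
    root = ET.fromstring(xml_text)
    return {local_name(root.tag): {local_name(child.tag): child.text
                                   for child in root}}

instance = """<esgs:Person xmlns:esgs="http://example.org/es/gs">
  <esgs:id>1234</esgs:id>
  <esgs:firstName>George</esgs:firstName>
  <esgs:lastName>Washington</esgs:lastName>
  <esgs:fullName>George Washington</esgs:fullName>
</esgs:Person>"""

print(instance_to_dict(instance))
# {'Person': {'id': '1234', 'firstName': 'George', 'lastName': 'Washington', 'fullName': 'George Washington'}}
```

Nothing in the resulting dictionary records that the instance came from the esgs namespace, which is why namespace-qualified XPath expressions do not match the JSON form.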

Identifying Entity Properties for Indexing

Searchable entity properties should usually be backed by an index or lexicon.

A model descriptor can contain optional range index and word lexicon sections that indicate which entity properties should have an associated range index or word lexicon and search constraint definition. This specification affects generated artifacts such as query options and database configuration.

For more details, see the following topics:

Specifying Indexable Properties

A range index enables range queries over an entity property, such as "match all inventory item instances with a price property greater than 5". Range indexes and word lexicons also enable search application features such as faceting and search term suggestions.

The Entity Services modeling language enables you to specify entity type properties that should be backed by an element range index, path range index, or word lexicon. (The element range index specification is applicable to both XML elements and JSON properties.)

To indicate that a property should be backed by a range index, include the following components in your model descriptor:

  • JSON: pathRangeIndex or elementRangeIndex
  • XML: es:path-range-index or es:element-range-index

In JSON, the value of pathRangeIndex and elementRangeIndex is an array of entity property names. In XML, define multiple path-range-index or element-range-index elements to tag multiple properties. For example:

JSON: "pathRangeIndex": ["price", "rating"]

XML: <es:path-range-index>price</es:path-range-index>
     <es:path-range-index>rating</es:path-range-index>

Note that an element range index is applicable to both XML elements and JSON properties, so your choice of index type is not limited by the representation of your entity instances. For details, see Creating Indexes and Lexicons Over JSON Documents in the Application Developer's Guide.

To specify properties to be backed by a word lexicon, include a wordLexicon JSON property or word-lexicon XML element in your model descriptor. In JSON, the value of wordLexicon is an array of property names. In XML, define multiple word-lexicon elements to tag multiple properties. The syntax is analogous to the range index example, above.

The properties named in a range index or word lexicon specification must be defined in the containing entity type definition and must conform to certain data type restrictions. For data type details, see Supported Datatypes.

For a complete example, see Example: Identifying Indexable Entity Properties.

Interaction with Generated Artifacts

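Specifying the name of an entity property in the range index section has the following implications:

  • The database properties generated by the es:database-properties-generate XQuery function or the es.databasePropertiesGenerate JavaScript function will include range index configuration for the named entity property.
  • The query options generated by the es:search-options-generate XQuery function or the es.searchOptionsGenerate JavaScript function will include a range constraint definition for the named entity property.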

Specifying the name of an entity property in the word lexicon section has the following implications:

  • The database properties generated by the es:database-properties-generate XQuery function or the es.databasePropertiesGenerate JavaScript function will include word lexicon configuration for the named entity property.
  • The query options generated by the es:search-options-generate XQuery function or the es.searchOptionsGenerate JavaScript function will include a word constraint definition for the named entity property.

    If your model specifies a namespace binding for an entity type and you use JSON envelopes, the namespace is discarded in the JSON representation, but the generated index configuration still assumes a namespace, so the index configuration will not match your JSON data. You should usually use XML envelopes when you include a namespace specifier in your model.

For more details, see Generating a Database Configuration Artifact and Generating Query Options for Searching Instances.

Example: Identifying Indexable Entity Properties

The following example descriptors specify a path range index on the rating entity property and a word lexicon on the bio entity property of a Person entity type.

Format Model Descriptor Example
JSON
{ "info": { "title": "Example", "version": "1.0.0"},
  "definitions": {
    "Person": {
      "description": "Example person entity type",
      "properties": {
        "id": { "datatype": "int" },
        "name": { "datatype": "string" },
        "rating": { "datatype": "float" },
        "bio": { "datatype": "string" }
      },
      "pathRangeIndex": ["rating"],
      "wordLexicon": ["bio"]
    }
  }
}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:description>Example person entity type</es:description>
      <es:properties>
        <id><es:datatype>int</es:datatype></id>
        <name><es:datatype>string</es:datatype></name>
        <rating><es:datatype>float</es:datatype></rating>
        <bio><es:datatype>string</es:datatype></bio>
      </es:properties>
      <es:path-range-index>rating</es:path-range-index>
      <es:word-lexicon>bio</es:word-lexicon>
    </Person>
  </es:definitions>
</es:model>

If you generate database properties from the resulting model (using es:database-properties-generate or es.databasePropertiesGenerate), then the generated database configuration properties include the following details:

{ "database-name":"%%DATABASE%%",
  ...,
  "element-word-lexicon":[{
    "collation":"http://marklogic.com/collation/en",
    "localname":"bio", 
    "namespace-uri":""
  }], 
  "range-path-index":[{
    "collation":"http://marklogic.com/collation/en",
    "invalid-values":"reject",
    "path-expression":"//es:instance/Person/rating",
    "range-value-positions":false, 
    "scalar-type":"float"
  }],
  ...
}

If you generate query options from the resulting model (using es:search-options-generate or es.searchOptionsGenerate), then the generated options include the following constraint definitions:

<search:options
    xmlns:search="http://marklogic.com/appservices/search">
  ...
  <search:constraint name="rating">
    <search:range type="xs:float" facet="true">
      <search:path-index xmlns:es=...>
        //es:instance/Person/rating
      </search:path-index>
    </search:range>
  </search:constraint>
  <search:constraint name="bio">
    <search:word>
      <search:element ns="" name="bio"/>
    </search:word>
  </search:constraint>
  ...
</search:options>

For details on generating database properties and query options, see Generating Code and Other Artifacts. For details on using the generated artifacts, see Deploying Generated Code and Artifacts and Querying a Model or Entity Instances.
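
When reviewing generated artifacts, it can help to list in advance which index entries the descriptor asks for. The following Python sketch derives that list from a JSON descriptor by following the pattern visible in the example output above (a path expression of //es:instance/TypeName/propertyName and a word lexicon keyed by local name). The function name is hypothetical, the pattern is inferred from this one example, and entity types that define a namespace are not handled.

```python
def expected_index_entries(descriptor):
    """List (setting name, value) pairs implied by the index specifications."""
    entries = []
    for type_name, type_def in descriptor["definitions"].items():
        for prop in type_def.get("pathRangeIndex", []):
            entries.append(("range-path-index",
                            f"//es:instance/{type_name}/{prop}"))
        for prop in type_def.get("wordLexicon", []):
            entries.append(("element-word-lexicon", prop))
    return entries

descriptor = {
    "info": {"title": "Example", "version": "1.0.0"},
    "definitions": {"Person": {
        "properties": {"id": {"datatype": "int"},
                       "name": {"datatype": "string"},
                       "rating": {"datatype": "float"},
                       "bio": {"datatype": "string"}},
        "pathRangeIndex": ["rating"],
        "wordLexicon": ["bio"],
    }},
}

print(expected_index_entries(descriptor))
# [('range-path-index', '//es:instance/Person/rating'), ('element-word-lexicon', 'bio')]
```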

Supported Datatypes

Any property named in a range index specification must have a data type that can be used to define a range index or can be mapped to an indexable super type. You can define a property with any of the data types listed in property_type, but only scalar types can be used to define a range index. For example, you cannot specify a property that has type hexBinary, an array type, or a reference to another entity type.

For a list of the types usable to define range indexes, see Understanding Range Indexes in the Administrator's Guide.

Any entity property specified in the word lexicon section of a model descriptor must have string type, or a type which normalizes to string, such as anyURI or iri.

Some datatypes are normalized to a supported index type for purposes of index configuration. For example, the positiveInteger, negativeInteger, and integer datatypes normalize to the XSD decimal type. The following mapping is used for purposes of index configuration:

  • byte, short become int
  • unsignedByte, unsignedShort become unsignedInt
  • all *integer types become decimal
  • iri, anyURI, boolean become string
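The mapping above can be expressed as a small lookup. The following is a minimal sketch in plain JavaScript, not the Entity Services implementation. The name normalizeIndexType is hypothetical, and the pattern used for "all *integer types" assumes that it covers every type whose name ends in "integer" or "Integer". Types not named in the list are returned unchanged.

```javascript
'use strict';

// Explicit mappings taken from the list above.
const INDEX_TYPE_MAP = {
  byte: 'int',
  short: 'int',
  unsignedByte: 'unsignedInt',
  unsignedShort: 'unsignedInt',
  iri: 'string',
  anyURI: 'string',
  boolean: 'string'
};

// Hypothetical helper: returns the type used for index configuration.
function normalizeIndexType(datatype) {
  // Assumption: "all *integer types" means any type name ending in "integer".
  if (/integer$/i.test(datatype)) {
    return 'decimal';
  }
  return Object.prototype.hasOwnProperty.call(INDEX_TYPE_MAP, datatype)
    ? INDEX_TYPE_MAP[datatype]
    : datatype; // types outside the list are left as they are
}

console.log(normalizeIndexType('positiveInteger'));
console.log(normalizeIndexType('unsignedShort'));
console.log(normalizeIndexType('float'));
```

A lookup table keeps the explicit cases readable, while the single pattern handles the whole integer family without listing each type.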

Controlling the Model IRI and Module Namespaces

The info section of a model descriptor can include an optional base-uri XML element or baseUri JSON property. If a base URI is defined, it is used for the following purposes:

  • When you use Entity Services to generate code modules such as an instance converter, the module namespace uses the base URI as the beginning of the module namespace URI.
  • When you generate a model from the descriptor, the base URI is used as the beginning of the model IRI when generating facts about the model as RDF triples.

If you do not include a base URI definition in your descriptor, Entity Services uses http://example.org/.

For example, the following descriptor defines a base URI of http://my/org/.

Format Model Descriptor Example
JSON
{ "info": { 
    "title": "Example", 
    "version": "1.0.0",
    "baseUri": "http://my/org/"
  },
  "definitions": {
    "Person": { ... }
  }
}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
    <es:base-uri>http://my/org/</es:base-uri>
  </es:info>
  <es:definitions>
    <Person>...</Person>
  </es:definitions>
</es:model>

If you generate an instance converter module from this descriptor, then the module namespace is created by appending the model title (Example) and version (1.0.0) to the base URI (http://my/org/), as follows:

module namespace example = "http://my/org/Example-1.0.0";

If you did not define a base URI, then the module namespace URI would be http://example.org/Example-1.0.0. For more details on the generated module namespace, see Module Namespace Declaration.

Similarly, when you create a model from the above example descriptor, the base URI is used as an IRI prefix for the generated model and instance triples. For example, the Person entity type defined by the example has the following IRI:

http://my/org/Example-1.0.0/Person

If you do not define a base URI, then the above IRI would be http://example.org/Example-1.0.0/Person.

The base URI is always combined with other model metadata, such as the model title and version.
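The way the base URI combines with the title and version can be sketched in plain JavaScript. This is an illustration only, not the code Entity Services runs. The helper names modelIri and moduleNamespaceDecl are hypothetical, and the sketch assumes the base URI already ends with a slash.

```javascript
'use strict';

// Default used when the descriptor does not define a base URI.
const DEFAULT_BASE_URI = 'http://example.org/';

// Hypothetical helper: base URI + title + "-" + version.
// Assumption: the base URI already ends with "/".
function modelIri(info) {
  const base = info.baseUri || DEFAULT_BASE_URI;
  return `${base}${info.title}-${info.version}`;
}

// Hypothetical helper: the namespace declaration of a generated module.
// The prefix is the title with its first character lower-cased.
function moduleNamespaceDecl(info) {
  const prefix = info.title.charAt(0).toLowerCase() + info.title.slice(1);
  return `module namespace ${prefix} = "${modelIri(info)}";`;
}

const exampleInfo = {
  title: 'Example',
  version: '1.0.0',
  baseUri: 'http://my/org/'
};

console.log(moduleNamespaceDecl(exampleInfo));
console.log(`${modelIri(exampleInfo)}/Person`);
console.log(modelIri({ title: 'Example', version: '1.0.0' }));
```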

Defining Entity Relationships

You can model relationships between entity types by referencing an entity type in place of a datatype in the definition of an entity property. To do so, use the $ref JSON property or the es:ref XML element in the property definition.

References can either be local (identifying a type defined in the same descriptor) or external (identifying a type that cannot be locally resolved by the Entity Services API).

Defining a Local Entity Reference

A local entity reference refers to an entity type defined in the current model. A local reference is defined by a relative URI of the following form:

#/definitions/entityTypeName

A local entity reference is resolvable during code generation, such as when you call the es:instance-converter-generate XQuery function or the es.instanceConverterGenerate JavaScript function. This resolvability enables the Entity Services code generation tools to, for example, embed the properties of a local reference inside an instance of the referencing type.

For example, the following model descriptor defines two entity types, Person and Name. The Person entity type definition includes a name entity property that is a reference to the Name entity type. The type of the name property is a local reference.

Format Example
JSON
{ "info": { "title": "Example", "version": "1.0.0"},
  "definitions": {
    "Name": {
      "description": "The name of a person.",
      "properties": {
        "first": { "datatype": "string" },
        "middle": { "datatype": "string" },
        "last": { "datatype": "string" }
      },
      "required": ["first", "last"]
    },
    "Person": {
      "description": "Example person entity type",
      "properties": {
        "id": { "datatype": "int" },
        "name": { "$ref": "#/definitions/Name" }
      }
}}}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Name>
      <es:description>The name of a person.</es:description>
      <es:properties>
        <first><es:datatype>string</es:datatype></first>
        <middle><es:datatype>string</es:datatype></middle>
        <last><es:datatype>string</es:datatype></last>
      </es:properties>
      <es:required>first</es:required>
      <es:required>last</es:required>
    </Name>
    <Person>
      <es:description>Example person entity type</es:description>
      <es:properties>
        <id><es:datatype>int</es:datatype></id>
        <name><es:ref>#/definitions/Name</es:ref></name>
      </es:properties>
    </Person>
  </es:definitions>
</es:model>

If you generate an instance converter from this model, the default code template assumes that a Person entity instance has a Name entity instance embedded within it. For example, a Person entity instance generated by es:instance-json-from-document or es.instanceJsonFromDocument might look like the following:

{ "Person": {
    "id": 1234,
    "name": {
      "first": "John",
      "middle": "NMI",
      "last": "Smith"
    }
} }

You could also choose to have the Name persisted separately and reference it from a Person entity via a primary key, URI, or other identifier. That is a choice you make when customizing your instance converter. For more details, see Creating an Instance Converter Module.

Defining an External Entity Reference

An external entity reference refers to an entity type defined outside the model. The referenced type is identified by an IRI. The referenced type should be defined elsewhere in MarkLogic. Resolution of the reference is handled by MarkLogic's SPARQL engine.

No validation is performed on the value of an external reference. When you use the Entity Services APIs to generate code and other artifacts, the reference is treated as an opaque string.

For example, the following model descriptor defines a Person entity type that contains a name property that is an external reference to a type identified by http://example.org/Name. This could be an entity type defined by a different Entity Services model.

Format Example
JSON
{ "info": { "title": "Example", "version": "1.0.0"},
  "definitions": {
    "Person": {
      "description": "Example person entity type",
      "properties": {
        "id": { "datatype": "int" },
        "name": { "$ref": "http://example.org/Name" }
      }
}}}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:description>Example person entity type</es:description>
      <es:properties>
        <id><es:datatype>int</es:datatype></id>
        <name><es:ref>http://example.org/Name</es:ref></name>
      </es:properties>
    </Person>
  </es:definitions>
</es:model>

You would customize your Person instance converter code to fill in the value of the name property with an appropriate reference or embedded value. Since the shape of the external entity type is not defined by the model, the Entity Services code generation tools cannot assume an embedded object as they can for local references. To learn more about instance generation, see Creating an Instance Converter Module.
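The difference between local and external references can be sketched in plain JavaScript. This is a minimal illustration, not the resolution logic used by Entity Services. The helper resolveRef, the employer property, and the http://example.org/Organization IRI are all hypothetical.

```javascript
'use strict';

// Local references use the relative form #/definitions/entityTypeName.
const LOCAL_PREFIX = '#/definitions/';

// Hypothetical helper: returns the referenced type definition when the
// property holds a local reference that resolves within the descriptor.
// Returns null for external references, which are treated as opaque.
function resolveRef(descriptor, propertyDef) {
  const ref = propertyDef.$ref;
  if (typeof ref !== 'string' || !ref.startsWith(LOCAL_PREFIX)) {
    return null; // not a reference, or an external IRI
  }
  const typeName = ref.slice(LOCAL_PREFIX.length);
  const definitions = descriptor.definitions || {};
  return Object.prototype.hasOwnProperty.call(definitions, typeName)
    ? definitions[typeName]
    : null;
}

const model = {
  info: { title: 'Example', version: '1.0.0' },
  definitions: {
    Name: {
      properties: {
        first: { datatype: 'string' },
        last: { datatype: 'string' }
      }
    },
    Person: {
      properties: {
        id: { datatype: 'int' },
        name: { $ref: '#/definitions/Name' },
        // Hypothetical external reference, for illustration only.
        employer: { $ref: 'http://example.org/Organization' }
      }
    }
  }
};

const personProps = model.definitions.Person.properties;
// The local reference resolves, so the shape of Name is known.
console.log(Object.keys(resolveRef(model, personProps.name).properties));
// The external reference does not resolve locally.
console.log(resolveRef(model, personProps.employer));
```

Because only the local reference yields a type definition, only the local case gives a code generator enough information to assume an embedded object.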

Creating a Model from a Model Descriptor

Create a model from a JSON or XML descriptor by inserting the descriptor document into the database as part of the special Entity Services collection http://marklogic.com/entity-services/models.

During insertion, MarkLogic generates a model from the descriptor. The model includes the entity type definitions, properties, and relationships defined by your descriptor, plus facts about the model that MarkLogic automatically infers from the descriptor. These facts are expressed as Semantic triples; for details, see Search Basics for Models. You can also add your own facts; for details, see Extending a Model with Additional Facts.

For example, the following code snippet creates a model from a descriptor. For a more complete example, see Getting Started With Entity Services.

Language Example
XQuery
xquery version "1.0-ml";
import module namespace es = "http://marklogic.com/entity-services"
  at "/MarkLogic/entity-services/entity-services.xqy";

let $desc := ... (: your model descriptor :)
return xdmp:document-insert(
  '/es-gs/models/person-1.0.0.json', $desc,
  <options xmlns="xdmp:document-insert">  
    <collections>{
      <collection>http://marklogic.com/entity-services/models</collection>,
      for $coll in xdmp:default-collections()
      return <collection>{$coll}</collection>
    }</collections>
  </options>
)
JavaScript
'use strict';
declareUpdate();
const es = require('/MarkLogic/entity-services/entity-services.xqy');

const desc = ...; // your model descriptor
xdmp.documentInsert(
  '/es-gs/models/person-1.0.0.json', desc,
  {collections: ['http://marklogic.com/entity-services/models']}
);

Note that if you create a model with an XML descriptor, then you will have to convert the persisted document to its in-memory JSON representation before you can use it with any Entity Services functions that expect a model as input. For details, see Working With an XML Model Descriptor.

Working With an XML Model Descriptor

The natural representation of a model descriptor in the Entity Services API is a JSON object node. In XQuery, the in-memory JSON representation of a model descriptor is a json:object (a special kind of map:map). The equivalent representation in Server-Side JavaScript is a JSON object node or a JavaScript object. (MarkLogic implicitly converts JavaScript objects to JSON objects when you pass them as parameters.)

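If you create a model by persisting an XML descriptor, you must convert the persisted descriptor into its JSON representation before you can pass it to most Entity Services functions. You can create a JSON object from an XML descriptor using the following functions:

  • es:model-validate (XQuery) or es.modelValidate (JavaScript): validates the descriptor and returns its JSON representation.
  • es:model-from-xml (XQuery) or es.modelFromXml (JavaScript): converts the descriptor to its JSON representation.

To learn more about descriptor validation, see Validating a Model Descriptor.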

The following example code snippet generates an instance converter module from an XML descriptor by first converting the descriptor to JSON. Assume /es-gs/models/person-1.0.0.xml is a previously persisted descriptor used to create a model.

Language Example
XQuery
xquery version "1.0-ml";
import module namespace es = "http://marklogic.com/entity-services"
  at "/MarkLogic/entity-services/entity-services.xqy";

let $desc := fn:doc('/es-gs/models/person-1.0.0.xml')
return es:instance-converter-generate(
  es:model-from-xml($desc))
JavaScript
'use strict';
const es = require('/MarkLogic/entity-services/entity-services.xqy');

const desc = cts.doc('/es-gs/models/person-1.0.0.xml');
es.instanceConverterGenerate(es.modelFromXml(desc));

If you persist your XML descriptor as JSON instead of XML, then you only need to do the conversion once, at model creation time. This is the technique used in Create a Model.

In XQuery, you can manipulate the JSON representation of the descriptor as a map:map; for details, see Building a JSON Object from a Map.

Validating a Model Descriptor

To validate a model descriptor, use the es:model-validate XQuery function or the es.modelValidate Server-Side JavaScript function.

If the input descriptor is valid, this function returns a valid JSON descriptor that can be persisted in the database or used as input to any Entity Services interface that accepts a model as input. If the input descriptor is invalid, this function throws an ES-MODEL-INVALID exception and reports the validation failures in the error details.

Since an invalid model descriptor produces an invalid model, you should use model validation during development. Model validation does introduce added overhead, however, so you might choose to skip it when going between a descriptor and a model in production situations.

The following example validates a simple model descriptor containing a Person entity type definition. The model descriptor is valid, so no exception is raised, and the returned model is identical to the JSON model descriptor used in the JavaScript example.

Language Example
XQuery
xquery version "1.0-ml";
import module namespace es = "http://marklogic.com/entity-services"
  at "/MarkLogic/entity-services/entity-services.xqy";
es:model-validate(
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <id><es:datatype>int</es:datatype></id>
        <name><es:datatype>string</es:datatype></name>
      </es:properties>
      <es:required>id</es:required>
      <es:required>name</es:required>
    </Person>
  </es:definitions>
</es:model>
)
JavaScript
'use strict';
const es = require('/MarkLogic/entity-services/entity-services.xqy');
es.modelValidate(
  { "info": { "title": "Example", "version": "1.0.0" },
    "definitions": {
      "Person": {
        "properties": {
          "id": { "datatype": "int" },
          "name": { "datatype": "string" },
        },
        "required": ["id", "name"]
  } } }
);

If we introduce an error by specifying that an undefined entity property named UNDEF is a required property, then validation raises an error similar to the following:

ES-MODEL-INVALID (err:FOER0000): "Required" property UNDEF doesn't exist.
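The kind of rule behind this error can be sketched in plain JavaScript. The following is an illustration of a single check, written under the assumption that the descriptor is an in-memory JavaScript object. It is not the validation that es.modelValidate performs, and the helper name findUndefinedRequired is hypothetical.

```javascript
'use strict';

// Hypothetical helper: for each entity type, lists the names in "required"
// that are not defined in "properties". An empty result means this one
// rule is satisfied; it says nothing about other validation rules.
function findUndefinedRequired(descriptor) {
  const problems = {};
  const definitions = descriptor.definitions || {};
  for (const [typeName, typeDef] of Object.entries(definitions)) {
    const defined = new Set(Object.keys(typeDef.properties || {}));
    const missing = (typeDef.required || [])
      .filter((name) => !defined.has(name));
    if (missing.length > 0) {
      problems[typeName] = missing;
    }
  }
  return problems;
}

const badDescriptor = {
  info: { title: 'Example', version: '1.0.0' },
  definitions: {
    Person: {
      properties: {
        id: { datatype: 'int' },
        name: { datatype: 'string' }
      },
      required: ['id', 'UNDEF']
    }
  }
};

console.log(findUndefinedRequired(badDescriptor));
```

A lightweight check like this can give quick feedback while editing a descriptor, but it does not replace validating with Entity Services.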

Extending a Model with Additional Facts

You can extend your model with information and relationships that cannot be expressed in or derived from the model descriptor by storing additional semantic triples related to your model in MarkLogic.

You can use the model, entity type, and property IRIs generated by Entity Services to express these new facts. Entity Services uses the following patterns for constructing IRIs when generating RDF triple data about a model:

  • model IRI: baseUri/modelTitle-modelVersion
  • entity type IRI: modelIri/typeName
  • entity property IRI: entityTypeIri/propertyName

For example, suppose you have the following model descriptor:

{ "info": { 
    "title": "People", 
    "version": "1.0.0",
    "baseUri": "http://marklogic.com/example/"
  },
  "definitions": {
    "Person": {
      "properties": {
        "id": { "datatype": "int" },
        "name": { "datatype": "string" }
      }
} } }

Then the following IRIs are generated and used by Entity Services:

  • People model IRI: http://marklogic.com/example/People-1.0.0
  • Person entity type IRI: http://marklogic.com/example/People-1.0.0/Person
  • Person property name IRI: http://marklogic.com/example/People-1.0.0/Person/name
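The IRI patterns above can be applied to a whole descriptor with a few lines of plain JavaScript. This is a sketch for working out which IRIs to use in your own triples, not the code Entity Services uses to generate them. The helper name listModelIris is hypothetical, and the sketch assumes the base URI ends with a slash.

```javascript
'use strict';

// Hypothetical helper: lists the model, entity type, and property IRIs
// implied by the patterns above.
// Assumption: baseUri ends with "/"; the documented default is used otherwise.
function listModelIris(descriptor) {
  const info = descriptor.info;
  const base = info.baseUri || 'http://example.org/';
  const model = `${base}${info.title}-${info.version}`;
  const iris = { model, types: {}, properties: {} };
  for (const [typeName, typeDef] of Object.entries(descriptor.definitions)) {
    const typeIri = `${model}/${typeName}`;
    iris.types[typeName] = typeIri;
    for (const propName of Object.keys(typeDef.properties || {})) {
      // Keyed as "Type.property" purely for readability of the result.
      iris.properties[`${typeName}.${propName}`] = `${typeIri}/${propName}`;
    }
  }
  return iris;
}

const people = {
  info: {
    title: 'People',
    version: '1.0.0',
    baseUri: 'http://marklogic.com/example/'
  },
  definitions: {
    Person: {
      properties: {
        id: { datatype: 'int' },
        name: { datatype: 'string' }
      }
    }
  }
};

console.log(listModelIris(people));
```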

You can use any of MarkLogic's Semantic capabilities to add, manage, and query the triples you add to your model. For example, you can embed triples in your entity instance envelope documents or customize the TDE template that you can generate with Entity Services. You can also use the model IRI as a named graph IRI to integrate separate triples-based modeling with an Entity Services model.

For more information about using Semantics with MarkLogic, see the Semantics Developer's Guide.

Managing Model Changes

Some kinds of changes do not affect the structure and content of your instances. For example, if you decide to index a property that was not previously indexed or change a property from required to optional, your instances will not change.

However, changes such as the following typically impact the content in your instances, application code, and generated artifacts:

  • add or remove a property
  • change the data type of a property
  • make an optional property required
  • add or remove an entity type

Entity Services can help you update your application as your model evolves.

When integrating model changes, you must decide if all consumers of your instance data will move to the new model at the same time, or if you need to support both old and new models during some transition period. You must also choose how to generate instances based on your new model version.

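See the following topics for more details:

  • Generating Instances From the New Model
  • Replacing the Old Model with a New Version
  • Making Multiple Model Versions Available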

For an end-to-end example of updating a model version, see example-versions in the Entity Services examples on GitHub. For more details, see Exploring the Entity Services Open-Source Examples.

Generating Instances From the New Model

You can upgrade your instance data using one of the following strategies:

  • Re-extract instances from the original source using an instance converter generated from the new model.
  • Convert old version instances into new version instances using a version translator.

What you do with the instance data based on the new model depends on your version transition strategy. For details, see Replacing the Old Model with a New Version and Making Multiple Model Versions Available.

You should use a version translator if re-extraction is not practical. A version translator is also useful for creating in-memory instances of a different version to return to downstream consumers. For example, if you've advanced your content to v2 of your model, you could use a v2-to-v1 translator to synthesize v1 instances for v1 clients.

Both the instance converter and the version translator can be generated using the Entity Services API.

To re-extract instances from original source, generate, customize, and install an instance converter based on the new model, as described in Creating an Instance Converter Module. Send your raw source data through the converter, just as you did with the previous model version.

To use a version translator to generate new version instances from old ones, generate, customize, and install a version translator module from the old and new models as described in Creating a Model Version Translator Module. Then, use the translator to convert instance data from the old model to the new one.
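The idea behind a version translator can be sketched in plain JavaScript. This does not reproduce the module that Entity Services generates. It only illustrates copying shared properties between two versions of an entity type. The helper translateInstance, both Person definitions, and the email property are hypothetical.

```javascript
'use strict';

// Hypothetical helper: builds a target-version instance by copying the
// properties that the target entity type defines. Properties the target
// requires but the source lacks are reported rather than invented.
function translateInstance(source, targetTypeDef) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const instance = {};
  for (const propName of Object.keys(targetTypeDef.properties)) {
    if (has(source, propName)) {
      instance[propName] = source[propName];
    }
  }
  const missingRequired = (targetTypeDef.required || [])
    .filter((name) => !has(instance, name));
  return { instance, missingRequired };
}

// Hypothetical v1 and v2 definitions of a Person entity type.
const personV1 = {
  properties: { id: { datatype: 'int' }, name: { datatype: 'string' } },
  required: ['id', 'name']
};
const personV2 = {
  properties: {
    id: { datatype: 'int' },
    name: { datatype: 'string' },
    email: { datatype: 'string' }
  },
  required: ['id', 'name', 'email']
};

// Down-convert a v2 instance for a v1 client: "email" is dropped.
const down = translateInstance(
  { id: 1234, name: 'John Smith', email: 'js@example.org' }, personV1);
console.log(down.instance);

// Up-convert a v1 instance: "email" has no source value, so it is reported.
const up = translateInstance({ id: 1234, name: 'John Smith' }, personV2);
console.log(up.missingRequired);
```

Reporting missing required properties instead of filling them in mirrors the point at which you would customize a translator, for example by supplying a default value.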

The following diagram illustrates using a version translator to generate an envelope document containing an instance based on a new model version. You can also pass just an instance (rather than an envelope document) to the translator.

Replacing the Old Model with a New Version

If all consumers will immediately move to the new model, then you can do the following to update your model-based artifacts:

  • TDE template, query options, schema artifacts:
    • Generate a version based on the new model.
    • Apply your customizations, including merging in appropriate customizations from the old model.
    • Redeploy the artifacts.
  • Database configuration: If the new model adds or removes range indexes and word lexicons, you will need to generate a new configuration artifact, apply your customizations, and update your database configuration.
  • Instance converter:
    • Generate a converter based on the new model.
    • Apply your customizations, including merging in appropriate customizations from the old model.
    • Redeploy the module.
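  • Instance data: Re-extract your instances from the original source or convert them with a version translator, as described in Generating Instances From the New Model.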

Note that you might still be able to serve old version instances to clients by using a down-converting version translator to convert new instances to old ones during extraction. You can generate such a translator using Entity Services; for details, see Creating a Model Version Translator Module.

Making Multiple Model Versions Available

When maintaining multiple model versions, the procedures are similar to those described in Replacing the Old Model with a New Version, but you must consider how to manage multiple versions of your code and configuration artifacts, such as the following:

Instance Data

You must choose an approach to storing your updated instance data in the database. You might use one of the following approaches to managing versions:

  • Each envelope document contains either an old OR a new version of an instance.
  • Each envelope document contains both an old AND a new version of an instance.

In the first approach, the database contains envelope documents for instances based on both model versions, as shown in the following diagram:

In this case, putting the envelope documents in different collections based on version will make them easier to manage and search. You can also use the value of es:instance/es:info/es:version to distinguish between versions.

In the second approach, the database still contains only one set of envelope documents, but each envelope contains multiple instances, as shown in the following diagram:

You can use the value of es:instance/es:info/es:version to distinguish between versions during search and entity extraction. Your instance converter must be customized to store multiple instances in a single envelope.
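Both storage approaches can be sketched over in-memory JSON envelopes in plain JavaScript. The envelope shape used here (envelope.instance.info.version) is assumed to mirror the es:instance/es:info/es:version path, and storing an array of instances in one envelope is one possible customization, not a layout that Entity Services prescribes. The helper names are hypothetical.

```javascript
'use strict';

// Assumed envelope shape: { envelope: { instance: { info: { version } } } }.
// For the second approach, "instance" is assumed to be an array of such
// objects, one per model version.
function instanceVersions(doc) {
  const instance = doc.envelope.instance;
  const instances = Array.isArray(instance) ? instance : [instance];
  return instances.map((inst) => inst.info.version);
}

// Hypothetical helper: selects the envelopes carrying a given model version.
function envelopesWithVersion(docs, version) {
  return docs.filter((doc) => instanceVersions(doc).includes(version));
}

const docs = [
  // First approach: one instance per envelope.
  { envelope: { instance: {
    info: { title: 'Person', version: '1.0.0' }, Person: { id: 1 } } } },
  { envelope: { instance: {
    info: { title: 'Person', version: '2.0.0' }, Person: { id: 2 } } } },
  // Second approach: both versions in one envelope.
  { envelope: { instance: [
    { info: { title: 'Person', version: '1.0.0' }, Person: { id: 3 } },
    { info: { title: 'Person', version: '2.0.0' }, Person: { id: 3 } }
  ] } }
];

console.log(envelopesWithVersion(docs, '2.0.0').length); // 2
```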

Entity Type Schema

This topic refers to maintaining more than one version of the schemas generated by the es:schema-generate XQuery function or the es.schemaGenerate Server-Side JavaScript function.

It is usually best to avoid multiple schemas for the same type name. Schema validation is based on type name, so if you do not explicitly specify which schema to use for validation you won't know which schema is applied.

During explicit validation in XQuery, you can import a schema into your evaluation context. For example, if you have v1.0.0 and v2.0.0 schemas installed for a model that defines a Person entity type, then you could force validation against the v2.0.0 model by doing the following:

xquery version "1.0-ml";
import module namespace es = "http://marklogic.com/entity-services"
  at "/MarkLogic/entity-services/entity-services.xqy";
import schema default element namespace ""
  at "/es-gs/person-2.0.0.xsd";

xdmp:validate(
  es:instance-xml-from-document(
    fn:doc('/es-gs/envelopes/1234.xml')),
  'type', xs:QName('PersonType'))

For XML instance representations, you can add @schemaLocation to control which schema is applied. For more details, see Referencing Your Schema in the Application Developer's Guide.

TDE Template

The triples generated from a TDE template generated by Entity Services use a subject IRI that includes the model version. Therefore, there is no collision between the facts generated from each template version.

However, both templates will use the same row schema-name for the same entity types, which will cause row searches to return the union of the rows matched by both templates. To avoid this, you should give each entity type row schema a unique name.

Query Options

You can merge old and new version query options together, or keep them separate and use the version appropriate for the entity instance versions you're searching.

If you choose to keep multiple versions of canonical instances in a single envelope document, you should probably modify your query options to include version-related constraints or additional queries.

For example, you might want to add a version constraint based on es:envelope/es:instance/es:info/es:version.

Database Configuration

The database configuration is single-state. You can configure the union of range indexes and word lexicons defined by the two models.

You should usually not remove a range index or word lexicon required by the older model if you wish to continue supporting searches on that version. Also, if you define a range index or word lexicon for a property that exists in both model versions, you might see different search results against the old version entities because queries against the shared property can now be resolved out of the index.
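Configuring the union of the indexes from two model versions can be sketched as a merge of the generated property objects in plain JavaScript. This is an illustration under the assumption that both configurations are available as in-memory objects shaped like the generated database properties. The helper name unionIndexes and the age index are hypothetical.

```javascript
'use strict';

// Hypothetical helper: unions an index array (such as "range-path-index")
// from two generated database property objects. Entries from the old
// configuration come first, so nothing the old model needs is dropped.
function unionIndexes(oldProps, newProps, key) {
  const seen = new Set();
  const merged = [];
  for (const index of [...(oldProps[key] || []), ...(newProps[key] || [])]) {
    // Identity of an index: all of its settings, compared in sorted key order.
    const id = Object.keys(index).sort()
      .map((name) => `${name}=${index[name]}`)
      .join('|');
    if (!seen.has(id)) {
      seen.add(id);
      merged.push(index);
    }
  }
  return merged;
}

const ratingIndex = {
  'collation': 'http://marklogic.com/collation/en',
  'invalid-values': 'reject',
  'path-expression': '//es:instance/Person/rating',
  'range-value-positions': false,
  'scalar-type': 'float'
};
// Hypothetical index added by the newer model version.
const ageIndex = {
  'collation': 'http://marklogic.com/collation/en',
  'invalid-values': 'reject',
  'path-expression': '//es:instance/Person/age',
  'range-value-positions': false,
  'scalar-type': 'int'
};

const v1Props = { 'range-path-index': [ratingIndex] };
const v2Props = { 'range-path-index': [{ ...ratingIndex }, ageIndex] };

console.log(unionIndexes(v1Props, v2Props, 'range-path-index').length); // 2
```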

Model Descriptor Syntax Reference

This section provides a detailed description of the layout of a model descriptor, including syntax, component descriptions, and examples. A model descriptor has the following top level structure, where the info section contains model metadata, and the definitions section contains your entity type definitions. A model descriptor must define at least one entity type.

Format Syntax
JSON
{
  "info": model_info,
  "definitions": {
    entity_type_definition,
    ...
  }
}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>model_info</es:info>
  <es:definitions>
    entity_type_definition
    ...
  </es:definitions>
</es:model>

To explore the component parts of a model descriptor in more detail, see the following topics:

model_info

The info section of a model descriptor contains model metadata, such as a description or version.

Syntax Summary

The info section of a model descriptor has the following structure:

Format Syntax
JSON
{
  "title": string,
  "version": string,
  "baseUri": string,
  "description": string
}
XML
<es:info xmlns:es="http://marklogic.com/entity-services">
  <es:title>model title</es:title>
  <es:version>model version</es:version>
  <es:base-uri>absolute uri</es:base-uri>
  <es:description>model desc</es:description>
</es:info>

Component Description

The info section of a model descriptor can contain the following XML elements or JSON properties. Title and version are the only required items.

Property Name Description
title

Required. The title of this model descriptor. The title string must be a valid XQuery namespace prefix.

If you plan to generate a TDE template from the model, you should avoid using hyphens (-) in the title. Hyphens will be converted to underscores (_) in the TDE schema, view, and column names, in order to avoid invalid SQL names.

The title is used as the module namespace prefix when generating code from the model. If the first character of the title is upper case, it will be converted to lower case when used as a namespace prefix.

version
Required. The version of this model descriptor. Best practice is to use the semver format, such as 1.0.0; for details, see http://semver.org/. The version number of the model is considered the version number of all the entity types defined within the model.
baseUri (JSON)
base-uri (XML)
Optional. A valid absolute URI, usable to resolve RDF values in the descriptor. If this property is not present, http://example.org/ is used as the default URI. For details, see Controlling the Model IRI and Module Namespaces.
description
Optional. A description of this set of entity type definitions. This is purely informational metadata.
Examples

The following example contains an info section that uses all available properties. Only the title and version properties are required.

Format Example Model Descriptor
JSON
{ "info": { 
    "title": "Example", 
    "description": "ES Examples",
    "version": "1.0.0",
    "baseUri": "http://es-ex/examples"
  },
  "definitions": {
  "Person": {
    "properties": {
      "id": { "datatype": "int" },
      "name": { "datatype": "string" }
} } } }
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:description>ES Examples</es:description>
    <es:version>1.0.0</es:version>
    <es:base-uri>http://es-ex/examples</es:base-uri>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <id><es:datatype>int</es:datatype></id>
        <name><es:datatype>string</es:datatype></name>
      </es:properties>
    </Person>
  </es:definitions>
</es:model>

entity_type_definition

An entity type definition is a child of the definitions section of a model descriptor. A model descriptor must include at least one entity type definition, and may contain multiple entity type definitions.

Syntax Summary

An entity type definition has the following structure, where entityTypeName (in JSON) and entity-type-name (in XML) represent the user-defined entity type name, such as Person or Order. By convention, entity type names begin with a capital letter (Person, not person).

If you plan to generate a TDE template from the model, you should avoid using hyphens (-) in the entity type and entity property names. Hyphens will be converted to underscores (_) in the TDE schema, view, and column names, in order to avoid invalid SQL names.

Format Syntax
JSON
entityTypeName : {
  "properties": {
    propertyName: property_definition,
    ...
  },
  "required": [ string ],
  "primaryKey": string,
  "namespace": string,
  "namespacePrefix": string,
  "pii": [ string ],
  "pathRangeIndex": [ string ],
  "elementRangeIndex": [ string ],
  "rangeIndex": [ string ],
  "wordLexicon": [ string ],
  "description": string
}
XML
<entity-type-name
  xmlns:es="http://marklogic.com/entity-services">
  <es:properties>
    <property-name>
      property_definition
    </property-name>
    ...
  </es:properties>
  <es:required>property name</es:required>
  <es:primary-key>
    property name
  </es:primary-key>
  <es:namespace>namespace URI</es:namespace>
  <es:namespace-prefix>
    namespace prefix
  </es:namespace-prefix>
  <es:pii>property name</es:pii>
  <es:path-range-index>
    property name
  </es:path-range-index>
  <es:element-range-index>
    property name
  </es:element-range-index>
  <es:range-index>
    property name
  </es:range-index>
  <es:word-lexicon>
    property name
  </es:word-lexicon>
  <es:description>type desc</es:description>
</entity-type-name>

Component Description

An entity type definition can contain the following XML elements or JSON properties.

Property Name Description
properties
Optional. Zero or more entity property definitions. Each child JSON property or XML element name is the name of a property of the entity type. In XML, the element name must not be namespace qualified. For more details, see Writing a Model Descriptor.
description
Optional. A description of this entity type.
required
Optional. Specify the names of entity properties that must be in every instance of this entity type. In XML, include multiple required elements to specify multiple required property names. Each named entity property must match the name of an entity property defined in the properties section of this entity type definition. Any entity properties not tagged as required are treated as optional. For more details, see Distinguishing Required and Optional Entity Properties.
primaryKey (JSON)
primary-key (XML)
Optional. The name of an entity property to use as a primary key when generating artifacts such as an extraction template. The value must match the name of an entity property defined in the properties section of this entity type definition. There can be at most one primary key in an entity type definition. The primary key property is implicitly also a required property. For more details, see Identifying the Primary Key Entity Property.
namespace
Optional. A namespace URI with which to qualify canonical XML entity instances of this type. If you include a namespace URI, you must also define a namespace prefix using the namespace-prefix XML element or namespacePrefix JSON property. The namespace is also used in generated database configuration and query options artifacts. For details and restrictions, see Defining a Namespace URI for an Entity Type.
namespacePrefix (JSON)
namespace-prefix (XML)
Optional. A namespace prefix to bind to the XML namespace defined by the namespace XML element or JSON property. You must define a prefix if you define a namespace. For details and restrictions, see Defining a Namespace URI for an Entity Type.
pii
Optional. The name(s) of entity properties that can contain Personally Identifiable Information (PII). You can generate an Element Level Security (ELS) configuration to more tightly restrict access to PII properties than access to other instance properties. For details, see Identifying Personally Identifiable Information (PII). In XML, include multiple pii elements to specify multiple properties.
pathRangeIndex (JSON)
path-range-index (XML)
Optional. The name(s) of entity properties that should be backed by a path range index. This affects the database configuration and query options you can generate from a model. Each named property must match the name of an entity property defined in the properties section of this entity type definition. In XML, include multiple path-range-index elements to specify multiple properties. For more details, see Identifying Entity Properties for Indexing.
elementRangeIndex (JSON)
element-range-index (XML)
Optional. The name(s) of entity properties that should be backed by an element range index. This affects the database configuration and query options you can generate from a model. Each named property must match the name of an entity property defined in the properties section of this entity type definition. In XML, include multiple element-range-index elements to specify multiple properties. For more details, see Identifying Entity Properties for Indexing.
rangeIndex (JSON)
range-index (XML)
Optional. Deprecated. Equivalent to the pathRangeIndex property in a JSON descriptor, or the path-range-index element in an XML descriptor.
wordLexicon (JSON)
word-lexicon (XML)
Optional. The name(s) of entity properties that should be backed by a word lexicon. This affects the database configuration and query options you can generate from a model. Each named property must match the name of an entity property defined in the properties section of this entity type definition. In XML, include multiple word-lexicon elements to specify multiple properties. For details, see Identifying Entity Properties for Indexing.
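Several of the rules above can be checked locally before you persist a descriptor. The following Python function is a minimal sketch of such a pre-check for a JSON entity type definition. It is not part of the Entity Services API; the function name check_entity_type and the message wording are hypothetical. It covers only the rules stated in the table above: listed names must match defined entity properties, and a namespace requires a namespace prefix.

```python
# Hypothetical local pre-check; this is not an Entity Services API function.
# Keys whose values are lists of entity property names that must be defined.
NAME_LIST_KEYS = ("required", "pathRangeIndex", "elementRangeIndex", "wordLexicon")


def check_entity_type(type_def):
    """Return a list of problems found in one JSON entity type definition."""
    problems = []
    prop_names = set(type_def.get("properties", {}))

    # Each listed name must match a property defined in "properties".
    for key in NAME_LIST_KEYS:
        for name in type_def.get(key, []):
            if name not in prop_names:
                problems.append(f"{key}: unknown property '{name}'")

    # primaryKey holds a single name, which must also be a defined property.
    primary_key = type_def.get("primaryKey")
    if primary_key is not None and primary_key not in prop_names:
        problems.append(f"primaryKey: unknown property '{primary_key}'")

    # A namespace URI requires a namespace prefix.
    if "namespace" in type_def and "namespacePrefix" not in type_def:
        problems.append("namespace is defined without namespacePrefix")

    return problems


person = {
    "properties": {
        "id": {"datatype": "int"},
        "name": {"datatype": "string"},
        "bio": {"datatype": "string"},
        "rating": {"datatype": "float"},
    },
    "required": ["id", "name"],
    "primaryKey": "id",
    "pathRangeIndex": ["id", "rating"],
    "wordLexicon": ["bio"],
    "namespace": "http://example.org/es/gs",
    "namespacePrefix": "esgs",
}

print(check_entity_type(person))  # prints [] because no rule is violated
```

A check like this does not replace model validation in MarkLogic. It only catches naming mistakes early during development.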
Examples

The following example defines a Person entity type that contains entity properties named id, name, bio, and rating. The id and name properties are required. The id entity property is the primary key. The id and rating properties are backed by a path range index, and the bio property is backed by a word lexicon. The name property is tagged as PII. The entity type also defines a namespace URI and a namespace prefix.

Format Example Model Descriptor
JSON
{ "info": { 
    "title": "Example", 
    "description": "ES Examples",
    "version": "1.0.0"
  },
  "definitions": {
    "Person": {
      "properties": {
        "id": { "datatype": "int" },
        "name": { "datatype": "string" },
        "bio": { "datatype": "string" },
        "rating": { "datatype": "float" }
      },
      "required": ["id", "name"],
      "primaryKey": "id",
      "pii": ["name"],
      "pathRangeIndex": ["id", "rating"],
      "wordLexicon": ["bio"],
      "namespace": "http://example.org/es/gs",
      "namespacePrefix": "esgs"
}}}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:description>ES Examples</es:description>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <id><es:datatype>int</es:datatype></id>
        <name><es:datatype>string</es:datatype></name>
        <bio><es:datatype>string</es:datatype></bio>
        <rating><es:datatype>float</es:datatype></rating>
      </es:properties>
      <es:required>id</es:required>
      <es:required>name</es:required>
      <es:primary-key>id</es:primary-key>
      <es:pii>name</es:pii>
      <es:path-range-index>id</es:path-range-index>
      <es:path-range-index>rating</es:path-range-index>
      <es:word-lexicon>bio</es:word-lexicon>
      <es:namespace>http://example.org/es/gs</es:namespace>
      <es:namespace-prefix>esgs</es:namespace-prefix>
    </Person>
  </es:definitions>
</es:model>
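To make the required and optional rules concrete, the following Python sketch reads a trimmed copy of the JSON example and separates required entity properties from optional ones. The helper name split_required is hypothetical. The sketch assumes only what the component description states: properties not listed in required are optional, and the primary key is implicitly required.

```python
import json

# Trimmed copy of the JSON example descriptor, for illustration only.
descriptor = json.loads("""
{ "info": { "title": "Example", "version": "1.0.0" },
  "definitions": {
    "Person": {
      "properties": {
        "id": { "datatype": "int" },
        "name": { "datatype": "string" },
        "bio": { "datatype": "string" },
        "rating": { "datatype": "float" }
      },
      "required": ["id", "name"],
      "primaryKey": "id"
    }
  }
}
""")


def split_required(type_def):
    """Return (required, optional) property names as sorted lists."""
    required = set(type_def.get("required", []))
    # The primary key property is implicitly required.
    primary_key = type_def.get("primaryKey")
    if primary_key is not None:
        required.add(primary_key)
    # Everything else defined in "properties" is optional.
    optional = set(type_def["properties"]) - required
    return sorted(required), sorted(optional)


required, optional = split_required(descriptor["definitions"]["Person"])
print(required)  # ['id', 'name']
print(optional)  # ['bio', 'rating']
```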

property_definition

An entity property definition is a child of the entity_type_definition section of a model descriptor. Each entity type must include at least one entity property definition.

Syntax Summary

An entity property definition can have one of the following forms. Entity property definitions are used in the properties child of an entity_type_definition.

JSON
{
  "datatype" : "string",
  "collation": string,
  "description": string
}

{
  "datatype" : "array",
  "items": property_definition,
  "description": string
}

{
  "datatype" : property_type,
  "description": string
}

{
  "$ref": string,
  "description": string
}
XML
<!-- string-valued entity property -->
<es:datatype>string</es:datatype>
<es:collation>
  collationUri
</es:collation>
<es:description>desc</es:description>

<!-- array/list-valued property -->
<es:datatype>array</es:datatype>
<es:items>property_definition</es:items>

<!-- prop of any other type -->
<es:datatype>property_type</es:datatype>
<es:description>desc</es:description>

<!-- ref to another entity type -->
<es:ref>type path ref</es:ref>
<es:description>desc</es:description>
Component Description

This portion of a model descriptor can contain the following XML elements or JSON properties. An entity property definition must include either a datatype or ref JSON property or XML element, but not both.

Property Name Description
datatype
Required if $ref (JSON) or es:ref (XML) is not present. The data type of values in this entity property. The value must be one of the types listed in property_type. The datatype can affect which other JSON properties or XML elements the definition can include. For example, a datatype of string allows the definition to include a collation URI.
$ref (JSON)
ref (XML)
Required if datatype is not present. A reference to another entity type, in the form of either a relative path to an entity type defined in this model descriptor or an absolute IRI. The value must end in a simple type name so that it can be treated as a type name during code generation. For details, see Defining Entity Relationships and Defining an Entity Property with a Complex Type.
collation
Optional. Only usable when the value of datatype is string. The collation to use when generating index/lexicon configuration and query options. If you do not specify a collation, then it defaults to http://marklogic.com/collation/en.
items
Required when the value of datatype is array. The type definition for the array items. The value is itself a property_definition. For details, see Defining an Entity Property with Array Type.
description
Optional. A description of this entity property.
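The rules in the table above can be expressed as a small recursive check over a JSON entity property definition. The following Python function is a sketch only. The name check_property and the message text are hypothetical, and the function covers just the rules stated here: exactly one of datatype or $ref, collation only with string, and items required for array.

```python
# Hypothetical local pre-check; this is not an Entity Services API function.
def check_property(prop_def):
    """Return a list of problems found in one JSON entity property definition."""
    problems = []

    # A definition must include either datatype or $ref, but not both.
    has_datatype = "datatype" in prop_def
    has_ref = "$ref" in prop_def
    if has_datatype == has_ref:
        problems.append("specify exactly one of datatype or $ref")

    datatype = prop_def.get("datatype")

    # A collation is only usable when the datatype is string.
    if "collation" in prop_def and datatype != "string":
        problems.append("collation is only usable with datatype string")

    # An array needs an items definition, which is checked the same way.
    if datatype == "array":
        if "items" not in prop_def:
            problems.append("array property requires items")
        else:
            problems.extend(check_property(prop_def["items"]))

    return problems


friend = {"datatype": "array", "items": {"$ref": "#/definitions/Person"}}
print(check_property(friend))                 # []
print(check_property({"datatype": "array"}))  # ['array property requires items']
```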
Examples

The following example defines a Person entity type with three entity properties: an id of type int, a name of type string whose definition includes a collation, and a friend entity property of type array. Each item in the friend array is a reference to a Person entity.

Format Example Model Descriptor
JSON
{ "info": { 
    "title": "Example", 
    "description": "ES Examples",
    "version": "1.0.0"
  },
  "definitions": {
    "Person": {
      "properties": {
        "id": { "datatype": "int" },
        "name": { 
          "datatype": "string", 
          "collation": "http://marklogic.com/collation/"
        },
        "friend": {
          "datatype" : "array",
          "items": { "$ref" : "#/definitions/Person" }
        }
}}}}
XML
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Example</es:title>
    <es:description>ES Examples</es:description>
    <es:version>1.0.0</es:version>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <id><es:datatype>int</es:datatype></id>
        <name>
          <es:datatype>string</es:datatype>
          <es:collation>http://marklogic.com/collation/</es:collation>
        </name>
        <friend>
          <es:datatype>array</es:datatype>
          <es:items>
            <es:ref>#/definitions/Person</es:ref>
          </es:items>
        </friend>
      </es:properties>
    </Person>
  </es:definitions>
</es:model>
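Because a reference must end in a simple type name, the target of the friend items reference in this example can be recovered from the reference string. The following Python sketch assumes the type name is whatever follows the last "/" or "#" in the reference. That rule, and the helper name referenced_type_name, are illustrative assumptions rather than documented Entity Services behavior.

```python
import re


def referenced_type_name(ref):
    """Return the trailing simple name of a local path or absolute IRI.

    Assumption: the type name is the text after the last '/' or '#'.
    """
    return re.split(r"[/#]", ref)[-1]


# Trimmed copy of the definitions section from the JSON example.
definitions = {
    "Person": {
        "properties": {
            "id": {"datatype": "int"},
            "friend": {
                "datatype": "array",
                "items": {"$ref": "#/definitions/Person"},
            },
        }
    }
}

friend = definitions["Person"]["properties"]["friend"]
target = referenced_type_name(friend["items"]["$ref"])
print(target, target in definitions)  # Person True
```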

property_type

This section defines the type names that can be specified in the datatype JSON property or XML element of an entity property definition. With the exception of iri and array, these types correspond to the XML Schema Definition Language (XSD) types of the same name; for details, see https://www.w3.org/TR/xmlschema11-2/#built-in-datatypes.

iri
duration
negativeInteger
array
float
nonNegativeInteger
anyURI
gDay
nonPositiveInteger
base64Binary
gMonth
short
boolean
gMonthDay
string
byte
gYear
time
date
gYearMonth
unsignedByte
dateTime
hexBinary
unsignedInt
dayTimeDuration
int
unsignedLong
decimal
integer
unsignedShort
double
long
yearMonthDuration
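The list above can be captured as a set for a quick membership test while authoring a descriptor. This Python sketch is a convenience only. The names PROPERTY_TYPES and is_property_type are hypothetical, and the set simply transcribes the list in this section.

```python
# Type names transcribed from the property_type list in this section.
PROPERTY_TYPES = frozenset({
    "iri", "array", "anyURI", "base64Binary", "boolean", "byte", "date",
    "dateTime", "dayTimeDuration", "decimal", "double", "duration", "float",
    "gDay", "gMonth", "gMonthDay", "gYear", "gYearMonth", "hexBinary", "int",
    "integer", "long", "negativeInteger", "nonNegativeInteger",
    "nonPositiveInteger", "short", "string", "time", "unsignedByte",
    "unsignedInt", "unsignedLong", "unsignedShort", "yearMonthDuration",
})


def is_property_type(name):
    """Return True if name is one of the listed datatype values (case-sensitive)."""
    return name in PROPERTY_TYPES


print(is_property_type("dateTime"))  # True
print(is_property_type("datetime"))  # False, because the comparison is case-sensitive
```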

Not all these datatypes are usable as range index or word lexicon types. If you specify an entity property with an incompatible type in the range index or word lexicon specification of an entity type definition, then the resulting model will not validate.

An array-typed entity property contains an item type definition that also uses this type list. For details, see property_definition and Defining an Entity Property with Array Type.

Some types are folded into a compatible super-type when defining range indexes. For example, entity properties of type iri are indexed as string, and entity properties of type byte or short are indexed as int. Some data types cannot be used for index or word lexicon configuration.
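The folding described above can be pictured as a lookup from the declared datatype to the type used for the index. The following Python sketch includes only the three foldings named in this section. The names INDEX_TYPE_FOLDING and index_type are hypothetical. The sketch does not model any other foldings MarkLogic may apply or which types are unusable for indexing, so a pass-through result does not mean a type is indexable.

```python
# Only the foldings named in this section; any others are not modeled here.
INDEX_TYPE_FOLDING = {
    "iri": "string",
    "byte": "int",
    "short": "int",
}


def index_type(datatype):
    """Return the folded index type, or the datatype itself if no folding is listed.

    A pass-through result does NOT imply that the type is indexable.
    """
    return INDEX_TYPE_FOLDING.get(datatype, datatype)


print(index_type("iri"))    # string
print(index_type("short"))  # int
print(index_type("float"))  # float
```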

