This chapter covers entity model description management tasks. A model descriptor defines entity types, their properties, and relationships between entities. The following topics are covered:
A fully constructed model consists of a set of facts about the modeled entity types, their properties, and the relationships between them. The facts are represented in MarkLogic as semantic triples.
The entity types, properties, and relationships are defined by an XML or JSON model descriptor. When you persist the descriptor in the database in the prescribed way, MarkLogic automatically creates the model by generating facts about the model, expressed as triples. You can also add your own facts (triples) to the model.
The following diagram depicts the building blocks of an entity model in MarkLogic:
Building a model involves the following steps:
The following diagram is a pictorial representation of this process.
Once you have a valid descriptor or model, you can use the Entity Services API to generate code and other artifacts that provide a foundation for creating an application based on your model. You can use the API to create the following:
For more details, see Generating Code and Other Artifacts.
This section describes how to define a model descriptor containing entity type definitions and model metadata. This section includes the following topics:
A model descriptor is an XML element or JSON object that defines one or more entity types, model metadata, and relationships between entity types. You can generate code templates and configuration artifacts from the descriptor in the form of either a JSON object-node or a json:object (a special kind of map:map).
A model descriptor has two parts. The info section contains model metadata, such as a title and version. The definitions section contains entity type definitions, including entity properties and relationships, plus type-specific metadata.
A descriptor must define at least one entity type and can define multiple entity types. Each type definition can include additional metadata to guide code and artifact generation. For details, see Entity Type Definition Overview.
The entity type property names in your model should be unique, even across entity types, to avoid name collisions in generated code and artifacts.
The natural representation for a model descriptor is JSON because it already matches the internal representation of the model. When you use an XML model descriptor, you must call one of the following functions to translate your descriptor into a form usable with Entity Services functions that accept a model as input.
For more details, see Working With an XML Model Descriptor.
You might find it useful to generate test entity instances during model development so you can see a concrete example of the default entities produced by your model. For details, see Generating Test Entity Instances.
The following is an example of the simplest possible model descriptor. The descriptor must contain at least title and version metadata in the info section, and define at least one entity type with at least one property in the definitions section. In this example, the model named Example defines an entity type named Person. A Person entity has an id property.
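The minimum requirements described above can be sketched as a plain JavaScript object. The version string and the string datatype chosen for id are assumptions for illustration, and hasMinimumParts is a hypothetical helper, not an Entity Services function.

```javascript
// Minimal descriptor: title and version in info, plus one entity type
// with one property in definitions.
// Assumption: the version value and the datatype of id are illustrative.
const minimalDescriptor = {
  info: { title: "Example", version: "1.0.0" },
  definitions: {
    Person: {
      properties: {
        id: { datatype: "string" }
      }
    }
  }
};

// Hypothetical helper: checks only the minimum parts named in the text.
function hasMinimumParts(descriptor) {
  const info = descriptor.info || {};
  const types = Object.values(descriptor.definitions || {});
  return Boolean(info.title) && Boolean(info.version) &&
    types.some(t => Object.keys(t.properties || {}).length > 0);
}
```

This check is far weaker than real model validation; it only mirrors the sentence above.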
An entity type definition usually includes one or more entity property definitions and can include type metadata such as a primary key specification. This section provides a brief overview of defining an entity type. For syntax details, see entity_type_definition.
All property definitions must include either a data type or a reference to another entity type. The data type of a property can be string, array, iri, or one of several XSD types. Depending on the data type, a property definition may require additional information. For details, see Writing a Model Descriptor and property_definition.
The data type of an entity property can be any of the following:
Depending on the type, the property definition can include additional information. For example, when the datatype is string, you can specify a collation. For syntax details, see property_definition.
An entity type definition can include the following type-specific metadata that is used when generating code and configuration artifacts:
Property names should be unique across all the entity types in a model. Duplicate property names can lead to name collisions in generated code and artifacts, causing some code and configuration to be generated commented out.
For example, the following model descriptor defines a Person entity with two required entity properties (id and name) and two optional entity properties (address and rating). The id property is the primary key. In addition, the descriptor specifies that a path range index configuration and query options should be generated for the rating property.
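A descriptor of the shape just described might look like the following sketch, written as a JavaScript object. The datatypes, the model title, and the version are assumptions. The id property is named only as the primary key, which this chapter describes as implicitly required. The optionalProperties helper is hypothetical and exists only to make the required/optional split visible.

```javascript
// Sketch of a Person entity type with a primary key, a required list,
// and a path range index specification.
// Assumption: all datatypes, the title, and the version are illustrative.
const personDescriptor = {
  info: { title: "Example", version: "1.0.0" },
  definitions: {
    Person: {
      properties: {
        id: { datatype: "int" },
        name: { datatype: "string" },
        address: { datatype: "string" },
        rating: { datatype: "float" }
      },
      primaryKey: "id",
      required: ["name"],
      pathRangeIndex: ["rating"]
    }
  }
};

// Hypothetical helper: properties that are neither listed as required
// nor designated as the primary key.
function optionalProperties(typeDef) {
  const required = new Set(typeDef.required || []);
  if (typeDef.primaryKey) required.add(typeDef.primaryKey);
  return Object.keys(typeDef.properties).filter(name => !required.has(name));
}
```

Calling optionalProperties(personDescriptor.definitions.Person) yields the address and rating properties.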
To define an entity type property with a simple type such as string, integer, or date, specify the type name as the value of the datatype JSON property or XML element. For a complete list of supported type names, see property_type.
Not all the supported data types are usable as range index or word lexicon types. If you specify an entity property with an incompatible type in the range index or word lexicon specification of an entity type definition, then the resulting model will not validate.
For example, the following entity type definition contains entity properties with four different simple types.
If the type name is string, then you can optionally include a collation URI to be used when generating index, lexicon, and query option configuration artifacts from the model. If you omit the collation for a string-typed entity property, the collation defaults to http://marklogic.com/collation/en.
The following example demonstrates including a collation in an entity property definition.
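The defaulting rule for collations can be sketched as follows. The collation property name and the S1 collation URI are assumptions for illustration, and effectiveCollation is a hypothetical helper rather than part of the Entity Services API.

```javascript
// Default collation stated in the text above.
const DEFAULT_COLLATION = "http://marklogic.com/collation/en";

// Assumption: the collation is carried in a "collation" property of the
// property definition; the URI below is an illustrative value.
const lastName = {
  datatype: "string",
  collation: "http://marklogic.com/collation/en/S1"
};

// Hypothetical helper: returns the collation that would apply to a
// property definition, or null for non-string properties.
function effectiveCollation(propertyDef) {
  if (propertyDef.datatype !== "string") return null;
  return propertyDef.collation || DEFAULT_COLLATION;
}
```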
To specify an entity property whose type is complex, such as an object type, define the complex type as an entity type and use an entity type reference in the property definition.
For example, suppose a Person entity type contains a name property, and that name should have entity properties first, middle, and last. You could model a name as an entity type and then reference it in the definition of Person similar to the following:
JSON: "name": { "$ref": "#/definitions/Name" }
XML: <name><es:ref>#/definitions/Name</es:ref></name>
You can reference entity types defined in the same model (a local reference) or externally. For more details and examples, see Defining Entity Relationships.
To specify an entity property whose type is a list of values, specify array in the datatype JSON property or XML element of the property definition, and then include an items type definition that specifies the data type of the list items. For a list of supported item type names, see property_type.
You cannot use an entity property with array type as a primary key or for generating database configuration artifacts such as range index or word lexicon configuration.
For example, the following entity type definition defines an entity property named orders whose value is an array of values of type integer.
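An array-typed property of this kind might be written as in the following sketch. The surrounding type, its id property, and the primaryKeyIsUsable helper are hypothetical; the helper only illustrates the rule that an array-typed property cannot serve as a primary key.

```javascript
// Sketch of an entity type with an array-typed property.
// Assumption: the id property and its datatype are illustrative.
const customerType = {
  properties: {
    id: { datatype: "string" },
    orders: { datatype: "array", items: { datatype: "integer" } }
  },
  primaryKey: "id"
};

// Hypothetical helper: the primary key must name a defined property
// that does not have array type.
function primaryKeyIsUsable(typeDef) {
  const key = typeDef.primaryKey;
  if (key === undefined) return true; // a primary key is optional
  const def = typeDef.properties[key];
  return def !== undefined && def.datatype !== "array";
}
```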
For more details, see property_definition.
To model the type of an entity property as an IRI (Internationalized Resource Identifier), specify iri in the datatype JSON property or XML element of the property definition. IRI-typed entity properties can be useful for working with entities using SPARQL.
The value of a property with IRI type must be a string that represents a sem:iri value. The value is opaque to the Entity Services API.
For example, the following entity type definition contains an entity property name with IRI data type.
For more details about creating Semantic applications in MarkLogic, see the Semantics Developer's Guide.
An entity type definition can designate one entity property as a primary key that uniquely identifies each instance of that type.
The primary key is used in the following ways:
An entity type definition can contain at most one primary key. If you generate a schema from the model, the primary key entity property has its cardinality set to exactly 1; for details, see Generating an Entity Instance Schema.
To specify a primary key, include a primaryKey JSON property or primary-key XML element in the entity type definition. The value must be the name of an entity property defined in this type definition. The primary key entity property cannot have array type.
For example, the following definition of a Person entity defines the id entity property as a primary key:
Security policies often require strict access controls for Personally Identifiable Information (PII), such as a telephone number, address, or social security number. Entity Services enables you to tag entity properties as containing PII, and subsequently generate special security configuration to control access to PII data in your entity instances. For more details, see Generating a PII Security Configuration Artifact.
The following example entity type definition flags the name and address entity properties as PII.
By default, all entity properties defined in an entity type are optional. You can identify required properties by including their names in the required section of your entity type definition. The entity properties named in the required section must be defined in the containing entity type.
An entity property specified as a primary key is implicitly required, so you should not also include it in the explicit list of required properties.
When you validate an instance against the schema generated for an instance type, validation fails if the instance does not include at least one occurrence of a required entity property. Similarly, when you generate a TDE template for an instance type, required entity properties are not considered nullable.
The following example entity type definition defines 3 entity properties: id, name, and address. The id and name properties are required. The address entity property is optional.
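The effect of the required section can be sketched with a small check. The datatypes are assumptions, and missingRequired is a hypothetical helper that imitates only the "at least one occurrence" rule; it is not how Entity Services schema validation is implemented.

```javascript
// Sketch of an entity type with two required properties.
// Assumption: the datatypes are illustrative.
const personType = {
  properties: {
    id: { datatype: "string" },
    name: { datatype: "string" },
    address: { datatype: "string" }
  },
  required: ["id", "name"]
};

// Hypothetical helper: names of required properties absent from an instance.
function missingRequired(typeDef, instance) {
  return (typeDef.required || []).filter(name => !(name in instance));
}
```

An instance that carries only id and address would be reported as missing name, while address can be omitted freely.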
By default, the elements of an XML entity instance are in no namespace. If you include a namespace URI and prefix in your model, then the element names in your entity instances are qualified by that namespace, as long as you use an XML representation for your envelope documents.
Use of entity type namespaces is optional. If you choose to use a namespace, you must specify both a namespace URI and a prefix in your entity type definition.
In an XML model descriptor, use the following format to define a namespace URI and prefix:
<es:namespace>namespaceURI</es:namespace>
<es:namespace-prefix>prefix</es:namespace-prefix>
In a JSON model descriptor, use the following format to define a namespace URI and prefix:
"namespace": "namespaceURI",
"namespacePrefix": "prefix"
The following restrictions apply to defining a namespace prefix binding. Any model that violates these restrictions will fail validation.
If you define a namespace for an entity type, the Entity Services API uses it when creating XML envelope documents, extracting instances from XML envelopes, and generating model artifacts such as schemas, query options, and TDE templates.
The namespace is discarded when generating JSON envelope documents or extracting an instance from an envelope document as JSON. This means that generated code, query options, and TDE templates based on the model will include XPath expressions that will not match your JSON envelopes or instances without modification.
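The rule that a namespace URI and prefix must be specified together can be sketched as a simple check. The namespaceSpecIsComplete helper is hypothetical, the sample values are illustrative, and the check covers only this one rule, not the other restrictions that model validation enforces.

```javascript
// Hypothetical helper: an entity type may omit the namespace entirely,
// but if it specifies a URI or a prefix it must specify both.
function namespaceSpecIsComplete(typeDef) {
  const hasUri = typeof typeDef.namespace === "string" &&
    typeDef.namespace.length > 0;
  const hasPrefix = typeof typeDef.namespacePrefix === "string" &&
    typeDef.namespacePrefix.length > 0;
  return hasUri === hasPrefix;
}

// Illustrative entity type that binds a prefix to a namespace URI.
const namespacedType = {
  properties: { id: { datatype: "string" } },
  namespace: "http://example.org/es/gs",
  namespacePrefix: "esgs"
};
```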
For example, the following model descriptor specifies that Person entities should be in the namespace http://example.org/es/gs and bind that namespace URI to the prefix esgs:
<es:model xmlns:es="http://marklogic.com/entity-services">
  <es:info>
    <es:title>Person</es:title>
    <es:version>1.1.0</es:version>
    <es:base-uri>http://example.org/example-person/</es:base-uri>
  </es:info>
  <es:definitions>
    <Person>
      <es:properties>
        <id><es:datatype>string</es:datatype></id>
        <firstName><es:datatype>string</es:datatype></firstName>
        <lastName><es:datatype>string</es:datatype></lastName>
        <fullName><es:datatype>string</es:datatype></fullName>
        <friends>
          <es:datatype>array</es:datatype>
          <es:items><es:ref>#/definitions/Person</es:ref></es:items>
        </friends>
      </es:properties>
      <es:namespace>http://example.org/es/gs</es:namespace>
      <es:namespace-prefix>esgs</es:namespace-prefix>
      <es:primary-key>id</es:primary-key>
      <es:required>firstName</es:required>
      <es:required>lastName</es:required>
      <es:required>fullName</es:required>
    </Person>
  </es:definitions>
</es:model>
The following table illustrates how the envelope documents change, based on whether or not the model defines an entity type namespace.
If you call es:instance-xml-from-document or es.instanceXmlFromDocument on an XML envelope document for an entity type that uses namespaces, the returned instance includes the namespace.
For example, the following instance is extracted from the envelope document shown in the table above. Notice that it uses the esgs namespace.
<esgs:Person xmlns:es="http://marklogic.com/entity-services"
    xmlns:esgs="http://example.org/es/gs">
  <esgs:id>1234</esgs:id>
  <esgs:firstName>George</esgs:firstName>
  <esgs:lastName>Washington</esgs:lastName>
  <esgs:fullName>George Washington</esgs:fullName>
</esgs:Person>
The namespace is not preserved when you use JSON envelopes or when you generate a JSON instance from an XML or JSON envelope.
Searchable entity properties should usually be backed by an index or lexicon.
A model descriptor can contain optional range index and word lexicon sections that indicate which entity properties should have an associated range index or word lexicon and search constraint definition. This specification affects generated artifacts such as query options and database configuration.
For more details, see the following topics:
A range index enables range queries over an entity property, such as match all inventory item instances with a price property greater than 5. Range indexes and word lexicons also enable search application features such as faceting and search term suggestions.
The Entity Services modeling language enables you to specify entity type properties that should be backed by an element range index, path range index, or word lexicon. (The element range index specification is applicable to both XML elements and JSON properties.)
To indicate that a property should be backed by a range index, include the following components in your model descriptor:
In JSON, the value of pathRangeIndex and elementRangeIndex is an array of entity property names. In XML, define multiple path-range-index or element-range-index elements to tag multiple properties. For example:
JSON: "pathRangeIndex": ["price", "rating"]
XML: <es:path-range-index>price</es:path-range-index>
     <es:path-range-index>rating</es:path-range-index>
Note that an element range index is applicable to both XML elements and JSON properties, so your choice of index type is not limited by the representation of your entity instances. For details, see Creating Indexes and Lexicons Over JSON Documents in the Application Developer's Guide.
To specify properties to be backed by a word lexicon, include a wordLexicon JSON property or word-lexicon XML element in your model descriptor. In JSON, the value of wordLexicon is an array of property names. In XML, define multiple word-lexicon elements to tag multiple properties. The syntax is analogous to the range index example, above.
The properties named in a range index or word lexicon specification must be defined in the containing entity type definition and must conform to certain data type restrictions. For data type details, see Supported Datatypes.
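The requirement that indexed property names be defined in the containing entity type can be sketched as follows. The undefinedIndexedProperties helper and the sample Item type are hypothetical; the helper does not check data type restrictions.

```javascript
// Hypothetical helper: reports names used in an index or lexicon section
// that are not defined as properties of the same entity type.
function undefinedIndexedProperties(typeDef) {
  const defined = new Set(Object.keys(typeDef.properties || {}));
  const sections = ["pathRangeIndex", "elementRangeIndex", "wordLexicon"];
  const problems = [];
  for (const section of sections) {
    for (const name of typeDef[section] || []) {
      if (!defined.has(name)) problems.push(`${section}: ${name}`);
    }
  }
  return problems;
}

// Illustrative type: bio is named in wordLexicon but never defined.
const itemType = {
  properties: {
    price: { datatype: "decimal" },
    rating: { datatype: "float" }
  },
  pathRangeIndex: ["price", "rating"],
  wordLexicon: ["bio"]
};
```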
For a complete example, see Example: Identifying Indexable Entity Properties.
Specifying the name of an entity property in the range index section has the following implications:
Specifying the name of an entity property in the word lexicon section has the following implications:
If your model specifies a namespace binding for an entity type and you use JSON envelopes, the namespace is discarded in the JSON representation, but the generated index configuration still assumes a namespace, so the index configuration will not match your JSON data. You should usually use XML envelopes when you include a namespace specifier in your model.
For more details, see Generating a Database Configuration Artifact and Generating Query Options for Searching Instances.
The following example descriptors specify a path range index on the rating entity property and a word lexicon on the bio entity property of a Person entity type.
If you generate database properties from the resulting model (using es:database-properties-generate or es.databasePropertiesGenerate), then the generated database configuration properties include the following details:
{
  "database-name": "%%DATABASE%%",
  ...,
  "element-word-lexicon": [{
    "collation": "http://marklogic.com/collation/en",
    "localname": "bio",
    "namespace-uri": ""
  }],
  "range-path-index": [{
    "collation": "http://marklogic.com/collation/en",
    "invalid-values": "reject",
    "path-expression": "//es:instance/Person/rating",
    "range-value-positions": false,
    "scalar-type": "float"
  }],
  ...
}
If you generate query options from the resulting model (using es:search-options-generate or es.searchOptionsGenerate), then the generated options include the following constraint definitions:
<search:options xmlns:search="http://marklogic.com/appservices/search">
  ...
  <search:constraint name="rating">
    <search:range type="xs:float" facet="true">
      <search:path-index xmlns:es=...>
        //es:instance/Person/rating
      </search:path-index>
    </search:range>
  </search:constraint>
  <search:constraint name="bio">
    <search:word>
      <search:element ns="" name="bio"/>
    </search:word>
  </search:constraint>
  ...
</search:options>
For details on generating database properties and query options, see Generating Code and Other Artifacts. For details on using the generated artifacts, see Deploying Generated Code and Artifacts and Querying a Model or Entity Instances.
Any property named in a range index specification must have a data type that can be used to define a range index or can be mapped to an indexable super type. You can define a property with any of the data types listed in property_type, but only scalar types can be used to define a range index. For example, you cannot specify a property that has type hexBinary, an array type, or a reference to another entity type.
For a list of types usable to define range indexes, see Understanding Range Indexes in the Administrator's Guide.
Any entity property specified in the word lexicon section of a model descriptor must have string type, or a type which normalizes to string, such as anyURI or iri.
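These data type restrictions can be sketched as two predicates. Both helpers are hypothetical, and the type lists contain only the names mentioned in the text; the complete set of types that normalize to string or are excluded from range indexes may be larger.

```javascript
// Assumption: only the types named in the text are listed here.
const WORD_LEXICON_TYPES = new Set(["string", "anyURI", "iri"]);

// Hypothetical helper: can this property back a word lexicon?
function canBackWordLexicon(propertyDef) {
  return WORD_LEXICON_TYPES.has(propertyDef.datatype);
}

// Hypothetical helper: is this property one of the kinds the text names
// as unusable in a range index specification?
function isExcludedFromRangeIndex(propertyDef) {
  return "$ref" in propertyDef ||
    propertyDef.datatype === "array" ||
    propertyDef.datatype === "hexBinary";
}
```

A false result from isExcludedFromRangeIndex does not guarantee that a type is indexable; it only means the property is not one of the excluded kinds listed above.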
Some datatypes are normalized to a supported index type for purposes of index configuration. For example, the positiveInteger, negativeInteger, and integer datatypes normalize to the XSD decimal type. The following mapping is used for purposes of index configuration:
The info section of a model descriptor can include an optional base-uri XML element or baseUri JSON property. If a base URI is defined, it is used for the following purposes:
If you do not include a base URI definition in your descriptor, Entity Services uses http://example.org/.
For example, the following descriptor defines a base URI of http://my/org/.
If you generate an instance converter module from this descriptor, then the module namespace is created by appending the module title (Example) and version (1.0.0) to the base URI (http://my/org/), as follows:
module namespace example = "http://my/org/Example-1.0.0";
If you did not define a base URI, then the module namespace URI would be http://example.org/Example-1.0.0. For more details on the generated module namespace, see Module Namespace Declaration.
Similarly, when you create a model from the above example descriptor, the base URI is used as an IRI prefix for the generated model and instance triples. For example, the Person entity type defined by the example has the following IRI:
http://my/org/Example-1.0.0/Person
If you do not define a base URI, then the above IRI would be http://example.org/Example-1.0.0/Person.
The base URI is always combined with other model metadata, such as the model title and version.
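The way the base URI combines with the title and version in the examples above can be sketched as follows. The helper names are hypothetical, and the sketch assumes a base URI that ends in a slash, as in every example in this section; it does not attempt to reproduce how Entity Services handles other forms of base URI.

```javascript
// Default base URI stated in the text above.
const DEFAULT_BASE_URI = "http://example.org/";

// Hypothetical helper: base URI + title + "-" + version.
// Assumption: the base URI ends with "/".
function modelIri(info) {
  const base = info.baseUri || DEFAULT_BASE_URI;
  return `${base}${info.title}-${info.version}`;
}

// Hypothetical helper: IRI of an entity type within the model.
function entityTypeIri(info, typeName) {
  return `${modelIri(info)}/${typeName}`;
}

const info = { title: "Example", version: "1.0.0", baseUri: "http://my/org/" };
```

With this info section, modelIri(info) gives http://my/org/Example-1.0.0 and entityTypeIri(info, "Person") gives http://my/org/Example-1.0.0/Person, matching the values shown above.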
You can model relationships between entity types by referencing an entity type in place of a datatype in the definition of an entity property. Specify the reference using the $ref JSON property or es:ref XML element of the property definition.
References can either be local (identifying a type defined in the same descriptor) or external (identifying a type that cannot be locally resolved by the Entity Services API).
A local entity reference refers to an entity type defined in the current model. A local reference is defined by a relative URI of the following form:
#/definitions/entityTypeName
A local entity reference is resolvable during code generation, such as when you call the es:instance-converter-generate XQuery function or the es.instanceConverterGenerate JavaScript function. This resolvability enables the Entity Services code generation tools to, for example, embed the properties of a local reference inside an instance of the referencing type.
For example, the following model descriptor defines two entity types, Person and Name. The Person entity type definition includes a name entity property that is a reference to the Name entity type. The type of the name property is a local reference.
If you generate an instance converter from this model, the default code template assumes that a Person entity instance has a Name entity instance embedded within it. For example, a Person entity instance generated by es:instance-json-from-document or es.instanceJsonFromDocument might look like the following:
{
  "Person": {
    "id": 1234,
    "name": {
      "first": "John",
      "middle": "NMI",
      "last": "Smith"
    }
  }
}
You could also choose to have the Name persisted separately and reference it from a Person entity via a primary key, URI, or other identifier. That is a choice you make when customizing your instance converter. For more details, see Creating an Instance Converter Module.
An external entity reference refers to an entity type defined outside the model. The referenced type is identified by an IRI. The referenced type should be defined elsewhere in MarkLogic. Resolution of the reference is handled by MarkLogic's SPARQL engine.
No validation is performed on the value of an external reference. When you use the Entity Services APIs to generate code and other artifacts, the reference is treated as an opaque string.
For example, the following model descriptor defines a Person entity type that contains a name property that is an external reference to a type identified by http://example.org/Name. This could be an entity type defined by a different Entity Services model.
You would customize your Person instance converter code to fill in the value of the name property with an appropriate reference or embedded value. Since the shape of the external entity type is not defined by the model, the Entity Services code generation tools cannot assume an embedded object as they can for local references. To learn more about instance generation, see Creating an Instance Converter Module.
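The difference between local and external references can be sketched with a small resolver. The resolveRef helper and the descriptor below are hypothetical illustrations; the datatypes, title, and version are assumptions. A local reference resolves to a definition in the same descriptor, while an external reference is returned as an opaque IRI string.

```javascript
const LOCAL_PREFIX = "#/definitions/";

// Illustrative descriptor with a local reference from Person to Name.
const descriptor = {
  info: { title: "Example", version: "1.0.0" },
  definitions: {
    Person: {
      properties: {
        id: { datatype: "int" },
        name: { $ref: "#/definitions/Name" }
      }
    },
    Name: {
      properties: {
        first: { datatype: "string" },
        middle: { datatype: "string" },
        last: { datatype: "string" }
      }
    }
  }
};

// Hypothetical helper: classify a reference and, if local, look it up.
function resolveRef(model, ref) {
  if (ref.startsWith(LOCAL_PREFIX)) {
    const typeName = ref.slice(LOCAL_PREFIX.length);
    return { kind: "local", typeName, definition: model.definitions[typeName] };
  }
  // External references are not validated or resolved here.
  return { kind: "external", iri: ref };
}
```

Because only the local case returns a definition, only the local case lets a caller know the shape of the referenced type, which is why an embedded object can be assumed there and not for external references.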
Create a model from a JSON or XML descriptor by inserting the descriptor document into the database as part of the special Entity Services collection http://marklogic.com/entity-services/models.
During insertion, MarkLogic generates a model from the descriptor. The model includes the entity type definitions, properties, and relationships defined by your descriptor, plus facts about the model that MarkLogic automatically infers from the descriptor. These facts are expressed as Semantic triples; for details, see Search Basics for Models. You can also add your own facts; for details, see Extending a Model with Additional Facts.
For example, the following code snippet creates a model from a descriptor. For a more complete example see Getting Started With Entity Services.
Note that if you create a model with an XML descriptor, then you will have to convert the persisted document to its in-memory JSON representation before you can use it with any Entity Services functions that expect a model as input. For details, see Working With an XML Model Descriptor.
The natural representation of a model descriptor in the Entity Services API is a JSON object node. In XQuery, the in-memory JSON representation of a model descriptor is a json:object (a special kind of map:map). The equivalent representation in Server-Side JavaScript is a JSON object node or JavaScript object. (MarkLogic implicitly converts JavaScript objects to JSON objects when you pass them as parameters.)
If you create a model by persisting an XML descriptor, you must convert the persisted descriptor into its JSON representation before you can pass it to most Entity Services functions. You can create a JSON object from an XML descriptor using the following functions:
To learn more about descriptor validation, see Validating a Model Descriptor.
The following example code snippet generates an instance converter module from an XML descriptor by first converting the descriptor to JSON. Assume /es-gs/models/person-1.0.0.xml is a previously persisted descriptor used to create a model.
If you persist your XML descriptor as JSON instead of XML, then you only need to do the conversion once, at model creation time. This is the technique used in Create a Model.
In XQuery, you can manipulate the JSON representation of the descriptor as a map:map; for details, see Building a JSON Object from a Map.
To validate a model descriptor, use the es:model-validate XQuery function or the es.modelValidate Server-Side JavaScript function.
If the input descriptor is valid, this function returns a valid JSON descriptor that can be persisted in the database or used as input to any Entity Services interface that accepts a model as input. If the input descriptor is invalid, this function throws an ES-MODEL-INVALID exception and reports the validation failures in the error details.
Since an invalid model descriptor produces an invalid model, you should use model validation during development. Model validation does introduce added overhead, however, so you might choose to skip it when going between a descriptor and a model in production situations.
The following example validates a simple model descriptor containing a Person entity type definition. The model descriptor is valid, so no exception is raised, and the returned model is identical to the JSON model descriptor used in the JavaScript example.
If we introduce an error by specifying that an undefined entity property named UNDEF is a required property, then validation raises an error similar to the following:
ES-MODEL-INVALID (err:FOER0000): "Required" property UNDEF doesn't exist.
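The kind of rule behind this error can be sketched in plain JavaScript. The checkRequiredExist helper is hypothetical and stands in for a single rule only; it is not es.modelValidate, which runs inside MarkLogic and checks far more. The message text is copied from the error shown above.

```javascript
// Hypothetical helper: every name in a type's required list must be a
// property defined by that type.
function checkRequiredExist(descriptor) {
  const errors = [];
  for (const typeDef of Object.values(descriptor.definitions || {})) {
    const defined = Object.keys(typeDef.properties || {});
    for (const name of typeDef.required || []) {
      if (!defined.includes(name)) {
        errors.push(`"Required" property ${name} doesn't exist.`);
      }
    }
  }
  return errors;
}

// Illustrative descriptor with the error described above.
const badDescriptor = {
  info: { title: "Example", version: "1.0.0" },
  definitions: {
    Person: {
      properties: { id: { datatype: "string" } },
      required: ["UNDEF"]
    }
  }
};
```

A local check like this can catch simple mistakes early during development, but the model should still be validated with the Entity Services function before use.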
You can extend your model with information and relationships that cannot be expressed in or derived from the model descriptor by storing additional semantic triples related to your model in MarkLogic.
You can use the model, entity type, and property IRIs generated by Entity Services to express these new facts. Entity Services uses the following patterns for constructing IRIs when generating RDF triple data about a model:
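Model IRI: baseUri/modelTitle-modelVersion
Entity type IRI: baseUri/modelTitle-modelVersion/typeName
Entity property IRI: baseUri/modelTitle-modelVersion/typeName/propertyName
For example, suppose you have the following model descriptor: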
{
  "info": {
    "title": "People",
    "version": "1.0.0",
    "baseUri": "http://marklogic.com/example/"
  },
  "definitions": {
    "Person": {
      "properties": {
        "id": { "datatype": "int" },
        "name": { "datatype": "string" }
      }
    }
  }
}
Then the following IRIs are generated and used by Entity Services:
http://marklogic.com/example/People-1.0.0
http://marklogic.com/example/People-1.0.0/Person
http://marklogic.com/example/People-1.0.0/Person/name
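The IRIs above can be enumerated from a descriptor with a short sketch. The modelIris helper is hypothetical and assumes a base URI that ends in a slash, as in this example; the IRI it produces for the id property is inferred from the same pattern rather than listed in the text.

```javascript
// Hypothetical helper: list the model, entity type, and property IRIs
// for a descriptor. Assumption: info.baseUri is present and ends with "/".
function modelIris(descriptor) {
  const { baseUri, title, version } = descriptor.info;
  const model = `${baseUri}${title}-${version}`;
  const iris = [model];
  for (const [typeName, typeDef] of Object.entries(descriptor.definitions)) {
    iris.push(`${model}/${typeName}`);
    for (const propertyName of Object.keys(typeDef.properties || {})) {
      iris.push(`${model}/${typeName}/${propertyName}`);
    }
  }
  return iris;
}

const people = {
  info: {
    title: "People",
    version: "1.0.0",
    baseUri: "http://marklogic.com/example/"
  },
  definitions: {
    Person: {
      properties: {
        id: { datatype: "int" },
        name: { datatype: "string" }
      }
    }
  }
};
```

IRIs built this way can serve as subjects or objects of the additional triples you store alongside the model.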
You can use any of MarkLogic's Semantic capabilities to add, manage, and query triples you add to your model, including embedding triples in your entity instance envelope documents and customizing the TDE template you can generate with Entity Services. You can also use the model IRI as a named graph IRI for integrating separate triples-based modeling with an Entity Services model.
For more information about using Semantics with MarkLogic, see the Semantics Developer's Guide.
Some kinds of changes do not affect the structure and content of your instances. For example, if you decide to index a property that was not previously indexed or change a property from required to optional, your instances will not change.
However, changes such as the following typically impact the content in your instances, application code, and generated artifacts:
Entity Services can help you update your application as your model evolves.
When integrating model changes, you must decide if all consumers of your instance data will move to the new model at the same time, or if you need to support both old and new models during some transition period. You must also choose how to generate instances based on your new model version.
See the following topics for more details:
For an end-to-end example of updating a model version, see example-versions in the Entity Services examples on GitHub. For more details, see Exploring the Entity Services Open-Source Examples.
You can upgrade your instance data using one of the following strategies:
What you do with the instance data based on the new model depends on your version transition strategy. For details, see Replacing the Old Model with a New Version and Making Multiple Model Versions Available.
You should use a version translator if re-extraction is not practical. A version translator is also useful for creating in-memory instances of a different version to return to downstream consumers. For example, if you've advanced your content to v2 of your model, you could use a v2-to-v1 translator to synthesize v1 instances for v1 clients.
Both the instance converter and the version translator can be generated using the Entity Services API.
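Conceptually, a down-converting translator is a function from a new-version instance to an old-version instance. The following sketch assumes a purely hypothetical model change in which v1 of Person has a single name property and v2 replaces it with firstName and lastName. It is a hand-written illustration, not the module that Entity Services generates.

```javascript
// Hypothetical v2-to-v1 translator for an assumed model change:
//   v1 Person: { id, name }
//   v2 Person: { id, firstName, lastName }
function personV2ToV1(v2Person) {
  return {
    id: v2Person.id,
    // Synthesize the v1 name from the two v2 properties.
    name: `${v2Person.firstName} ${v2Person.lastName}`
  };
}

const v2Instance = { id: 1234, firstName: "George", lastName: "Washington" };
const v1Instance = personV2ToV1(v2Instance);
```

A translator like this can run at read time, so stored content stays at v2 while v1 clients continue to receive the shape they expect.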
To re-extract instances from original source, generate, customize, and install an instance converter based on the new model, as described in Creating an Instance Converter Module. Send your raw source data through the converter, just as you did with the previous model version.
To use a version translator to generate new version instances from old ones, generate, customize, and install a version translator module from the old and new models as described in Creating a Model Version Translator Module. Then, use the translator to convert instance data from the old model to the new one.
The following diagram illustrates using a version translator to generate an envelope document containing an instance based on a new model version. You can also pass just an instance (rather than an envelope document) to the translator.
If all consumers will immediately move to the new model then you can do the following to update your model-based artifacts:
Note that you might still be able to serve old version instances to clients by using a down-converting version translator to convert new instances to old ones during extraction. You can generate such a translator using Entity Services; for details, see Creating a Model Version Translator Module.
When maintaining multiple model versions, the procedures are similar to those described in Replacing the Old Model with a New Version, but you must consider how to manage multiple versions of your code and configuration artifacts, such as the following:
You must choose an approach to storing your updated instance data in the database. You might use one of the following approaches to managing versions:
In the first approach, the database contains envelope documents for instances based on both model versions, as shown in the following diagram:
In this case, putting the envelope documents in different collections based on version will make them easier to manage and search. You can also use the value of es:instance/es:info/es:version to distinguish between versions.
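Choosing a collection from the version recorded in an envelope can be sketched as follows. The sketch assumes a JSON envelope whose instance section carries info.title and info.version, mirroring the es:instance/es:info/es:version path above; the collection naming scheme and the collectionForEnvelope helper are hypothetical.

```javascript
// Hypothetical helper: derive a per-version collection name from the
// version metadata stored in a JSON envelope.
// Assumption: the envelope has the shape envelope.instance.info.{title,version}.
function collectionForEnvelope(envelopeDoc) {
  const info = envelopeDoc.envelope.instance.info;
  return `${info.title}-${info.version}`;
}

// Illustrative envelopes based on two versions of the same model.
const v1Envelope = {
  envelope: {
    instance: {
      info: { title: "Person", version: "1.0.0" },
      Person: { id: "1234" }
    }
  }
};
const v2Envelope = {
  envelope: {
    instance: {
      info: { title: "Person", version: "2.0.0" },
      Person: { id: "1234" }
    }
  }
};
```

Here the two envelopes map to the collections Person-1.0.0 and Person-2.0.0, so a search can be scoped to one model version by collection.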
In the second approach, the database still contains only one set of envelope documents, but each envelope contains multiple instances, as shown in the following diagram:
You can use the value of es:instance/es:info/es:version to distinguish between versions during search and entity extraction. Your instance converter must be customized to store multiple instances in a single envelope.
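As a rough illustration of version-based selection, the following JavaScript sketch picks the instance that matches a requested model version from an envelope holding several instances. The envelope shape used here (a plain object with an `instances` array whose entries carry `info.version`) is a simplifying assumption made for the example, not the layout Entity Services produces, and `selectInstance` is a hypothetical helper.

```javascript
// Hypothetical, simplified envelope: one document holding an instance
// per model version. The shape is an assumption made for this sketch.
const envelope = {
  instances: [
    { info: { title: 'Person', version: '1.0.0' }, Person: { id: 1234, name: 'Ada' } },
    { info: { title: 'Person', version: '2.0.0' }, Person: { id: 1234, fullName: 'Ada' } }
  ]
};

// Return the first instance whose info.version equals the requested
// version, or undefined when the envelope has no such instance.
function selectInstance(env, version) {
  return env.instances.find((inst) => inst.info.version === version);
}

const current = selectInstance(envelope, '2.0.0');
console.log(current.Person.fullName);
```

In a real application the same decision would more likely be expressed as a query constraint on the version value rather than as client-side filtering.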
This topic refers to maintaining more than one version of the schemas generated by the es:schema-generate XQuery function or the es.schemaGenerate Server-Side JavaScript function.
It is usually best to avoid multiple schemas for the same type name. Schema validation is based on type name, so if you do not explicitly specify which schema to use for validation, you cannot predict which schema is applied.
During explicit validation in XQuery, you can import a schema into your evaluation context. For example, if you have v1.0.0 and v2.0.0 schemas installed for a model that defines a Person entity type, then you could force validation against the v2.0.0 schema by doing the following:
    xquery version "1.0-ml";
    import module namespace es = "http://marklogic.com/entity-services"
        at "/MarkLogic/entity-services/entity-services.xqy";
    import schema default element namespace "" at "/es-gs/person-2.0.0.xsd";

    xdmp:validate(
      es:instance-xml-from-document(fn:doc('/es-gs/envelopes/1234.xml')),
      'type',
      xs:QName('PersonType'))
For XML instance representations, you can add @schemaLocation to control which schema is applied. For more details, see Referencing Your Schema in the Application Developer's Guide.
The triples produced by a TDE template that Entity Services generates use a subject IRI that includes the model version. Therefore, the facts generated from each template version do not collide.
However, both templates use the same row schema-name for the same entity types, so row searches return the union of the rows matched by both templates. To avoid this, give the row schema for each entity type a unique name in each template version.
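One way to derive distinct schema names is to fold the model version into the name. The sketch below shows that idea in JavaScript. The naming scheme and the `versionedSchemaName` helper are assumptions made for illustration; you would still need to apply the resulting name to each generated template yourself.

```javascript
// Build a per-version schema name such as "Person_v2_0_0".
// Characters outside [A-Za-z0-9_] are replaced with underscores so the
// result stays usable as an SQL identifier.
function versionedSchemaName(title, version) {
  const safe = (s) => s.replace(/[^A-Za-z0-9_]/g, '_');
  return `${safe(title)}_v${safe(version)}`;
}

console.log(versionedSchemaName('Person', '1.0.0'));
console.log(versionedSchemaName('Person', '2.0.0'));
```

Because the version is part of the name, rows projected by the old and new templates land in separate schemas and can be queried independently.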
You can merge the old and new version query options together, or keep them separate and use the set that matches the version of the entity instances you are searching.
If you choose to keep multiple versions of canonical instances in a single envelope document, you should probably modify your query options to include version related constraints or additional queries.
For example, you might want to add a version constraint based on es:envelope/es:instance/es:info/es:version.
The database configuration is single-state. You can configure the union of range indexes and word lexicons defined by the two models.
You should usually not remove a range index or word lexicon required by the older model if you wish to continue supporting searches on that version. Also, if you define a range index or word lexicon for a property that exists in both model versions, you might see different search results against the old version entities because queries against the shared property can now be resolved out of the index.
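To see which indexes and lexicons the combined configuration needs, you can compute the union across both descriptors. The following JavaScript sketch does this for two cut-down, hypothetical descriptors; `unionOfIndexedProperties` is an illustrative helper, not part of the Entity Services API.

```javascript
// Two hypothetical descriptor versions, reduced to what the sketch needs.
const modelV1 = { definitions: { Person: { pathRangeIndex: ['id'], wordLexicon: ['bio'] } } };
const modelV2 = { definitions: { Person: { pathRangeIndex: ['id', 'rating'], wordLexicon: [] } } };

// Collect "TypeName.propertyName" entries for one index kind
// (for example "pathRangeIndex") across any number of models.
function unionOfIndexedProperties(kind, ...models) {
  const seen = new Set();
  for (const model of models) {
    for (const [typeName, typeDef] of Object.entries(model.definitions)) {
      for (const prop of typeDef[kind] || []) {
        seen.add(`${typeName}.${prop}`);
      }
    }
  }
  return [...seen].sort();
}

console.log(unionOfIndexedProperties('pathRangeIndex', modelV1, modelV2));
console.log(unionOfIndexedProperties('wordLexicon', modelV1, modelV2));
```

Here the word lexicon on bio survives in the union even though the newer model no longer declares it, which matches the advice to keep indexes that the older version still depends on.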
This section provides a detailed description of the layout of a model descriptor, including syntax, component descriptions, and examples. A model descriptor has the following top level structure, where the info section contains model metadata, and the definitions section contains your entity type definitions. A model descriptor must define at least one entity type.
JSON | XML |
---|---|
{ "info": model_info, "definitions": { entity_type_definition, ... } } |
<es:model xmlns:es="http://marklogic.com/entity-services"> <es:info>model_info</es:info> <es:definitions> entity_type_definition ... </es:definitions> </es:model> |
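The top level structure can be checked mechanically. The JavaScript sketch below builds a minimal descriptor as an object literal that mirrors the JSON form and tests for the two sections and at least one entity type. The sample values and the `hasTopLevelStructure` helper are assumptions for illustration; Entity Services performs its own validation.

```javascript
// A minimal descriptor: required info fields plus one entity type.
const descriptor = {
  info: { title: 'Person', version: '1.0.0' },
  definitions: {
    Person: { properties: { id: { datatype: 'int' } } }
  }
};

// Hypothetical structural check: both sections must be objects and
// the definitions section must hold at least one entity type.
function hasTopLevelStructure(d) {
  return typeof d.info === 'object' && d.info !== null &&
    typeof d.definitions === 'object' && d.definitions !== null &&
    Object.keys(d.definitions).length >= 1;
}

console.log(hasTopLevelStructure(descriptor));
```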
To explore the component parts of a model descriptor in more detail, see the following topics:
The info section of a model descriptor contains model metadata, such as a description or version.
The info section of a model descriptor has the following structure:
The info section of a model descriptor can contain the following XML elements or JSON properties. Title and version are the only required items.
Property Name | Description |
---|---|
title |
Required. The title of this model descriptor. The title string must be a valid XQuery namespace prefix. If you plan to generate a TDE template from the model, you should avoid using hyphens (-) in the title. Hyphens will be converted to underscores (_) in the TDE schema, view, and column names, in order to avoid invalid SQL names. The title is used as the module namespace prefix when generating code from the model. If the first character of the title is upper case, it will be converted to lower case when used as the namespace prefix. |
version |
Required. The version of this model descriptor. Best practice is to use the semver format, such as 1.0.0; for details, see http://semver.org/. The version number of the model is considered the version number of all the entity types defined within the model. |
baseUri (JSON) base-uri (XML) |
Optional. A valid absolute URI, usable to resolve RDF values in the descriptor. If this property is not present, http://example.org/ is used as the default URI. For details, see Controlling the Model IRI and Module Namespaces. |
description |
Optional. A description of this set of entity type definitions. This is purely informational metadata. |
The following example contains an info section that uses all available properties. Only the title and version properties are required.
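A minimal sketch of an info section with all four properties, written as a JavaScript object literal that mirrors the JSON form. All values are illustrative assumptions. The two helpers are hypothetical: `missingInfoProperties` checks the required items, and `namespacePrefixFromTitle` only mimics the lower-casing rule described for the title.

```javascript
// Illustrative info section; every value here is made up for the example.
const info = {
  title: 'Person',
  version: '1.0.0',
  baseUri: 'http://example.org/example-person/',
  description: 'A model of a person, used to demonstrate the info section.'
};

// Report which of the two required info properties are missing or empty.
function missingInfoProperties(i) {
  return ['title', 'version'].filter(
    (name) => typeof i[name] !== 'string' || i[name] === ''
  );
}

// Lower-case the first character of the title, as described for the
// namespace prefix derived from the title.
function namespacePrefixFromTitle(title) {
  return title.charAt(0).toLowerCase() + title.slice(1);
}

console.log(missingInfoProperties(info));
console.log(namespacePrefixFromTitle(info.title));
```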
An entity type definition is a child of the definitions section of a model descriptor. A model descriptor must include at least one entity type definition, and may contain multiple entity type definitions.
An entity type definition has the following structure, where entityTypeName (in JSON) and entity-type-name (in XML) represent the user-defined entity type name, such as Person or Order. By convention, entity type names begin with a capital letter (Person, not person).
If you plan to generate a TDE template from the model, you should avoid using hyphens (-) in the entity type and entity property names. Hyphens will be converted to underscores (_) in the TDE schema, view, and column names, in order to avoid invalid SQL names.
JSON | XML |
---|---|
entityTypeName : { "properties": { propertyName: property_definition, ... }, "required": [ string ], "primaryKey": string, "namespace": string, "namespacePrefix": string, "pii": [ string ], "pathRangeIndex": [ string ], "elementRangeIndex": [ string ], "rangeIndex": [ string ], "wordLexicon": [ string ], "description": string } |
<entity-type-name xmlns:es="http://marklogic.com/entity-services"> <es:properties> <property-name> property_definition </property-name> ... </es:properties> <es:required>property name</es:required> <es:primary-key> property name </es:primary-key> <es:namespace>namespace URI</es:namespace> <es:namespace-prefix> namespace prefix </es:namespace-prefix> <es:pii>property name</es:pii> <es:path-range-index> property name </es:path-range-index> <es:element-range-index> property name </es:element-range-index> <es:range-index> property name </es:range-index> <es:word-lexicon> property name </es:word-lexicon> <es:description>type desc</es:description> </entity-type-name> |
An entity type definition can contain the following XML elements or JSON properties.
Property Name | Description |
---|---|
properties |
Optional. Zero or more entity property definitions. Each child JSON property or XML element name is the name of a property of the entity type. In XML, the element name must not be namespace qualified. For more details, see Writing a Model Descriptor. |
description |
Optional. A description of this entity type. |
required |
Optional. Specify the names of entity properties that must be in every instance of this entity type. In XML, include multiple required elements to specify multiple required property names. Each named entity property must match the name of an entity property defined in the properties section of this entity type definition. Any entity properties not tagged as required are treated as optional. For more details, see Distinguishing Required and Optional Entity Properties. |
primaryKey (JSON) primary-key (XML) |
Optional. The name of an entity property to use as a primary key when generating artifacts such as an extraction template. The value must match the name of an entity property defined in the properties section of this entity type definition. There can be at most one primary key in an entity type definition. The primary key property is implicitly also a required property. For more details, see Identifying the Primary Key Entity Property. |
namespace |
Optional. A namespace URI with which to qualify canonical XML entity instances of this type. If you include a namespace URI, you must also define a namespace prefix using the namespace-prefix XML element or namespacePrefix JSON property. The namespace is also used in generated database configuration and query options artifacts. For details and restrictions, see Defining a Namespace URI for an Entity Type. |
namespacePrefix (JSON) namespace-prefix (XML) |
Optional. A namespace prefix to bind to the XML namespace defined by the namespace XML element or JSON property. You must define a prefix if you define a namespace. For details and restrictions, see Defining a Namespace URI for an Entity Type. |
pii |
Optional. The name(s) of entity properties that can contain Personally Identifiable Information (PII). You can generate an Element Level Security (ELS) configuration to restrict access to PII properties more tightly than access to other instance properties. For details, see Identifying Personally Identifiable Information (PII). In XML, include multiple pii elements to specify multiple properties. |
pathRangeIndex (JSON) path-range-index (XML) |
Optional. The name(s) of entity properties that should be backed by a path range index. This affects the database configuration and query options you can generate from a model. Each named property must match the name of an entity property defined in the properties section of this entity type definition. In XML, include multiple path-range-index elements to specify multiple properties. For more details, see Identifying Entity Properties for Indexing. |
elementRangeIndex (JSON) element-range-index (XML) |
Optional. The name(s) of entity properties that should be backed by an element range index. This affects the database configuration and query options you can generate from a model. Each named property must match the name of an entity property defined in the properties section of this entity type definition. In XML, include multiple element-range-index elements to specify multiple properties. For more details, see Identifying Entity Properties for Indexing. |
rangeIndex (JSON) range-index (XML) |
Optional. Deprecated. Equivalent to the pathRangeIndex property in a JSON descriptor, or the path-range-index element in an XML descriptor. |
wordLexicon (JSON) word-lexicon (XML) |
Optional. The name(s) of entity properties that should be backed by a word lexicon. This affects the database configuration and query options you can generate from a model. Each named property must match the name of an entity property defined in the properties section of this entity type definition. In XML, include multiple word-lexicon elements to specify multiple properties. For details, see Identifying Entity Properties for Indexing. |
The following example defines a Person entity type that contains entity properties named id, name, bio, and rating. The id and name properties are required. The id entity property is a primary key. A path range index is required for id and rating, and a word lexicon is required for bio. The name property is tagged as PII.
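A sketch of such a Person definition, written as a JavaScript object literal that mirrors the JSON form. The datatypes chosen for each property are assumptions, since the description above does not state them. The `undeclaredReferences` helper is hypothetical; it checks the rule that every name listed in required, primaryKey, pii, and the index and lexicon lists must match a property declared under properties.

```javascript
// Person entity type matching the description; datatypes are assumed.
const definitions = {
  Person: {
    properties: {
      id: { datatype: 'int' },
      name: { datatype: 'string' },
      bio: { datatype: 'string' },
      rating: { datatype: 'float' }
    },
    required: ['id', 'name'],
    primaryKey: 'id',
    pathRangeIndex: ['id', 'rating'],
    wordLexicon: ['bio'],
    pii: ['name']
  }
};

// Return every property name referenced by the type-level lists that
// is not declared under "properties".
function undeclaredReferences(typeDef) {
  const declared = new Set(Object.keys(typeDef.properties || {}));
  const referenced = [
    ...(typeDef.required || []),
    ...(typeDef.primaryKey ? [typeDef.primaryKey] : []),
    ...(typeDef.pii || []),
    ...(typeDef.pathRangeIndex || []),
    ...(typeDef.elementRangeIndex || []),
    ...(typeDef.rangeIndex || []),
    ...(typeDef.wordLexicon || [])
  ];
  return [...new Set(referenced.filter((name) => !declared.has(name)))];
}

console.log(undeclaredReferences(definitions.Person));
```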
An entity property definition is a child of the entity_type_definition section of a model descriptor. Each entity type must include at least one entity property definition.
An entity property definition can have one of the following forms. Entity property definitions are used in the properties child of an entity_type_definition.
JSON | XML |
---|---|
{ "datatype" : "string", "collation": string, "description": string } { "datatype" : "array", "items": property_definition , "description": string } { "datatype" : property_type, "description": string } { "$ref": string, "description": string } |
<!-- string-valued entity property --> <es:datatype>string</es:datatype> <es:collation> collationUri </es:collation> <es:description>desc</es:description> <!-- array/list-valued property --> <es:datatype>array</es:datatype> <es:items>property_definition</es:items> <!-- prop of any other type --> <es:datatype>property_type</es:datatype> <es:description>desc</es:description> <!-- ref to another entity type --> <es:ref>type path ref</es:ref> <es:description>desc</es:description> |
This portion of a model descriptor can contain the following XML elements or JSON properties. An entity property definition must include either a datatype or a reference ($ref in JSON, es:ref in XML), but not both.
Property Name | Description |
---|---|
datatype |
Required if $ref (JSON) or es:ref (XML) is not present. The data type of values in this entity property. The value must be one of the types listed in property_type. The datatype can affect what other JSON properties or XML elements can be included in this definition, such as a datatype of string enabling the inclusion of a collation URI in the property definition. |
$ref (JSON) ref (XML) |
Required if datatype is not present. A reference to another entity type, in the form of either a relative path to an entity type defined in this model descriptor or an absolute IRI. The value must end in a simple type name so that it can be treated as a type name during code generation. For details, see Defining Entity Relationships and Defining an Entity Property with a Complex Type. |
collation |
Optional. Only usable when the value of datatype is string. The collation to use when generating index/lexicon configuration and query options. If you do not specify a collation, it defaults to http://marklogic.com/collation/en. |
items |
Required when the value of datatype is array. The type definition for the array items. The value is itself a property_definition. For details, see Defining an Entity Property with Array Type. |
description |
Optional. A description of this entity property. |
The following example defines a Person entity type with three entity properties: an id of type int, a name of type string whose definition includes a collation, and a friend entity property of array type. Each item value in the friend array is a reference to a Person entity.
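A minimal sketch of such a definition, written as a JavaScript object literal that mirrors the JSON form. The collation URI and the relative reference path `#/definitions/Person` are illustrative assumptions. The `propertyDefinitionErrors` helper is hypothetical; it checks the rules from the property table: exactly one of datatype or $ref, collation only with string, and items required when the datatype is array.

```javascript
// Person type with an int id, a collated string name, and an array of
// references to other Person entities.
const personType = {
  properties: {
    id: { datatype: 'int' },
    name: { datatype: 'string', collation: 'http://marklogic.com/collation/codepoint' },
    friend: { datatype: 'array', items: { $ref: '#/definitions/Person' } }
  }
};

// Return a list of rule violations for one property definition.
// Array item definitions are checked recursively.
function propertyDefinitionErrors(def) {
  const errors = [];
  const hasType = 'datatype' in def;
  const hasRef = '$ref' in def;
  if (hasType === hasRef) errors.push('need exactly one of datatype or $ref');
  if ('collation' in def && def.datatype !== 'string') {
    errors.push('collation requires datatype string');
  }
  if (def.datatype === 'array') {
    if (!def.items) errors.push('array requires items');
    else errors.push(...propertyDefinitionErrors(def.items));
  }
  return errors;
}

for (const [name, def] of Object.entries(personType.properties)) {
  console.log(name, propertyDefinitionErrors(def));
}
```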
This section defines the type names that can be specified in the datatype JSON property or XML element of an entity property definition. With the exception of iri and array, these types correspond to the XML Schema Definition Language (XSD) types of the same name; for details, see https://www.w3.org/TR/xmlschema11-2/#built-in-datatypes.
Not all these datatypes are usable as range index or word lexicon types. If you specify an entity property with an incompatible type in the range index or word lexicon specification of an entity type definition, then the resulting model will not validate.
An array-typed entity property contains an item type definition that also uses this type list. For details, see property_definition and Defining an Entity Property with Array Type.
Some types are folded into a compatible super-type when defining range indexes. For example, entity properties of type iri are indexed as string, and entity properties of type byte or short are indexed as int. Some data types cannot be used for range index or word lexicon configuration.
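The folding can be pictured as a simple lookup. The JavaScript sketch below encodes only the three foldings named above; the complete mapping used by Entity Services is not reproduced here, and `indexTypeFor` is a hypothetical helper.

```javascript
// Only the foldings named in the text are listed; the full mapping
// used by Entity Services is not reproduced here.
const INDEX_TYPE_FOLDING = { iri: 'string', byte: 'int', short: 'int' };

// Return the index type for a datatype, leaving unlisted types unchanged.
function indexTypeFor(datatype) {
  return Object.prototype.hasOwnProperty.call(INDEX_TYPE_FOLDING, datatype)
    ? INDEX_TYPE_FOLDING[datatype]
    : datatype;
}

console.log(indexTypeFor('iri'));
console.log(indexTypeFor('short'));
console.log(indexTypeFor('int'));
```

A type that passes through unchanged is not necessarily indexable; the sketch does not model the types that are rejected for index or lexicon configuration.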
For more details, see the following topics: