MarkLogic 9 Product Documentation
Entity Services Developer's Guide
— Chapter 5

Managing Entity Instances

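This chapter describes how to create, retrieve, update, and delete entity instances derived from a model created with MarkLogic Entity Services. The chapter covers the following topics:

  • Entity Instance Concepts
  • Creating an Entity Instance from a Data Source
  • Generating Test Entity Instances
  • Extracting an Entity Instance from an Envelope Document
  • Extracting the Original Source from an Envelope Document
  • Updating Entity Instance Data When Your Model Changes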

Entity Instance Concepts

This section introduces entity instance concepts helpful in creating, persisting, querying, and extracting entity instance data. The following topics are included:

  • What is an Instance?
  • What is an Envelope Document?
  • Example: Entity Instance Representations

What is an Instance?

An entity instance is a concrete instantiation of an entity type defined in a model.

For example, suppose you have a JSON model descriptor that defines a Person entity type with the following properties. This is based on the model in Getting Started With Entity Services.

"Person": {
  "properties": {
    "id": {"datatype": "string"},
    "firstName": {"datatype": "string"},
    "lastName": {"datatype": "string"},
    "fullName": {"datatype": "string"},
    "friends": {
      "datatype": "array",
      "items": {"$ref": "#/definitions/Person"}
    }
  },
  ...
}

Then the canonical representation of a Person instance would have the following form, depending on whether you choose to work with XML or JSON.

XML Canonical Form:

<Person>
  <id>1234</id>
  <firstName>George</firstName>
  <lastName>Washington</lastName>
  <fullName>George Washington</fullName>
</Person>

JSON Canonical Form:

{"Person": {
  "id":"2345", 
  "firstName":"Martha", 
  "lastName":"Washington", 
  "fullName":"Martha Washington"
}}

By convention, an instance is stored as child XML elements or JSON properties of an envelope document. You can extract an instance from an envelope as XML or JSON, regardless of the envelope format. For details, see What is an Envelope Document? and Extracting an Entity Instance from an Envelope Document.

An instance can have multiple representations, depending on the context:

  • While you are synthesizing an instance from raw source or converting one between model versions, you work with an in-memory representation of the instance as a map:map containing not only the entity type property values, but additional information such as type and source. This representation is designed to be easy to modify during instance construction.
  • By Entity Services convention, instances are persisted in envelope documents. An XML envelope document includes an es:instance XML element with a child element that is the canonical XML representation of the instance. A JSON envelope document contains an "instance" property that contains the canonical JSON representation of the instance. The canonical representation is the one on which queries are based. For details, see What is an Envelope Document?.
  • You can extract an instance from an envelope document as XML, JSON, or a map:map. You might use one or more of these representations to pass instances to downstream applications. For details, see Extracting an Entity Instance from an Envelope Document.

For more details, see Example: Entity Instance Representations.

What is an Envelope Document?

If you follow the Entity Services conventions, your entity instances are persisted in MarkLogic as part of an envelope document. An envelope document encapsulates instance data with related metadata that might be useful to your application. You can use either XML or JSON envelopes.

An envelope document for some entity type T is created using the instance-to-envelope function in T's instance converter module. For more details, see Creating an Entity Instance from a Data Source and Creating an Instance Converter Module.

An envelope document has the following form by default.

Format Envelope Template
XML
<es:envelope xmlns:es="http://marklogic.com/entity-services">
  <es:instance>
    <es:info>
      <es:title>model title</es:title>
      <es:version>model version</es:version>
    </es:info>
    <T>
      ...T's entity properties as elements...
    </T>
  </es:instance>
  <es:attachments>...source data...</es:attachments>
</es:envelope>
JSON
{"envelope": {
  "instance": {
    "info": {
      "title": "model title",
      "version": "model version"
    },
    "T": {
      ...T's entity properties as JSON properties...
    }
  },
  "attachments": [ ...source data... ]
}}

The instance section contains the canonical representation of the instance, plus metadata such as the title and version of the model from which the entity type is derived. The attachments section contains the source data, by convention; you can add additional attachments.

The envelope format does not have to match the format of your raw source data. You can generate JSON envelopes for instances based on XML source and vice versa. However, if the source and envelope formats differ, the raw source is stored in the attachments section of the envelope as a string.

You can customize an envelope document to include other information, but you should generally not modify the instance portion. The instance data should accurately reflect the entity type definition in your model. If you need to normalize or derive property values, do so in the extract-instance-T function of your instance converter.
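To make the idea of normalizing and deriving property values during extraction more concrete, here is a minimal plain JavaScript sketch. It does not use the Entity Services API and is not the code that a generated instance converter contains. The function name extractPersonSketch, the source field names (pid, given, family), and the rule for building fullName are assumptions modeled on the Person examples used in this chapter.

```javascript
'use strict';

// Hypothetical stand-in for a customized extract-instance function.
// It returns an in-memory instance: entity properties plus $type and
// $attachments bookkeeping keys.
function extractPersonSketch(source) {
  return {
    $type: 'Person',
    $attachments: source,                          // keep the raw source
    id: String(source.pid),                        // normalize: id as a string
    firstName: source.given,
    lastName: source.family,
    fullName: `${source.given} ${source.family}`   // derived property
  };
}

const instance = extractPersonSketch({
  pid: 2345,
  given: 'Martha',
  family: 'Washington'
});
console.log(instance.id);        // prints "2345"
console.log(instance.fullName);  // prints "Martha Washington"
```

Doing the normalization at this stage means the instance portion of the envelope can be written as-is, without later edits.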

If you customize the envelope by adding data to the attachments element, then you can use the es:instance-get-attachments XQuery function or the es.instanceGetAttachments JavaScript function to retrieve the data. If you put it elsewhere in the envelope, then you are solely responsible for retrieving it from the envelope.

The Entity Services API includes functions for retrieving the instance data and attachments from an envelope. For details, see Extracting an Entity Instance from an Envelope Document and Extracting the Original Source from an Envelope Document.

Example: Entity Instance Representations

This example illustrates the various instance representations discussed in What is an Instance?.

XML Entity Instance Representations

This example uses the Person entity type from the model defined in Getting Started With Entity Services.

Representation Example
1 Raw Source
<person>
  <pid>1234</pid>
  <given>George</given>
  <family>Washington</family>
</person>
2

In-memory instance, as returned by extract-instance-Person

Shown here as JSON for readability, but really a json:object (map:map) with keys $attachments, $type, id, etc.

{"$attachments": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<person>\n  <pid>1234</pid>\n  <given>George</given>\n  <family>Washington</family>\n</person>", 
  "$type": "Person", 
  "id": "1234", 
  "firstName": "George", 
  "lastName": "Washington", 
  "fullName": "George Washington"
}
3

Canonical XML instance generated by instance-to-canonical

Used to construct the instance within an envelope document.

<Person>
  <id>1234</id>
  <firstName>George</firstName>
  <lastName>Washington</lastName>
  <fullName>George Washington</fullName>
</Person>
4 Envelope document, as generated by instance-to-envelope
<es:envelope
   xmlns:es="http://marklogic.com/entity-services">
  <es:instance>
    <es:info>
      <es:title>Person</es:title>
      <es:version>1.0.0</es:version>
    </es:info>
    <Person>
      <id>1234</id>
      <firstName>George</firstName>
      <lastName>Washington</lastName>
      <fullName>George Washington</fullName>
    </Person>
  </es:instance>
  <es:attachments>
    <person>
      <pid>1234</pid>
      <given>George</given>
      <family>Washington</family>
    </person>
  </es:attachments>
</es:envelope>
5

json:object (map:map) representation extracted from envelope document by es:instance-from-document or es.instanceFromDocument

Shown here as JSON for readability, this is really a map:map in XQuery. In JavaScript, this function returns a JavaScript object. The value is mutable.

{ "id": "1234", 
  "firstName": "George", 
  "lastName": "Washington", 
  "fullName": "George Washington",
  "$type": "Person"
}
6

XML representation extracted from envelope document by es:instance-xml-from-document or es.instanceXmlFromDocument

The value is immutable.

<Person>
  <id>1234</id>
  <firstName>George</firstName>
  <lastName>Washington</lastName>
  <fullName>George Washington</fullName>
</Person>
7

JSON representation extracted from envelope document by es:instance-json-from-document or es.instanceJsonFromDocument

This function returns a JSON object node. The value is immutable.

{ "Person": {
    "id": "1234", 
    "firstName": "George", 
    "lastName": "Washington", 
    "fullName": "George Washington"
} }

The representations in rows 2, 3, and 4 of the table were created by an instance converter module. For details, see Creating an Instance Converter Module. The representation in row 2 is a transient, mutable in-memory representation designed for ease of use in instance converter code. If you pass an envelope document to the convert-instance-T function of a version translator module, it returns a similar representation; for details, see Creating a Model Version Translator Module.

The envelope document representation in row 4 is the recommended way to store entity instances in MarkLogic. You can customize the contents of your envelope, but should usually leave the es:instance portion as-is. This is the layout produced by the instance-to-envelope function of an instance converter.

The representations in rows 5, 6, and 7 are instances extracted from an envelope document using the Entity Services API. The map:map representation in row 5 differs from the other extracted representations in that it is mutable and carries explicit type information in the $type property. It differs from the representation in row 2 in that it contains only the entity type properties of the instance; there is no $attachments property. For more details, see Extracting an Entity Instance from an Envelope Document.

JSON Entity Instance Representations

This example uses the Person entity type from the model defined in Getting Started With Entity Services.

Representation Example
1 Raw Source
{ "pid": 2345, 
  "given": "Martha", 
  "family": "Washington"
}
2

In-memory instance, as returned by extract-instance-Person

Shown here as JSON for readability, but really a json:object (map:map) with keys $attachments, $type, id, etc.

{ "$type": "Person",
  "$attachments": {
    "pid": 2345,
    "given": "Martha",
    "family": "Washington"
  },
  "id": 2345,
  "firstName": "Martha",
  "lastName": "Washington",
  "fullName": "Martha Washington"
}
3

Canonical JSON instance generated by instance-to-canonical

Used to construct the instance within an envelope document.

{"Person": {
  "id":"2345", 
  "firstName": "Martha",
  "lastName": "Washington", 
  "fullName": "Martha Washington"
}}
4 JSON Envelope document, as generated by instance-to-envelope
{"envelope": {
  "instance": {
    "info": {
      "title": "Person",
      "version": "1.0.0"
    },
    "Person": {
      "id": "2345",
      "firstName": "Martha",
      "lastName": "Washington",
      "fullName": "Martha Washington"
    }
  },
  "attachments": [ {
    "pid": 2345,
    "given": "Martha",
    "family": "Washington"
  } ]
} }
5

json:object (map:map) representation extracted from envelope document by es:instance-from-document or es.instanceFromDocument

Shown here as JSON for readability, this is really a map:map in XQuery. In JavaScript, this function returns a JavaScript object. The value is mutable.

{ "$type": "Person", 
  "id":"2345", 
  "firstName":"Martha",
  "lastName":"Washington", 
  "fullName":"Martha Washington"
}
6

XML representation extracted from envelope document by es:instance-xml-from-document or es.instanceXmlFromDocument

The value is immutable.

<Person>
  <id>2345</id>
  <firstName>Martha</firstName>
  <lastName>Washington</lastName>
  <fullName>Martha Washington</fullName>
</Person>
7

JSON representation extracted from envelope document by es:instance-json-from-document or es.instanceJsonFromDocument

This function returns a JSON object node. The value is immutable.

{ "Person": {
  "id":"2345", 
  "firstName":"Martha",
  "lastName":"Washington", 
  "fullName":"Martha Washington"
}}

The representations in rows 2, 3, and 4 of the table were created by an instance converter module. For details, see Creating an Instance Converter Module. The representation in row 2 is a transient, mutable in-memory representation designed for ease of use in instance converter code. If you pass an envelope document to the convert-instance-T function of a version translator module, it returns a similar representation; for details, see Creating a Model Version Translator Module.

The envelope document representation in row 4 is the recommended way to store entity instances in MarkLogic. You can customize the contents of your envelope, but should usually leave the instance portion as-is. This is the layout produced by the instance-to-envelope function of an instance converter.

The representations in rows 5, 6, and 7 are instances extracted from an envelope document using the Entity Services API. The map:map representation in row 5 differs from the other extracted representations in that it is mutable and carries explicit type information in the $type property. It differs from the representation in row 2 in that it contains only the entity type properties of the instance; there is no $attachments property. For more details, see Extracting an Entity Instance from an Envelope Document.

Creating an Entity Instance from a Data Source

The Entity Services API does not dictate how you create an entity instance from source data, but the recommended process is as follows:

  • Generate, customize, and install an instance converter module, as described in Creating an Instance Converter Module.
  • Use the extract-instance-T and instance-to-envelope functions of the instance converter module to create instance envelope documents for some entity type T from source data.
  • Insert your envelope documents in the database.

By convention, instances are stored as child elements of an XML or JSON envelope document. You can extract an instance from an envelope document in several formats. For details, see Extracting an Entity Instance from an Envelope Document.

The following code illustrates one way to create envelope documents from raw source. In this example, the source data comes from documents in MarkLogic that are in a collection named raw, and instances are generated for an entity type named Person. The generated envelope documents are in XML format; you could also choose JSON. This example uses the converter and data from Getting Started With Entity Services.

Language Example
XQuery
(: Create envelope documents from raw source documents :)
xquery version "1.0-ml";
import module namespace es = "http://marklogic.com/entity-services"
    at "/MarkLogic/entity-services/entity-services.xqy";
import module namespace person =
    "http://example.org/example-person/Person-1.0.0"
    at "/es-gs/person-1.0.0-conv.xqy";

for $source in fn:collection('raw') return
  let $instance := person:extract-instance-Person($source)
  let $uri := 
    fn:concat('/es-gs/env/', map:get($instance, 'id'), '.xml')
  return xdmp:document-insert(
    $uri,
    person:instance-to-envelope($instance, "xml"),
    <options xmlns="xdmp:document-insert">
      <collections>
        <collection>person-envelopes</collection>
      </collections>
    </options>
  )
JavaScript
'use strict';
declareUpdate();
const es = require('/MarkLogic/entity-services/entity-services.xqy');
const person = require('/es-gs/person-1.0.0-conv.xqy');

for (const source of fn.collection('raw')) {
  let instance = person.extractInstancePerson(source);
  let uri = '/es-gs/env/' + instance.id + '.xml';
  xdmp.documentInsert(
    uri, person.instanceToEnvelope(instance, 'xml'),
    {collections: ['person-envelopes']}
  );
}

The resulting envelope documents have the following form by default. The instance data is accessible in an envelope document via the XPath expression //es:instance (or //*:instance). The original source from which the instance was derived is accessible via the XPath expression //es:attachments (or //*:attachments).

<es:envelope xmlns:es="http://marklogic.com/entity-services">
  <es:instance>
    <es:info>
      <es:title>Person</es:title>
      <es:version>1.0.0</es:version>
    </es:info>
    <Person>
      <id>1234</id>
      <firstName>George</firstName>
      <lastName>Washington</lastName>
      <fullName>George Washington</fullName>
    </Person>
  </es:instance>
  <es:attachments>
    <person>
      <pid>1234</pid>
      <given>George</given>
      <family>Washington</family>
    </person>
  </es:attachments>
</es:envelope>

If you generate JSON envelopes rather than XML envelopes, you get envelopes of the following form by default. The instance data is accessible in an envelope document via the XPath expression //instance (or //*:instance). The original source from which the instance was derived is accessible via the XPath expression //attachments (or //*:attachments).

{ "envelope": {
  "instance": {
    "info": {
      "title": "Person",
      "version": "1.0.0"
    },
    "Person": {
      "id": "1234",
      "firstName": "George",
      "lastName": "Washington",
      "fullName": "George Washington"
    }
  },
  "attachments": [
    "<person><pid>1234<\/pid><given>George<\/given><family>Washington<\/family><\/person>"
    ]
} }
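The JSON envelope above stores its XML source as a string because the source and envelope formats differ. The following plain JavaScript sketch builds an object with the same default layout to show how the two cases compare. The buildJsonEnvelope helper and its parameter names are hypothetical and exist only for this illustration; real envelopes come from the instance-to-envelope function of a generated instance converter, and nothing here calls MarkLogic.

```javascript
'use strict';

// Hypothetical helper (not part of Entity Services): assemble an object
// shaped like the default JSON envelope.
function buildJsonEnvelope(modelInfo, typeName, properties, rawSource) {
  return {
    envelope: {
      instance: {
        info: { title: modelInfo.title, version: modelInfo.version },
        [typeName]: properties
      },
      // JSON source is passed in as an object and stays an object.
      // XML source is passed in already serialized, so it stays a string.
      attachments: [rawSource]
    }
  };
}

const info = { title: 'Person', version: '1.0.0' };

const fromJsonSource = buildJsonEnvelope(
  info,
  'Person',
  { id: '2345', firstName: 'Martha', lastName: 'Washington',
    fullName: 'Martha Washington' },
  { pid: 2345, given: 'Martha', family: 'Washington' }
);

const fromXmlSource = buildJsonEnvelope(
  info,
  'Person',
  { id: '1234', firstName: 'George', lastName: 'Washington',
    fullName: 'George Washington' },
  '<person><pid>1234</pid><given>George</given><family>Washington</family></person>'
);

console.log(typeof fromJsonSource.envelope.attachments[0]); // prints "object"
console.log(typeof fromXmlSource.envelope.attachments[0]);  // prints "string"
```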

If your model specifies a namespace binding for an entity type and you use JSON envelopes, the namespace is discarded in the JSON representation. However, the generated code and configuration artifacts still assume a namespace, so they will not work properly with JSON envelope documents. You should use XML envelope documents for entity types that define a namespace binding.

For an end-to-end example of creating envelope documents using this model, see Getting Started With Entity Services.

Generating Test Entity Instances

You can generate test instances from a model using the es:model-get-test-instances XQuery function or es.modelGetTestInstances Server-Side JavaScript function. You can use test instances for tasks such as experimenting with model refinement and testing code that manipulates instances.

The test instances are based purely on the model and do not reflect data normalization or customization you add to your instance converter. The test instances can help you identify properties for which converter customization is required.

The es:model-get-test-instances and es.modelGetTestInstances functions return a sequence of instances, one for each entity type defined in the input model.

If an entity type property definition contains a local reference, the referenced entity type is assumed to be embedded in the referencing entity. If an entity type property definition contains an external reference, no meaningful test value can be generated.

For example, assume the following model defining two entity types, Name and Person. A Person contains a local reference to a Name.

{ "info": {
    "title": "Example",
    "version": "1.0.0",
    "description": "ES Examples"
  },
  "definitions": {
    "Name": {
      "properties": {
        "first": { "datatype": "string" },
        "last": { "datatype": "string" }
      }
    },
    "Person": {
      "properties": {
        "id": { "datatype": "int" },
        "name": { "$ref": "#/definitions/Name" }
      }
} } }

If you generate test instances from this model, the name property of the Person test instance contains a Name instance value:

<Person>
  <id>123</id>
  <name>
    <Name>
      <first>some string</first>
      <last>some string</last>
    </Name>
  </name>
</Person>

If the name property of a Person entity were instead an external reference, such as http://example.com/SomeType, then no meaningful test value could be generated. The Person test instance would look like the following:

<Person>
  <id>123</id>
  <name><SomeType>externally-referenced-instance</SomeType></name>
</Person>
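The difference between local and external references can be sketched in plain JavaScript. This is only an illustration of the behavior described above, not the implementation of es:model-get-test-instances or es.modelGetTestInstances. The function names and the placeholder values (123 and "some string") are assumptions chosen to mirror the sample output, and the sketch does not guard against self-referencing types.

```javascript
'use strict';

const LOCAL_PREFIX = '#/definitions/';

// Placeholder values are assumptions modeled on the sample output above.
function placeholderFor(datatype) {
  return datatype === 'int' ? 123 : 'some string';
}

// Build a test-instance-like object for one entity type in a model.
function sketchTestInstance(model, typeName) {
  const properties = model.definitions[typeName].properties;
  const values = {};
  for (const [name, definition] of Object.entries(properties)) {
    const ref = definition.$ref;
    if (ref === undefined) {
      // Scalar property: use a placeholder value.
      values[name] = placeholderFor(definition.datatype);
    } else if (ref.startsWith(LOCAL_PREFIX)) {
      // Local reference: embed an instance of the referenced type.
      values[name] = sketchTestInstance(model, ref.slice(LOCAL_PREFIX.length));
    } else {
      // External reference: nothing meaningful can be generated.
      const externalType = ref.split('/').pop();
      values[name] = { [externalType]: 'externally-referenced-instance' };
    }
  }
  return { [typeName]: values };
}

const model = {
  info: { title: 'Example', version: '1.0.0' },
  definitions: {
    Name: {
      properties: {
        first: { datatype: 'string' },
        last: { datatype: 'string' }
      }
    },
    Person: {
      properties: {
        id: { datatype: 'int' },
        name: { $ref: '#/definitions/Name' }
      }
    }
  }
};

console.log(JSON.stringify(sketchTestInstance(model, 'Person')));
```

Replacing the name definition with { $ref: 'http://example.com/SomeType' } makes the sketch emit the externally-referenced-instance placeholder instead of an embedded Name.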

To generate instances from real source data, use an instance converter. For more details, see Creating an Instance Converter Module and Creating an Entity Instance from a Data Source.

Extracting an Entity Instance from an Envelope Document

Though Entity Services encourages storing your instances in MarkLogic in the form of envelope documents, downstream consumers of your data, such as client applications, will probably expect to receive the canonical instance data, not the entire envelope.

The Entity Services API includes the following XQuery functions for extracting an instance from an envelope document. The corresponding JavaScript functions follow.

XQuery Function                   Extracted Instance Format
es:instance-from-document         map:map (json:object, mutable)
es:instance-json-from-document    object-node() (immutable)
es:instance-xml-from-document     element() (immutable)

The Entity Services API includes the following Server-Side JavaScript functions for extracting an instance from an envelope document.

JavaScript Function               Extracted Instance Format
es.instanceFromDocument           JavaScript object (mutable)
es.instanceJsonFromDocument       object-node() (immutable)
es.instanceXmlFromDocument        element() (immutable)

For example, suppose you have the following envelope document in the database with the URI /es-gs/env/1234.xml:

<es:envelope xmlns:es="http://marklogic.com/entity-services">
  <es:instance>
    <es:info>
      <es:title>Person</es:title>
      <es:version>1.0.0</es:version>
    </es:info>
    <Person>
      <id>1234</id>
      <firstName>George</firstName>
      <lastName>Washington</lastName>
      <fullName>George Washington</fullName>
    </Person>
  </es:instance>
  <es:attachments>
    <person>
      <pid>1234</pid>
      <given>George</given>
      <family>Washington</family>
    </person>
  </es:attachments>
</es:envelope>

Then, the following code snippet extracts an instance from the envelope document as a json:object in XQuery or a JavaScript object in JavaScript.

Language Example
XQuery
xquery version "1.0-ml";
import module namespace es = "http://marklogic.com/entity-services"
  at "/MarkLogic/entity-services/entity-services.xqy";

es:instance-from-document(fn:doc('/es-gs/env/1234.xml'))[1]
JavaScript
'use strict';
const es = require('/MarkLogic/entity-services/entity-services.xqy');

fn.head(
  es.instanceFromDocument(cts.doc('/es-gs/env/1234.xml'))
);

The result is a sequence containing one item, equivalent to the following JSON:

{ "id":"1234", 
  "firstName":"George", 
  "lastName":"Washington", 
  "fullName":"George Washington",
  "$type": "Person"
}

The following table illustrates the result of calling each of the instance envelope extraction functions.

Function Result
es:instance-from-document

es.instanceFromDocument
A json:object (XQuery) or JavaScript object (JavaScript) equivalent to the following:
{ "id":"1234", 
  "firstName":"George", 
  "lastName":"Washington", 
  "fullName":"George Washington",
  "$type":"Person"
}
es:instance-json-from-document

es.instanceJsonFromDocument
A JSON object-node() equivalent to the following:
{ "Person": {
    "id":"1234", 
    "firstName":"George", 
    "lastName":"Washington", 
    "fullName":"George Washington"
} }
es:instance-xml-from-document

es.instanceXmlFromDocument
The following XML element:
<Person xmlns:es=...>
  <id>1234</id>
  <firstName>George</firstName>
  <lastName>Washington</lastName>
  <fullName>George Washington</fullName>
</Person>

For more detailed coverage of instance representations, see What is an Instance? and Example: Entity Instance Representations.
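To compare the extracted shapes side by side without a MarkLogic server, the following plain JavaScript sketch mimics two of them on an already parsed JSON envelope. It is not the Entity Services API. The helper names are hypothetical, and the sketch assumes the default layout in which the instance object holds an info property plus exactly one property named after the entity type.

```javascript
'use strict';

// Find the entity type name: the one key under "instance" other than "info".
function findTypeName(envelopeDoc) {
  const names = Object.keys(envelopeDoc.envelope.instance)
    .filter((key) => key !== 'info');
  if (names.length !== 1) {
    throw new Error('Expected exactly one entity instance in the envelope');
  }
  return names[0];
}

// Flat object carrying an explicit $type, similar in shape to the
// mutable object form.
function instanceAsObject(envelopeDoc) {
  const typeName = findTypeName(envelopeDoc);
  return { ...envelopeDoc.envelope.instance[typeName], $type: typeName };
}

// Canonical JSON shape: properties wrapped in a property named for the type.
function instanceAsCanonicalJson(envelopeDoc) {
  const typeName = findTypeName(envelopeDoc);
  return { [typeName]: { ...envelopeDoc.envelope.instance[typeName] } };
}

const envelopeDoc = {
  envelope: {
    instance: {
      info: { title: 'Person', version: '1.0.0' },
      Person: {
        id: '1234',
        firstName: 'George',
        lastName: 'Washington',
        fullName: 'George Washington'
      }
    },
    attachments: []
  }
};

console.log(instanceAsObject(envelopeDoc).$type);                  // prints "Person"
console.log(Object.keys(instanceAsCanonicalJson(envelopeDoc)));    // prints [ 'Person' ]
```

Neither shape includes the info metadata or the attachments, which matches the point that downstream consumers usually want only the instance data.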

Extracting the Original Source from an Envelope Document

If you follow the Entity Services conventions, an envelope document encapsulates both the canonical instance data and the raw source from which it was derived. This encapsulation happens when you call the instance-to-envelope XQuery function in a model's generated instance converter module.

You can extract the attachments from an envelope document using the es:instance-get-attachments XQuery function or the es.instanceGetAttachments JavaScript function. You can use these functions on a customized envelope, as long as the attachments are locatable via the XPath expression //es:attachments.

The raw source data is saved in the envelope as an attachment. For example, the <person> element inside es:attachments below is the raw XML source from which the enveloped instance was derived.

<es:envelope xmlns:es="http://marklogic.com/entity-services">
  <es:instance>...</es:instance>
  <es:attachments>
    <person>
      <pid>1234</pid>
      <given>George</given>
      <family>Washington</family>
    </person>
  </es:attachments>
</es:envelope>

If the format of the source data does not match the format of the envelope, the source data is serialized and stored in the envelope as a string. For example, if the source data is JSON and the envelope format is XML, then the source is stored as the text value of an es:attachments XML element. The following snippet is from an XML envelope document created from JSON source:

<es:envelope xmlns:es="http://marklogic.com/entity-services">
  <es:instance>...</es:instance>
  <es:attachments>{"pid":2345, "given":"Martha", "family":"Washington"}</es:attachments>
</es:envelope>

The following code extracts the raw source attachment from an envelope document, assuming it is the only attachment.

Language Example
XQuery
xquery version "1.0-ml";
import module namespace es = "http://marklogic.com/entity-services"
  at "/MarkLogic/entity-services/entity-services.xqy";

es:instance-get-attachments(fn:doc('/es-gs/env/1234.xml'))[1]
JavaScript
'use strict';
const es = require('/MarkLogic/entity-services/entity-services.xqy');

fn.head(
  es.instanceGetAttachments(cts.doc('/es-gs/env/1234.xml'))
);

If there are multiple children in the //es:attachments element, you are responsible for picking out the raw source from the other attachments. There will only be multiple attachments if you explicitly add extra attachments.

If the original source attachment and the envelope format do not match, you must convert the serialization if you want to work with the data in its original form. For example, the following code deserializes a serialized JSON attachment from an XML envelope document, and then accesses one of its properties.

Language JSON Deserialization Example
XQuery
xquery version "1.0-ml";
import module namespace es = "http://marklogic.com/entity-services"
  at "/MarkLogic/entity-services/entity-services.xqy";

map:get(
  xdmp:from-json-string(
    es:instance-get-attachments(fn:doc('/es-gs/env/2345.xml'))[1]
  )[1], "pid"
)
Server-Side JavaScript
'use strict';
const es = require('/MarkLogic/entity-services/entity-services.xqy');

fn.head(xdmp.fromJsonString(
  fn.head(
    es.instanceGetAttachments(cts.doc('/es-gs/env/2345.xml')))
)).pid;

The following code is a similar example that extracts an XML attachment from a JSON envelope:

Language XML Deserialization Example
XQuery
xquery version "1.0-ml";
import module namespace es = "http://marklogic.com/entity-services"
  at "/MarkLogic/entity-services/entity-services.xqy";

xdmp:unquote(
  es:instance-get-attachments(fn:doc('/es-gs/env/1234.json'))[1]
)[1]//pid/data()
Server-Side JavaScript
'use strict';
const es = require('/MarkLogic/entity-services/entity-services.xqy');

fn.head( xdmp.unquote(
  fn.head(es.instanceGetAttachments(cts.doc('/es-gs/env/1234.json')))
)).xpath('//pid/data()')
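For the JSON case, the format check behind these examples can be sketched in plain JavaScript. This is an illustration only and uses the standard JSON.parse rather than any MarkLogic function; inside MarkLogic you would use xdmp.fromJsonString as shown earlier. The helper name jsonAttachmentToObject is hypothetical.

```javascript
'use strict';

// Hypothetical helper: return a JSON attachment in object form whether it
// was stored natively (JSON envelope) or as a serialized string (XML envelope).
function jsonAttachmentToObject(attachment) {
  return typeof attachment === 'string' ? JSON.parse(attachment) : attachment;
}

// Attachment text as it appears in an XML envelope built from JSON source.
const serialized = '{"pid":2345, "given":"Martha", "family":"Washington"}';
// The same attachment as it would be held in a JSON envelope.
const stored = { pid: 2345, given: 'Martha', family: 'Washington' };

console.log(jsonAttachmentToObject(serialized).pid); // prints 2345
console.log(jsonAttachmentToObject(stored).pid);     // prints 2345
```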

Updating Entity Instance Data When Your Model Changes

As your model changes, you might need to update your instance data to match. Model changes can also impact generated and configuration artifacts. For details, see Managing Model Changes.
