Java Application Developer's Guide — Chapter 2

Single Document Operations

This chapter describes how to create, delete, update, and read the content and metadata of a single document using the Java Client API. The Java Client API also enables you to work with multiple documents in a single request, as described in Synchronous Multi-Document Operations and Asynchronous Multi-Document Operations.

When working with documents, it is important to keep in mind the difference between a document on your client and a document in the database. In particular, any changes you make to a document's content and metadata on the client do not persist between sessions. Only if you write the document out to the database do your changes persist.

This chapter includes the following sections:

Document Creation

Document creation is not done via a document creation method. When you first write content via a Manager object to a document in the database as identified by its URI, MarkLogic Server creates a document in the database with that URI and content.

To call write(), an application must authenticate as a user with at least one of the rest-writer or rest-admin roles (or as a user with the admin role).

This section describes the following about document creation operations:

Writing an XML Document To The Database

Note that no changes you make to a document or its metadata persist until you write the document out to the database. Within your application, you are only manipulating it within system memory, and those changes will vanish when the application ends. The database content is constant until and unless a write or delete operation changes it.

The basic steps needed to write a document are:

  1. If you have not already done so, connect to the database, storing the connection in a com.marklogic.client.DatabaseClient object. For example, if using digest authentication:
    DatabaseClient client = DatabaseClientFactory.newClient(
        host, port, new DigestAuthContext(username, password));
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document content you want to access (XML, text, JSON, binary, generic). In this example code, an XMLDocumentManager.
    XMLDocumentManager docMgr = client.newXMLDocumentManager();
  3. Get the document's content. For example, by using an InputStream.
    FileInputStream docStream = new FileInputStream(
                                    "data"+File.separator+filename);
  4. Create a handle associated with the input stream to supply the document's content. How you get content determines which handle you use. Pass the stream to the handle's constructor, as shown here, or use the handle's set() method to associate it with the desired stream.
    InputStreamHandle handle = new InputStreamHandle(docStream);
  5. Write the document's content by calling a write() method on the DocumentManager, with arguments of the document's URI and the handle.
    docMgr.write(docId, handle);
  6. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

Creating a Text Document In the Database

The following procedure outlines a very basic creation operation for a simple text document:

  1. Create a com.marklogic.client.DatabaseClient for the database. For example, if using digest authentication:
    DatabaseClient client = DatabaseClientFactory.newClient(
        host, port, new DigestAuthContext(username, password));
  2. Create a com.marklogic.client.document.DocumentManager object of the appropriate format for your document: text, binary, JSON, XML, or generic if you are not sure.
    TextDocumentManager docMgr = client.newTextDocumentManager();
  3. For convenience's sake, set a variable to your new document's URI. This is not required; the raw string could be used wherever docId is used.
    String docId = "/example/text.txt";
  4. As discussed previously in Using Handles for Input and Output, within MarkLogic Java applications you use handle objects to contain a document's content and metadata. Since this is a text document, we will use a com.marklogic.client.io.StringHandle to contain the text content. After creation, set the handle's value to the document's initial content.
    StringHandle handle = new StringHandle();
    handle.set("A simple text document");
  5. Write the document content out to the database. This creates the document in the database if it is not already there (if it is already there, it updates the content to whatever is in the handle argument). The identifier for the document is the value of the docId argument.
    docMgr.write(docId, handle);
  6. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();
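
The steps above can be combined into one small program. The following is a minimal sketch rather than a definitive implementation: the host, port, and credentials are placeholder values, and it assumes a Java Client API version that provides DatabaseClientFactory.DigestAuthContext, as used in step 1. It also reads the document back to confirm the write, and it needs a reachable MarkLogic REST instance to run.

```java
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.DatabaseClientFactory.DigestAuthContext;
import com.marklogic.client.document.TextDocumentManager;
import com.marklogic.client.io.StringHandle;

public class CreateTextDocumentSketch {
    public static void main(String[] args) {
        // Placeholder connection details (assumptions); substitute your own.
        String host = "localhost";
        int port = 8000;
        String username = "rest-writer-user";
        String password = "password";

        DatabaseClient client = DatabaseClientFactory.newClient(
            host, port, new DigestAuthContext(username, password));
        try {
            TextDocumentManager docMgr = client.newTextDocumentManager();
            String docId = "/example/text.txt";

            // write() creates the document, or replaces its content if it exists.
            docMgr.write(docId, new StringHandle("A simple text document"));

            // Read the content back from the database to confirm the write.
            String stored = docMgr.read(docId, new StringHandle()).get();
            System.out.println(stored);
        } finally {
            // Release connection resources even if a request fails.
            client.release();
        }
    }
}
```

Wrapping the work in try/finally is a design choice of this sketch, not a requirement of the API; it simply guarantees that release() runs.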

Automatically Generating Document URIs

MarkLogic Server can automatically generate database URIs for documents inserted using the Java API. You can only use this feature to create new documents. To update an existing document, you must know the URI.

To insert a document with a generated URI, use a com.marklogic.client.document.DocumentUriTemplate with DocumentManager.create(), as described by the following procedure.

  1. If you have not already done so, connect to the database, storing the connection in a com.marklogic.client.DatabaseClient object. For example, if using digest authentication:
    DatabaseClient client = DatabaseClientFactory.newClient(
        host, port, new DigestAuthContext(username, password));
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document content you want to access (XML, text, JSON, binary, generic). In this example code, an XMLDocumentManager.
    XMLDocumentManager docMgr = client.newXMLDocumentManager();
  3. Create a DocumentUriTemplate using the document manager. Specify the extension suffix for the URIs created with this template. Do not include a "." separator. The following example creates a template that generates URIs ending with ".xml".
    DocumentUriTemplate template = docMgr.newDocumentUriTemplate("xml");
  4. Optionally, specify additional URI template attributes, such as a database directory prefix and document format. The following example specifies a directory prefix of "/my/docs/".
    template.setDirectory("/my/docs/");
  5. Get the document's content. For example, by using an InputStream.
    FileInputStream docStream = 
        new FileInputStream("data" + File.separator + filename);
  6. Create a handle associated with the input stream to supply the document's content. How you get content determines which handle you use. Pass the stream to the handle's constructor, as shown here, or use the handle's set() method to associate it with the desired stream.
    InputStreamHandle handle = new InputStreamHandle(docStream);
  7. Insert the document into the database by calling a create() method on the DocumentManager, passing in a URI template and the handle. Use the returned DocumentDescriptor to obtain the generated URI.
    DocumentDescriptor desc = docMgr.create(template, handle);
  8. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

Format-Specific Write Capabilities

When inserting or updating a binary document, you can request metadata extraction using BinaryDocumentManager.setMetadataExtraction. For an example, see Writing A Binary Document.

When inserting or updating an XML document, you can request XML repair using XMLDocumentManager.setDocumentRepair.

See the Java Client API Documentation for details.

Document Deletion

To delete one or more documents, call DocumentManager.delete and pass in the URI(s) of the documents.

To delete documents, an application must authenticate as a user with at least one of the rest-writer or rest-admin roles (or as a user with the admin role).

The following example shows how to delete an XML document from the database.

  1. Create a com.marklogic.client.DatabaseClient for connecting to the database. For example, if using digest authentication:
    DatabaseClient client = DatabaseClientFactory.newClient(
        host, port, new DigestAuthContext(username, password));
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document format (XML, text, JSON, or binary).
    XMLDocumentManager docMgr = client.newXMLDocumentManager();
  3. Delete the document(s). For example, the following statement deletes 2 documents:
    docMgr.delete("/example/doc1.xml", "/example/doc2.json");
  4. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

Reading Document Content

Reading requires a handle to access document content.

Note that no changes you make to a document or its metadata persist until you write the document out to the database. Within your application, you are only manipulating it on the client, and those changes will vanish when the application ends. The database content is persistent until and unless a write or delete operation changes it.

If you read content with a stream, you must close the stream when done. If you do not close the stream, HTTP clients do not know that you are finished and there are fewer connections available in the connection pool.
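
One way to make sure a stream is always closed is try-with-resources. The sketch below is illustrative only: the helper name drainAndClose is hypothetical, and a ByteArrayInputStream stands in for the stream you would obtain from an InputStreamHandle, so that the example runs without a server.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StreamReadSketch {
    // Hypothetical helper: reads the stream to the end and always closes it.
    public static byte[] drainAndClose(InputStream in) throws IOException {
        try (InputStream stream = in) {   // closed automatically, even on error
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = stream.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for: docMgr.read(docId, new InputStreamHandle()).get()
        InputStream fromDatabase = new ByteArrayInputStream(
            "<doc>hello</doc>".getBytes(StandardCharsets.UTF_8));
        byte[] content = drainAndClose(fromDatabase);
        System.out.println(new String(content, StandardCharsets.UTF_8));
    }
}
```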

The basic steps to read a document from the database are:

  1. Create a com.marklogic.client.DatabaseClient for connecting to the database. For example, if using digest authentication:
    DatabaseClient client = DatabaseClientFactory.newClient(
        host, port, new DigestAuthContext(username, password));
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document format (XML, text, JSON, or binary).
    XMLDocumentManager docMgr = client.newXMLDocumentManager();
  3. Create a handle to receive the document's content. For information on handles and the wide variety of handle types, see Using Handles for Input and Output. This example uses a com.marklogic.client.io.DOMHandle object.
    DOMHandle handle = new DOMHandle();
  4. Read the document's content by calling a read() method on the DocumentManager, with arguments of the document's URI and the handle. Here, assume docId contains the document's URI.
    docMgr.read(docId, handle);
  5. Access the content by calling a get() method on the handle. For example, DOMHandle.get returns a W3C Document object. There are many alternatives.
    Document document = handle.get();
  6. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

Writing A Binary Document

To insert or update a binary document, use a handle containing your binary content with com.marklogic.client.document.BinaryDocumentManager. You can use any handle that implements BinaryWriteHandle, such as BytesHandle or FileHandle.

No metadata extraction is performed by default. You can request metadata extraction and specify how it is saved by calling BinaryDocumentManager.setMetadataExtraction().

The following example reads a PNG image from a file named my.png and inserts it into the database as a binary document with URI /example/my.png. During insertion, metadata is extracted from the binary content and saved as document properties.

String docId = "/example/my.png";
String mimetype = "image/png";

BinaryDocumentManager docMgr = client.newBinaryDocumentManager();
docMgr.setMetadataExtraction(MetadataExtraction.PROPERTIES);

docMgr.write(
    docId, 
    new FileHandle().with(new File("my.png")).withMimetype(mimetype)
  );

Reading Content From A Binary Document

There are several ways to read content from a binary document.

To stream binary content, use InputStream as follows:

InputStream byteStream = 
    docMgr.read(docId, new InputStreamHandle()).get();

To buffer the binary content, use a com.marklogic.client.io.BytesHandle object as follows:

byte[] buf = docMgr.read(docId, new BytesHandle()).get();

Or you can read only part of the content:

BytesHandle handle = new BytesHandle();
buf = docMgr.read(docId, handle, 9, 10).get();

Reading, Modifying, and Writing Metadata

Reading and writing document metadata from and to the database are very similar operations to reading and writing document content. Each requires calling methods on com.marklogic.client.document.DocumentManager. The handle for metadata can be a DocumentMetadataHandle to modify metadata in a POJO, or it can be raw XML or JSON.

You can perform operations on the metadata associated with documents such as collections, permissions, properties, and quality. This section describes those metadata operations and includes the following parts:

Document Metadata

The enum DocumentManager.Metadata enumerates the metadata categories (including ALL). The following are the metadata types covered by this enumeration:

  • COLLECTIONS: Document collections, a non-hierarchical way of organizing documents in the database. For details, see Collections Metadata.
  • METADATAVALUES: Key-value metadata, sometimes called 'metadata fields'. For details, see Values Metadata.
  • PERMISSIONS: Document permissions. For details, see Permissions Metadata.
  • PROPERTIES: Document properties. Property-value pairs associated with the document. For details, see Properties Metadata.
  • QUALITY: Document search quality. Helps determine which documents are of the best quality. For details, see Quality Metadata.

These metadata types are described in detail later in this chapter.

Reading Document Metadata

The basic steps needed to read a document's metadata are:

  1. If you have not already done so, create a com.marklogic.client.DatabaseClient for connecting to the database. For example, if using digest authentication:
    DatabaseClient client = DatabaseClientFactory.newClient(
        host, port, new DigestAuthContext(username, password));
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document format (XML, text, JSON, or binary).
    XMLDocumentManager docMgr = client.newXMLDocumentManager();
  3. Create a com.marklogic.client.io.DocumentMetadataHandle object, which will receive the document's metadata. Alternately, you can create raw XML or JSON.
    DocumentMetadataHandle metadataHandle = new DocumentMetadataHandle();
  4. If you also want to get the document's content, create a handle to receive it. Note that you need separate handles for a document's content and metadata.
    DOMHandle docHandle = new DOMHandle();
  5. Read the document's metadata by calling a readMetadata() method on the DocumentManager, with an argument of the metadata handle. Note that you can also call read() with an additional argument of a content handle so that it will read the metadata into the metadata handle and the content into the content handle in a single operation. To call read(), an application must authenticate as rest-reader, rest-writer, or rest-admin. Below, docId is a variable containing a document URI.
    //read only the metadata into a handle
    docMgr.readMetadata(docId, metadataHandle); 
    //read metadata and content
    docMgr.read(docId, metadataHandle, docHandle); 
  6. Access the metadata by calling get() methods on the metadata handle. Later sections in this chapter show how to access the other types of metadata.
    DocumentCollections collections = metadataHandle.getCollections();
    Document document = docHandle.get();
  7. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

By default, DocumentManager reads and writes all categories of metadata. To read or write a subset of the metadata categories, configure DocumentManager by calling setMetadataCategories(). For example, to retrieve just collection metadata, make the following call before calling DocumentManager.read or DocumentManager.readMetadata:

docMgr.setMetadataCategories(DocumentManager.Metadata.COLLECTIONS);

Collections Metadata

Collections are a way to organize documents in a database. A collection defines a set of documents in the database. You can set documents to be in any number of collections either at the time the document is created or by updating a document. Searches against collections are both efficient and convenient. For more details on collections, see Collections in the Search Developer's Guide.

The Java API allows you to read and manipulate collections metadata using the com.marklogic.client.io.DocumentMetadataHandle.DocumentCollections. Collections are named by specifying a URI. A collection URI serves as an identifier, and it can be any valid URI.

The code in this section assumes a DocumentManager object of an appropriate type for the document, docMgr, and a string containing a document URI, docId, have been created.

To get the set of all collections a document belongs to, do the following:

//Get the set of collections the document belongs to.
DocumentCollections collections = metadataHandle.getCollections();

To check if a collection URI exists in a document's set of collections, do the following:

collections.contains("/collection_name/collection_name2");

To add a document to one or more collections, do the following:

collections.addAll("/shakespeare/sonnets", "/shakespeare/plays");

To remove a document from a collection, do the following:

collections.remove("/shakespeare/sonnets");

To remove a document from all its collections, do the following:

collections.clear();

Values Metadata

The METADATAVALUES metadata category represents simple key-value metadata for a document. Both the key and the value are strings. You can define your own key-value pairs. MarkLogic also adds key-value pairs of this type to documents in certain situations, such as when you work with temporal documents.

MarkLogic stores values metadata separately from its associated document. To search values metadata, define a metadata field and use a field query. For more details, see Metadata Fields in the Administrator's Guide.

To access metadata values you've read from the database, use DocumentMetadataHandle.getMetadataValues. For example, if you read the metadata from a document using a call sequence similar to the following:

DocumentMetadataHandle metadataHandle = new DocumentMetadataHandle();
docMgr.setMetadataCategories(DocumentManager.Metadata.METADATAVALUES);
docMgr.readMetadata(docId, metadataHandle);

Then you can access the returned values metadata as follows:

DocumentMetadataValues mvMap = metadataHandle.getMetadataValues();
String someValue = mvMap.get("someKey");

DocumentMetadataValues is an extension of java.util.Map, so you can use the Map methods to explore the returned metadata.

To add a new key-value pair or change the value of an existing pair in a document's metadata, use DocumentMetadataValues.put or DocumentMetadataHandle.withMetadataValue. For example, the following adds a key-value pair with key 'myKey' and value 'myValue':

mvMap.put("myKey", "myValue");
//or
metadataHandle.withMetadataValue("myKey", "myValue");

Once you initialize your map or handle with values metadata, write the new metadata to the database as described in Writing Metadata.

Properties Metadata

Manipulate properties metadata using the com.marklogic.client.io.DocumentMetadataHandle.DocumentProperties class.

The code in this section assumes a DocumentManager object, docMgr, and a string containing a document's URI, docId, have been created.

To get all of a document's properties metadata, do the following:

DocumentProperties properties = metadataHandle.getProperties();

DocumentProperties objects represent a document's properties as a map.

To check if a document's properties contain a specific property name, do the following:

exists = properties.containsKey("name");

To get a specific property's value do the following:

value = metadataHandle.getProperties().get("name");

To add a new property or change the value of an existing property in a document's metadata, build up the new set of properties using DocumentProperties.put or DocumentMetadataHandle.withProperty, and then write the new metadata to the database as described in Writing Metadata. For example, the following adds a property named 'name' with the value 'value'.

metadataHandle.getProperties().put("name", "value");

Quality Metadata

The code in this section assumes a com.marklogic.client.io.DocumentManager object, docMgr, and a string containing a document's URI, docId, have been created.

The quality metadata affects the ranking of documents in searches by creating a multiplier used in calculating the score for that document. The default value for quality in the Java API is 0.

To get a document's search quality metadata value do the following:

int quality = metadataHandle.getQuality();

To set a document's search quality value do the following:

metadataHandle.setQuality(3);

Permissions Metadata

Permissions on documents control who can access a document for the capabilities of read, update, insert, and execute. To perform one of these operations on a document, a user must have a role corresponding to the permission for each capability needed. For details on permissions and on the security model in MarkLogic Server, see the Security Guide.

The code in this section assumes a DocumentManager object, docMgr, and a string containing a document's URI, docId, have been created. Manipulate document permissions using the class com.marklogic.client.io.DocumentMetadataHandle.DocumentPermissions.

MarkLogic Server defines permissions using roles and capabilities.

The allowed values for capabilities are those in the enum com.marklogic.client.io.DocumentMetadataHandle.Capability:

  • EXECUTE - Permission to execute the document.
  • INSERT - Permission to create but not modify or delete the document.
  • READ - Permission to read the document but not modify it.
  • UPDATE - Permission to create, modify, or delete the document, but not to read it.

Roles are assigned to users via the Admin Interface or through other administrative tools, and cannot be assigned via the Java API. You can, however, control permissions on documents as part of their metadata.

To get permissions metadata for a document, do the following:

DocumentPermissions permissions = metadataHandle.getPermissions();

To add permissions for a role to a document's metadata, do the following:

metadataHandle.getPermissions().add("app-user", 
   Capability.UPDATE, Capability.READ);

Manipulating Document Metadata In Your Application

A DocumentMetadataHandle represents metadata as a POJO. A DocumentMetadataHandle has several methods for manipulating a document's metadata. That may not be how you want to work with the metadata, however. If you would prefer to work with it as XML, then read it with an XML handle. If you would prefer to work with it as JSON, read it with a JSON handle. A StringHandle can use either XML or JSON, defaulting to XML.

To specify the format for reading content, use withFormat() or setFormat(), as in the following example:

StringHandle metadataHandle = 
    new StringHandle().withFormat(Format.JSON);

Working with Temporal Documents

Most document write operations on JSON and XML documents enable you to work with temporal documents. Temporal-aware document inserts and updates are made available through the com.marklogic.client.document.TemporalDocumentManager interface. JSONDocumentManager and XMLDocumentManager implement TemporalDocumentManager.

The TemporalDocumentManager interface exposes methods for creating, updating, patching, and deleting documents that accept temporal related parameters such as the following:

  • temporalCollection: The URI of the temporal collection into which the new document should be inserted, or the name of the temporal collection that contains the document being updated.
  • temporalDocumentURI: The 'logical' URI of the document in the temporal collection; the temporal collection document URI. This is equivalent to the first parameter of the temporal:statement-set-document-version-uri XQuery function or of the temporal.statementSetDocumentVersionUri Server-Side JavaScript function.
  • sourceDocumentURI: The temporal collection document URI of the document being operated on. Only applicable when updating existing documents. This parameter facilitates working with documents with user-maintained version URIs.
  • systemTime: The system start time for an update or insert.

During an update operation, if you do not specify sourceDocumentURI or temporalDocumentURI parameters, then the uri parameter indicates the source document. If you specify temporalDocumentURI, but do not specify sourceDocumentURI, then the temporalDocumentURI identifies the source document.

The uri parameter always refers to the output document URI. When MarkLogic manages the version URIs, the document URI and the temporal collection document URI have the same value. When the user manages version URIs, they can be different.

The TemporalDocumentManager.protect method enables you to protect a temporal document from operations such as update, delete, and wipe for a specified period of time. This method is equivalent to calling the temporal:document-protect XQuery function or the temporal.documentProtect Server-Side JavaScript function.

For more details, see the Temporal Developer's Guide and the JavaDoc in the Java Client API Documentation.

Writing Metadata

When you are finished modifying metadata, you must write it to the database to persist it. Note that the above operations only change the document's metadata on the client; they do not change the metadata for the document in the database. To write the metadata changes to the database, along with the document content, do the following:

InputStreamHandle handle = new InputStreamHandle(docStream);
docMgr.write(docId, metadataHandle, handle);
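
If only the metadata has changed, DocumentManager.writeMetadata can persist it without resending the content. The following is a minimal sketch, assuming a reachable MarkLogic REST instance; the connection values and the document URI are placeholders, and the collection and quality values are borrowed from the earlier examples in this chapter.

```java
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.DatabaseClientFactory.DigestAuthContext;
import com.marklogic.client.document.XMLDocumentManager;
import com.marklogic.client.io.DocumentMetadataHandle;

public class WriteMetadataSketch {
    public static void main(String[] args) {
        // Placeholder connection details and URI (assumptions).
        DatabaseClient client = DatabaseClientFactory.newClient(
            "localhost", 8000, new DigestAuthContext("user", "password"));
        String docId = "/example/doc1.xml";
        try {
            XMLDocumentManager docMgr = client.newXMLDocumentManager();

            // Read the current metadata into a client-side handle.
            DocumentMetadataHandle metadataHandle = new DocumentMetadataHandle();
            docMgr.readMetadata(docId, metadataHandle);

            // These changes exist only on the client at this point.
            metadataHandle.getCollections().addAll("/shakespeare/sonnets");
            metadataHandle.setQuality(3);

            // Persist the modified metadata; the document content is not sent.
            docMgr.writeMetadata(docId, metadataHandle);
        } finally {
            // Release connection resources even if a request fails.
            client.release();
        }
    }
}
```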

Conversion of Document Encoding

The Java API handles encoding conversions for you, but you have to:

  • know the encoding
  • use the appropriate handle

If you specify the encoding and it turns out to be the wrong encoding, then the conversion will likely not turn out as you expect.

MarkLogic Server stores text, XML, and JSON as UTF-8. In Java, characters in memory and reading streams are UTF-16. The Java API converts characters to and from UTF-8 automatically.

When writing documents to the server, you need to know if they are already UTF-8 encoded. If a document is not UTF-8, you must specify its encoding or you are likely to end up with data that has incorrect characters due to the incorrect encoding. If you specify a non-UTF-8 encoding, the Java API will automatically convert the encoding to UTF-8 when writing to MarkLogic.

When writing characters to or reading characters from a file, Java defaults to the platform's standard encoding, which can differ between platforms. For example, the default encoding on Linux may differ from the default on Windows.
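
The effect of specifying, or mis-specifying, an encoding can be seen with the JDK alone. The following sketch is illustrative: the class and method names are hypothetical and the sample bytes are an assumption. It decodes ISO-8859-1 bytes through a Reader created with an explicitly named charset, which is the same step you would take when creating the Reader for a ReaderHandle, and contrasts that with decoding the same bytes as UTF-8.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {
    // Decodes bytes into a Java String using an explicitly named charset.
    public static String decode(byte[] bytes, Charset charset) throws IOException {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // "cafe" with an e-acute, encoded as ISO-8859-1 (e-acute is one byte, 0xE9).
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        String right = decode(latin1, StandardCharsets.ISO_8859_1);
        String wrong = decode(latin1, StandardCharsets.UTF_8);

        // The correct charset round-trips; the wrong one corrupts the e-acute.
        System.out.println("correct charset matches: " + right.equals("caf\u00e9"));
        System.out.println("wrong charset matches:   " + wrong.equals("caf\u00e9"));
    }
}
```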

XML supports multiple encodings as defined by the header (called an XML declaration):

<?xml version="1.0" encoding="utf-8"?>

The XML declaration declares a file's encoding. XML parsing tools, including handles, can determine the encoding from this declaration and do the conversion for you.
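
As an illustration of a parser honoring the XML declaration, the following sketch uses only the JDK's DOM parser. The class and method names are hypothetical and the sample document is an assumption. The bytes are ISO-8859-1, the declaration says so, and the parser decodes them without being told the charset separately.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlDeclarationSketch {
    // Parses raw bytes; the parser picks the charset from the XML declaration.
    public static String rootText(byte[] xmlBytes) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Sample document (assumption): declares ISO-8859-1 and contains an e-acute.
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                   + "<word>caf\u00e9</word>";
        byte[] latin1Bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println("decoded correctly: "
            + rootText(latin1Bytes).equals("caf\u00e9"));
    }
}
```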

When writing character data to the database, you need to pick an appropriate handle type, depending on your intent and circumstances.

Depending on your application, you may need to be aware that MarkLogic Server normalizes text to precomposed Unicode characters for efficiency. Unicode abstract characters can either be precomposed (one character) or decomposed (two characters). If you write a decomposed Unicode document to MarkLogic Server and then read it back, you will get back precomposed Unicode. Usually, you do not need to care if characters are precomposed or decomposed. This Unicode issue only affects some characters, and many APIs abstract away the difference. For instance, the Java collator treats the precomposed and decomposed forms of a character as the same character. If your application needs to compensate for this difference, you can use java.text.Normalizer; for details, see:

http://docs.oracle.com/javase/6/docs/api/java/text/Normalizer.html
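
For instance, a minimal sketch of converting a decomposed string to its precomposed form with java.text.Normalizer follows. The class name, method name, and sample string are illustrative assumptions, and the sketch assumes that Unicode Normalization Form C (NFC) is the precomposed form your application needs.

```java
import java.text.Normalizer;

public class NormalizerSketch {
    // Returns the precomposed (NFC) form of the given text.
    public static String precompose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String decomposed = "e\u0301";   // 'e' followed by a combining acute accent
        String precomposed = "\u00e9";   // the single precomposed character

        // The raw strings differ, but they match once normalized.
        System.out.println("raw equal:        " + decomposed.equals(precomposed));
        System.out.println("normalized equal: "
            + precompose(decomposed).equals(precomposed));
    }
}
```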

The following table describes possible cases for reading character data with recommended handles to use in each case.

Read Condition: Recommended Handle(s)

  • If reading binary data: Use BytesHandle, FileHandle, or InputStreamHandle.
  • If reading character data from the database: BytesHandle, FileHandle, InputStreamHandle, and the XML handles are encoded as UTF-8. StringHandle and ReaderHandle convert to UTF-16.

The following table describes possible cases for writing character data with recommended handles to use in each case.

Write Condition: Recommended Handle(s)

  • If the data you are writing is a Java string: Use StringHandle; it converts on write from UTF-16 to UTF-8.
  • If writing binary data: Use BytesHandle, FileHandle, or InputStreamHandle.
  • If the data you are writing is encoded as UTF-8 and you do not need to modify the data: Use BytesHandle, FileHandle, or InputStreamHandle.
  • If it is XML that declares an encoding other than UTF-8 in the XML declaration and you do not need to modify the data: Use InputSourceHandle, XMLEventReaderHandle, or XMLStreamReaderHandle; these convert to UTF-8.
  • If the character data to write is XML that declares the encoding in a prolog and you need to modify the data: Use DOMHandle, SourceHandle, or create a handle class on an open source DOM. For examples of the latter, see JDOMHandle, XOMHandle, or DOM4JHandle in the package com.marklogic.client.extra. All these classes convert to UTF-8.
  • If the character data to write has a known encoding other than UTF-8 and you don't need to modify the data: Use ReaderHandle and specify the encoding when creating the Reader (as usual in Java); it converts to UTF-8.
  • If the character data to write is XML with a known but undeclared encoding and you need to modify the data: Use DOMHandle with a DocumentBuilder parsing an InputSource with a specified encoding, as in:
        DOMHandle handle = new DOMHandle();
        handle.set(
            handle.getFactory().newDocumentBuilder()
                .parse(new InputSource(...reader
                    specifying charset ...)));
    or use SourceHandle with a StreamSource on a Reader with a specified encoding, as in:
        SourceHandle handle = new SourceHandle();
        handle.set(new StreamSource(...
                   reader specifying charset
                  ...));
  • If the character data to write is JSON and you need to modify the data: Consider using a JSON library such as Jackson or GSON. See com.marklogic.client.extra.JacksonHandle for an example.
  • If the character data to write is text other than JSON or XML and you need to modify the data: Consider using a StreamTokenizer with a Reader, or Pattern with a String.
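
For the case of XML with a known but undeclared encoding, the parsing step can be tried with the JDK alone. This is a sketch under stated assumptions: the class and method names are hypothetical, the sample bytes are ISO-8859-1 with no XML declaration, and a plain DocumentBuilderFactory stands in for the one returned by DOMHandle.getFactory(). The resulting Document is what you would pass to DOMHandle.set().

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class UndeclaredEncodingSketch {
    // Parses XML bytes whose charset is known to the caller but not declared.
    public static Document parse(byte[] xmlBytes, Charset knownCharset)
            throws Exception {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(xmlBytes), knownCharset)) {
            // The Reader has already decoded the bytes, so the parser
            // does not need to guess the encoding.
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(reader));
        }
    }

    public static void main(String[] args) throws Exception {
        // No XML declaration, so a parser given the raw bytes would assume UTF-8.
        byte[] latin1Bytes =
            "<word>caf\u00e9</word>".getBytes(StandardCharsets.ISO_8859_1);
        Document doc = parse(latin1Bytes, StandardCharsets.ISO_8859_1);
        System.out.println("decoded correctly: "
            + doc.getDocumentElement().getTextContent().equals("caf\u00e9"));
    }
}
```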

Partially Updating Document Content and Metadata

The interface com.marklogic.client.document.DocumentPatchBuilder enables you to update a portion of an existing document or its metadata. This section covers the following topics:

Introduction to Content and Metadata Patching

A partial update is an update you apply to a portion of a document or metadata, rather than replacing an entire document or all of the metadata. For example, a partial update can insert an XML element or attribute, or change the value associated with a JSON property. You can only apply partial content updates to XML and JSON documents. You can apply partial metadata updates to any document type.

Use a partial update to do the following operations:

  • Add, replace, or delete an XML element, XML attribute, or JSON object or array item of an existing document.
  • Add, replace, or delete a subset of the metadata of an existing document. For example, modify a permission or insert a property.
  • Dynamically generate replacement content or metadata on MarkLogic Server using builtin or user-defined functions. For details, see Construct Replacement Data on the Server.

You can apply multiple updates in a single patch, and you can update both content and metadata in the same patch.

A patch is a partial update descriptor, expressed in XML or JSON, that tells MarkLogic Server where to apply an update and what update to apply. Four operations are available in a patch: insert, replace, replace-insert, and delete. (A replace-insert operation functions as a replace, as long as at least one match exists for the target content; if there are no matches, then the operation functions as an insert.)
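
The replace-insert rule described above can be modeled locally to make the branching explicit. The following sketch is a client-side illustration only, written against the JDK DOM and XPath APIs. It is not how MarkLogic Server implements patching, and the class and method names are hypothetical. It assumes the select path matches elements, evaluates it relative to the context node, and always inserts as the last child.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ReplaceInsertSketch {
    // Hypothetical helper: a local model of replace-insert, not server code.
    static Document replaceInsert(String xml, String context, String select,
                                  String name, String text) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xpath = XPathFactory.newInstance().newXPath();

        Node ctx = (Node) xpath.evaluate(context, doc, XPathConstants.NODE);
        if (ctx == null) {
            throw new IllegalArgumentException("context matched nothing: " + context);
        }
        NodeList matches =
                (NodeList) xpath.evaluate(select, ctx, XPathConstants.NODESET);

        if (matches.getLength() > 0) {
            // At least one match: behave as a replace on every matched node.
            for (int i = 0; i < matches.getLength(); i++) {
                Node old = matches.item(i);
                old.getParentNode().replaceChild(newElement(doc, name, text), old);
            }
        } else {
            // No match: behave as an insert (last child of the context).
            ctx.appendChild(newElement(doc, name, text));
        }
        return doc;
    }

    static Element newElement(Document doc, String name, String text) {
        Element e = doc.createElement(name);
        e.setTextContent(text);
        return e;
    }

    public static void main(String[] args) throws Exception {
        Document replaced = replaceInsert(
                "<data><child>old</child></data>", "/data", "child", "child", "new");
        Document inserted = replaceInsert(
                "<data><other/></data>", "/data", "child", "child", "new");
        // Print the child counts of <data> after each operation.
        System.out.println(replaced.getDocumentElement().getChildNodes().getLength());
        System.out.println(inserted.getDocumentElement().getChildNodes().getLength());
    }
}
```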

Patch operations can target XML elements and attributes, JSON property values and array items, and data values. You identify the target of an operation using XPath and JSONPath expressions. When inserting new content or metadata, the insertion point is further defined by specifying the position; for details, see How Position Affects the Insertion Point in the REST Application Developer's Guide. Note that you cannot patch unnamed JSON entities; for details, see Limitations of JSON Path Expressions in the REST Application Developer's Guide.

When applying a patch to document content, the patch format must match the document format: An XML patch for an XML document, a JSON patch for a JSON document. You cannot patch the content of other document types. You can patch metadata for all document types. A metadata-only patch can be in either XML or JSON. A patch that modifies both content and metadata must match the document content type.

You can construct a patch from raw JSON or XML, or using one of the following builder interfaces:

  • com.marklogic.client.document.DocumentPatchBuilder
  • com.marklogic.client.document.DocumentMetadataPatchBuilder

The patch builder interface contains value and fragment oriented methods, such as replaceValue and replaceFragment. You can use the *Value methods when the new value is an atomic value, such as a string, number, or boolean. Use the *Fragment methods when the new value is a complex structure, such as an XML element or JSON object or array.

Apply a patch by passing a handle to it to the patch() method of a DocumentManager. The following example sketches construction of a patch using a builder, and then applying the patch to an XML document. The patch inserts a <child/> element as the last child element of the node addressed by the XPath expression /data.

DocumentPatchBuilder xmlPatchBldr = xmlDocMgr.newPatchBuilder();
DocumentPatchHandle xmlPatch = 
    xmlPatchBldr.insertFragment(
        "/data", 
        Position.LAST_CHILD,
        "<child>the last one</child>")
      .build();
xmlDocMgr.patch(docId, xmlPatch);

For detailed instructions, see Basic Steps for Patching Documents and Metadata.

If a patch contains multiple operations, they are applied independently to the target document. That is, within the same patch, one operation does not affect the context path or select path results or the content changes of another. Each operation in a patch is applied independently to every matched node. If any operation in a patch fails with an error, the entire patch fails.

Content transformations are not directly supported in a partial update. However, you can implement a custom replacement content generation function to achieve the same effect. For details, see Construct Replacement Data on the Server.

Basic Steps for Patching Documents and Metadata

This section describes how to create a patch builder, use it to construct a patch descriptor, and then apply the patch. To construct a patch without using a builder, see Construct a Patch From Raw XML or JSON.

For JSON and XML documents, you can use a com.marklogic.client.document.DocumentPatchBuilder to patch content only, content plus metadata, or metadata only. For all document types, you can use a com.marklogic.client.document.DocumentMetadataPatchBuilder to patch metadata only. A DocumentPatchBuilder is also a DocumentMetadataPatchBuilder. Use a DocumentManager subclass such as JSONDocumentManager or GenericDocumentManager to create a patch builder.

When you combine content and metadata updates in the same patch, the patch format (XML or JSON) must match the content type of the patched documents.

Follow this procedure to use a builder to create and apply a patch to the contents of an XML or JSON document, or to the metadata of any type of document.

  1. If you have not already done so, connect to the database, storing the connection in a com.marklogic.client.DatabaseClient object. For example, if using digest authentication:
    DatabaseClient client = DatabaseClientFactory.newClient(
        host, port, new DigestAuthContext(username, password));
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document content you want to access (XML, JSON, binary, or text). This example uses an XMLDocumentManager.
    XMLDocumentManager docMgr = client.newXMLDocumentManager();

    You can only apply content patches to XML and JSON documents.

  3. Create a document patch builder or metadata patch builder using the document manager. For example:
    DocumentPatchBuilder builder = docMgr.newPatchBuilder();
  4. Call the patch builder methods to define insert, replace, replace-insert, and delete operations for the patch. The following example adds an element insertion operation:
    builder.insertFragment("/data", Position.LAST_CHILD,
        "<child>the last one</child>");

    For more details on identifying the target content for an operation, see Defining the Context for a Patch Operation.

  5. Create a handle associated with the patch using DocumentPatchBuilder.build(). For example:
    DocumentPatchHandle handle = builder.build();

    Once you call build(), the patch contents are fixed. Subsequent calls that define additional operations, such as calling insertFragment again, have no effect.

  6. Apply the patch by calling a patch() method on the DocumentManager, with arguments of the document's URI and the handle. For example:
    docMgr.patch(docId, handle);
  7. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method. For example:
    client.release();

Construct a Patch From Raw XML or JSON

This section describes how to create and apply a patch that you construct directly using XML or JSON. To construct a patch using a Java builder, see Basic Steps for Patching Documents and Metadata.

When you construct a patch that modifies both content and metadata, the patch format must match the content type of the target XML or JSON document. When you construct a patch that only modifies metadata, the patch format can use either XML or JSON, and the patch can be applied to the metadata of any type of document (XML, JSON, text, or binary).

For examples of raw patches, see XML Examples of Partial Updates or JSON Examples of Partial Update in the REST Application Developer's Guide.

Follow this procedure to create and apply a raw XML or JSON patch to the contents of an XML or JSON document, or to the metadata of any type of document.

  1. Create a JSON or XML representation of the patch operations, using the tools or library of your choice. For syntax, see XML Patch Reference and JSON Patch Reference in the REST Application Developer's Guide. The following example uses a String representation of a patch that inserts an element in an XML document:
    String xmlPatch = 
        "<rapi:patch xmlns:rapi='http://marklogic.com/rest-api'>" +
          "<rapi:insert context='/data' position='last-child'>" +
            "<child>the last one</child>" +
          "</rapi:insert>" +
        "</rapi:patch>";
  2. If you have not already done so, connect to the database, storing the connection in a com.marklogic.client.DatabaseClient object. For example, if using digest authentication:
    DatabaseClient client = DatabaseClientFactory.newClient(
        host, port, new DigestAuthContext(username, password));
  3. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document content you want to access (XML, JSON, binary, or text). This example uses an XMLDocumentManager.
    XMLDocumentManager docMgr = client.newXMLDocumentManager();

    You can only apply content patches to XML and JSON documents.

  4. Create a handle that implements DocumentPatchHandle and associate your patch with the handle. Set the handle content type appropriately. For example:
    // For an XML patch
    DocumentPatchHandle handle = 
        new StringHandle(xmlPatch).withFormat(Format.XML);
    
    // For a JSON patch
    DocumentPatchHandle handle = 
        new StringHandle(jsonPatch).withFormat(Format.JSON);
  5. Apply the patch by calling a patch() method on the DocumentManager, with arguments of the document's URI and the handle.
    docMgr.patch(docId, handle);
  6. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();
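
Step 1 of the procedure above leaves the choice of tooling open. As an alternative to string concatenation, the following sketch builds the same insert patch with the JDK DOM API and serializes it to a String, which avoids hand-escaping attribute values and text. The class name RawPatchSketch and the method buildInsertPatch are illustrative assumptions, not part of the Java Client API.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class RawPatchSketch {
    static final String RAPI_NS = "http://marklogic.com/rest-api";

    // Build a patch with one insert operation and return it as XML text.
    static String buildInsertPatch(String context, String position,
                                   String childName, String childText)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder().newDocument();

        // The patch and its operations live in the rest-api namespace.
        Element patch = doc.createElementNS(RAPI_NS, "rapi:patch");
        doc.appendChild(patch);
        Element insert = doc.createElementNS(RAPI_NS, "rapi:insert");
        insert.setAttribute("context", context);
        insert.setAttribute("position", position);
        patch.appendChild(insert);

        // The inserted content is in no namespace.
        Element child = doc.createElementNS(null, childName);
        child.setTextContent(childText);
        insert.appendChild(child);

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(
                buildInsertPatch("/data", "last-child", "child", "the last one"));
    }
}
```

The returned String can be wrapped in a StringHandle with Format.XML, as in step 4 of the procedure.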

Defining the Context for a Patch Operation

When you insert, replace, or delete content or metadata, the patch definition must include enough context to tell MarkLogic Server what XML or JSON components to operate on. For example, which XML element or JSON property to modify, where to insert a new element or object, or which element, object, or value to replace.

When you create a patch using a builder, you specify the context through the contextPath and selectPath parameters of builder methods such as DocumentPatchBuilder.insertFragment() or DocumentPatchBuilder.replaceValue(). When you create a patch from raw XML or JSON, you specify the operation context through the context and select XML attributes or JSON properties.

For XML documents, you specify the context using an XPath (XML) expression. The XPath you can use is limited to the subset that can be used to define a path range index. For details, see Path Expressions Usable in Index Definitions in the REST Application Developer's Guide.

For JSON documents, use JSONPath (JSON). The JSONPath you can use has the same limitations as those that apply to XPath. For details, see Introduction to JSONPath and Path Expressions Usable in Index Definitions in the REST Application Developer's Guide.
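
Before sending an XML patch, it can help to check locally how many nodes a context or select path matches, since an operation applies to every matched node. The following sketch uses the JDK XPath API against a local copy of the document. The class and method names are illustrative. Note the assumption involved: the JDK evaluates full XPath 1.0, while the server accepts only the restricted subset described above, so a path that works here is not guaranteed to be accepted in a patch.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ContextPathSketch {
    // Count the nodes an XPath expression selects in a local XML string.
    static int countMatches(String xml, String path) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                path, new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        String doc = "<data><child>a</child><child>b</child></data>";
        // One context node for an insert under /data.
        System.out.println(countMatches(doc, "/data"));
        // Two nodes would be affected by a replace that selects /data/child.
        System.out.println(countMatches(doc, "/data/child"));
    }
}
```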

Example: Replacing Parts of a JSON Document

This example uses patch operations to perform the document transformation shown below. The patch replaces one JSON property with another, replaces the simple value of a property, and replaces the array value of a property.

Before Update

{ "parent": {
    "child1": {
      "grandchild": "value"
    },
    "child2": "simple",
    "child3": [ "av1", "av2" ]
} }

After Update

{ "parent": {
    "child1": {
      "REPLACE1": "REPLACED1"
    },
    "child2": "REPLACED2",
    "child3": [
      "REPLACED3a",
      "REPLACED3b"
    ]
} }

The raw patch that applies these changes is shown below.

{ "patch": [
    { "replace": {
        "select": "/parent/child1",
        "content": { "REPLACE1": "REPLACED1" }
    }},
    { "replace": {
        "select": "/parent/child2",
        "content": "REPLACED2"
    }},
    { "replace": {
        "select": "/parent/array-node('child3')",
        "content": [ "REPLACED3a", "REPLACED3b" ]
    }}
]}

The following code demonstrates how to use the PatchBuilder interface to create the equivalent raw patch. A Jackson ObjectMapper is used to construct the complex replacement values (the object value of child1 and the array value of child3).

JSONDocumentManager jdm = client.newJSONDocumentManager();
DocumentPatchBuilder pb = jdm.newPatchBuilder();
pb.pathLanguage(DocumentPatchBuilder.PathLanguage.XPATH);
ObjectMapper mapper = new ObjectMapper();

pb.replaceFragment("/parent/child1", 
        mapper.createObjectNode().put("REPLACE1", "REPLACED1"));
pb.replaceValue("/parent/child2", "REPLACED2");
pb.replaceFragment("/parent/array-node('child3')", 
        mapper.createArrayNode().add("REPLACED3a").add("REPLACED3b"));
jdm.patch(URI, pb.build());

Example: Patching Metadata

This example demonstrates using a patch builder to modify metadata such as collections, permissions, quality, document properties, and key-value metadata.

Assume a document exists in the database with the following metadata. The document is in one collection, has no document properties or key-value metadata, has default permissions, and has quality 2.

<rapi:metadata uri="/java/doc.json"
    xsi:schemaLocation="http://marklogic.com/rest-api restapi.xsd"
    xmlns:rapi="http://marklogic.com/rest-api"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <rapi:collections>
    <rapi:collection>original</rapi:collection>
  </rapi:collections>
  <rapi:permissions>
    <rapi:permission>
      <rapi:role-name>rest-writer</rapi:role-name>
      <rapi:capability>update</rapi:capability>
    </rapi:permission>
    <rapi:permission>
      <rapi:role-name>rest-reader</rapi:role-name>
      <rapi:capability>read</rapi:capability>
    </rapi:permission>
  </rapi:permissions>
  <prop:properties xmlns:prop="http://marklogic.com/xdmp/property"/>
  <rapi:quality>2</rapi:quality>
  <rapi:metadata-values/>
</rapi:metadata>

The example modifies the metadata to do the following:

  • Add the document to another collection.
  • Set the quality to 3.
  • Add some key-value metadata.
  • Add a new role to the permissions.

The following code builds and applies the patch using a GenericDocumentManager and DocumentMetadataPatchBuilder.

public static void metadataExample() {
    GenericDocumentManager gdm = client.newDocumentManager();
    DocumentMetadataPatchBuilder pb = gdm.newPatchBuilder(Format.XML);

    pb.addCollection("new");
    pb.setQuality(3);
    pb.addMetadataValue("newkey", "newvalue");
    pb.addPermission("newrole", 
                     DocumentMetadataHandle.Capability.READ,
                     DocumentMetadataHandle.Capability.UPDATE);

    gdm.patch(URI, pb.build());
}

After applying the patch, the document has the following metadata. The patch added the new collection, the newrole permission, and the newkey metadata value, and changed the quality from 2 to 3.

<rapi:metadata uri="/java/doc.json"
    xsi:schemaLocation="http://marklogic.com/rest-api restapi.xsd"
    xmlns:rapi="http://marklogic.com/rest-api"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <rapi:collections>
    <rapi:collection>original</rapi:collection>
    <rapi:collection>new</rapi:collection>
  </rapi:collections>
  <rapi:permissions>
    <rapi:permission>
      <rapi:role-name>rest-writer</rapi:role-name>
      <rapi:capability>update</rapi:capability>
    </rapi:permission>
    <rapi:permission>
      <rapi:role-name>newrole</rapi:role-name>
      <rapi:capability>update</rapi:capability>
      <rapi:capability>read</rapi:capability>
    </rapi:permission>
    <rapi:permission>
      <rapi:role-name>rest-reader</rapi:role-name>
      <rapi:capability>read</rapi:capability>
    </rapi:permission>
  </rapi:permissions>
  <prop:properties xmlns:prop="http://marklogic.com/xdmp/property"/>
  <rapi:quality>3</rapi:quality>
  <rapi:metadata-values>
    <rapi:metadata-value key="newkey">newvalue</rapi:metadata-value>
  </rapi:metadata-values>
</rapi:metadata>

You could also use a document type specific document manager to apply the patch. For example, you could use a JSONDocumentManager to create a DocumentPatchBuilder as shown below. The patch builder operations (pb.addCollection, etc.) do not change as a consequence.

JSONDocumentManager jdm = client.newJSONDocumentManager();
DocumentPatchBuilder pb = jdm.newPatchBuilder();
pb.pathLanguage(DocumentPatchBuilder.PathLanguage.XPATH);

// Construct and apply patch as previously shown

Managing XML Namespaces in a Patch

Namespaces potentially impact two parts of a patch operation:

  • The XPath expression(s) that define the context for an operation, such as which nodes to replace or where to insert new content.
  • New or replacement content.

Your patch must include definitions of any namespaces used in these contexts. The way you do so varies, depending on whether or not you use a builder to construct your patch. This section covers the following topics:

Defining Namespaces With a Builder

When you construct a patch with DocumentPatchBuilder, define any namespaces used in XPath context or select expressions by calling DocumentPatchBuilder.setNamespaces(). Such namespace definitions are patch-wide. That is, they apply to all operations in the patch.

Namespaces used in insertion or replacement content can either be patch-wide, as with XPath expressions, or defined inline on content elements.

The patch generated by the builder pre-defines the following namespace aliases for you:

  • xmlns:rapi="http://marklogic.com/rest-api"
  • xmlns:prop="http://marklogic.com/xdmp/property"
  • xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  • xmlns:xs="http://www.w3.org/2001/XMLSchema"

The following example defines three namespace aliases (r, t, and n) and uses them in defining the insertion context and the content to be inserted.

import com.marklogic.client.util.EditableNamespaceContext;
...
// construct a list of namespace definitions
EditableNamespaceContext namespaces = new EditableNamespaceContext();
namespaces.put("r", "http://root.org");
namespaces.put("t", "http://target.org");
namespaces.put("n", "http://new.org");

// add the namespace definitions to the patch
DocumentPatchBuilder builder = docMgr.newPatchBuilder();
builder.setNamespaces(namespaces);

// use the namespace aliases when defining operations
String newElem = "<n:new/>";
builder.insertFragment(
    "/r:root/t:target", Position.LAST_CHILD, newElem);

You can also define the content namespace alias n inline, as shown in the following example:

String newElem = "<n:new xmlns:n=\"http://new.org\"/>";

Defining Namespaces in Raw XML

When you construct a patch directly in XML, define any namespaces used in XPath context or select expressions on the root <patch/> element. Namespace definitions are patch-wide and apply to both XPath expressions and insertion or replacement content.

The <patch /> element must be defined in the namespace http://marklogic.com/rest-api. It is recommended that you use a namespace alias for this namespace so that element and attribute references in your patch that are not namespace qualified do not end up in the http://marklogic.com/rest-api namespace.

The following example defines four namespace aliases, one for the patch (rapi) and three content-specific aliases (r, n, and t). The content-specific aliases are used in defining the insertion context and the content to be inserted.

<rapi:patch xmlns:rapi="http://marklogic.com/rest-api"
    xmlns:r="http://root.org" xmlns:t="http://target.org"
    xmlns:n="http://new.org">
  <rapi:insert context="/r:root/t:target" position="last-child">
    <n:new />
  </rapi:insert>
</rapi:patch>

For more details, see Managing XML Namespaces in a Patch in the REST Application Developer's Guide.
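
The recommendation to use an alias for the http://marklogic.com/rest-api namespace can be checked locally. The following JDK-only sketch parses two variants of a raw patch with a namespace-aware parser and reports which namespace the unqualified content element ends up in. The class and method names, and the two sample patches, are illustrative assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class PatchNamespaceSketch {
    // Return the namespace URI of the first element with the given local
    // name, or null if that element is in no namespace.
    static String namespaceOf(String patchXml, String localName)
            throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(patchXml)));
        Element found =
                (Element) doc.getElementsByTagNameNS("*", localName).item(0);
        return found.getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        // With an alias, the unqualified <child/> stays in no namespace.
        String aliased =
            "<rapi:patch xmlns:rapi='http://marklogic.com/rest-api'>" +
              "<rapi:insert context='/data' position='last-child'>" +
                "<child/>" +
              "</rapi:insert>" +
            "</rapi:patch>";
        // With a default namespace, <child/> is pulled into rest-api.
        String defaulted =
            "<patch xmlns='http://marklogic.com/rest-api'>" +
              "<insert context='/data' position='last-child'>" +
                "<child/>" +
              "</insert>" +
            "</patch>";
        System.out.println(namespaceOf(aliased, "child"));
        System.out.println(namespaceOf(defaulted, "child"));
    }
}
```

In the second variant the inserted element would carry the REST API namespace into the target document, which is usually not what you want.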

Construct Replacement Data on the Server

This section describes using builtin or user-defined XQuery replacement functions to generate the content for a partial update replace or replace-insert operation dynamically on MarkLogic Server.

The builtin functions support simple arithmetic and string manipulation. For example, you can use a builtin function to increment the current value of numeric data or concatenate strings. For more complex operations, create and install a user-defined function.

To create a user-defined replacement function, see Writing a User-Defined Replacement Constructor in the REST Application Developer's Guide. Install your implementation into the modules database associated with your REST Server; for details, see Managing Dependent Libraries and Other Assets.

To apply a builtin or user-defined server-side function to a patch operation when you create a patch with a patch builder, use a DocumentMetadataPatchBuilder.CallBuilder, obtained by calling DocumentMetadataPatchBuilder.call(). The builtin functions are exposed as methods of CallBuilder. The following example adds a replace operation to a patch that multiplies the current data value in child elements by 3.

DocumentPatchBuilder builder = docMgr.newPatchBuilder();
builder.replaceApply("child", builder.call().multiply(3));

To apply the same operation to a raw XML or JSON patch, use the apply XML attribute or JSON property of the operation. The following raw patches are equivalent to the patch produced by the above builder example. For details, see Constructing Replacement Data on the Server in the REST Application Developer's Guide.

XML

<rapi:patch
    xmlns:rapi="http://marklogic.com/rest-api">
  <rapi:replace 
    select="child"
    apply="ml.multiply">3</rapi:replace>
</rapi:patch>

JSON

{"patch": [
  {"replace": {
    "select": "child",
    "apply": "ml.multiply",
    "content": 3
  } }
] }

To apply a user-defined replacement function using a patch builder, first associate the module containing the function with the patch by calling DocumentPatchBuilder.library(), and then apply the function to an operation using one of the CallBuilder.applyLibrary* methods. The following example applies the function my-func in the module namespace http://my/ns, implemented in the XQuery library module installed in the modules database at /my.domain/my-lib.xqy.

DocumentPatchBuilder builder = docMgr.newPatchBuilder();

builder.library("http://my/ns", "/my.domain/my-lib.xqy");
builder.replaceApply("child", builder.call().applyLibrary("my-func"));

When you construct a raw XML or JSON patch, associate the containing library module with the patch using the replace-library patch component, then apply the function to a replace or replace-insert operation using the apply XML attribute or JSON property. The following examples are equivalent to the above builder code. For more details, see Using a Replacement Constructor Function in the REST Application Developer's Guide.

XML

<rapi:patch
    xmlns:rapi="http://marklogic.com/rest-api">
  <rapi:replace-library
    at="/my.domain/my-lib.xqy" 
    ns="http://my/ns" />
  <rapi:replace select="child" apply="my-func"/>
</rapi:patch>

JSON

{"patch": [
  {"replace-library": {
    "at": "/my.domain/my-lib.xqy",
    "ns": "http://my/ns"
  } },
  {"replace": {
    "select": "child",
    "apply": "my-func"
  } }
] }
