Java Application Developer's Guide — Chapter 2

Document Operations

This chapter describes how to create, delete, write to, and read document content and metadata using the Java API. It describes core methods such as read(), write(), delete(), and so on. These methods have many different signatures for use with more advanced operations such as transactions, transforms, and others. Specific method signatures for calling read(), etc. in these advanced contexts are discussed when a relevant operation is covered.

When working with documents, it is important to keep in mind the difference between a document on your client and a document in the database. In particular, any changes you make to a document's content and metadata on the client do not persist between sessions. Only if you write the document out to the database do your changes persist.

Document Creation

Document creation is not done via a document creation method. When you first write content via a Manager object to a document in the database as identified by its URI, MarkLogic Server creates a document in the database with that URI and content.

To call write(), an application must authenticate as a user with at least one of the rest-writer or rest-admin roles (or as a user with the admin role).

Writing an XML Document To The Database

Note that no changes you make to a document or its metadata persist until you write the document out to the database. Within your application, you are only manipulating it within system memory, and those changes will vanish when the application ends. The database content is constant until and unless a write or delete operation changes it.

The basic steps needed to write a document are:

  1. If you have not already done so, connect to the database, storing the connection in a com.marklogic.client.DatabaseClient object.
    DatabaseClient client = DatabaseClientFactory.newClient(
      host, port, user, password, authType);
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document content you want to access (XML, text, JSON, binary, generic). In this example code, an XMLDocumentManager.
    XMLDocumentManager docMgr = client.newXMLDocumentManager();
  3. Get the document's content. For example, by using an InputStream.
    FileInputStream docStream = new FileInputStream(
                                    "data"+File.separator+filename);
  4. Create a handle associated with the input stream to supply the document's content. How you get content determines which handle you use. Pass the stream to the handle's constructor, as shown here, or use the handle's set() method to associate it with the desired stream.
    InputStreamHandle handle = new InputStreamHandle(docStream);
  5. Write the document's content by calling a write() method on the DocumentManager, with arguments of the document's URI and the handle.
    docMgr.write(docId, handle);
  6. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

Creating a Text Document In the Database

The following procedure outlines a very basic creation operation for a simple text document:

  1. Create a com.marklogic.client.DatabaseClient for the database.
    DatabaseClient client = DatabaseClientFactory.newClient(
        host, port, user, password, authType);
  2. Create a com.marklogic.client.document.DocumentManager object of the appropriate format for your document: text, binary, JSON, XML, or generic if you are not sure.
    TextDocumentManager docMgr = client.newTextDocumentManager();
  3. For convenience's sake, set a variable to your new document's URI. This is not required; the raw string could be used wherever docId is used.
    String docId = "/example/text.txt";
  4. As discussed previously in Handles, within MarkLogic Java applications you use handle objects to contain a document's content and metadata. Since this is a text document, we will use a com.marklogic.client.io.StringHandle to contain the text content. After creation, set the handle's value to the document's initial content.
    StringHandle handle = new StringHandle();
    handle.set("A simple text document");
  5. Write the document content out to the database. This creates the document in the database if it is not already there (if it is already there, it updates the content to whatever is in the handle argument). The identifier for the document is the value of the docId argument.
    docMgr.write(docId, handle);
  6. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

Automatically Generating Document URIs

MarkLogic Server can automatically generate database URIs for documents inserted using the Java API. You can only use this feature to create new documents. To update an existing document, you must know the URI.

To insert a document with a generated URI, use a com.marklogic.client.document.DocumentUriTemplate with DocumentManager.create(), as described by the following procedure.

  1. If you have not already done so, connect to the database, storing the connection in a com.marklogic.client.DatabaseClient object.
    DatabaseClient client = DatabaseClientFactory.newClient(
      host, port, user, password, authType);
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document content you want to access (XML, text, JSON, binary, generic). In this example code, an XMLDocumentManager.
    XMLDocumentManager docMgr = client.newXMLDocumentManager();
  3. Create a DocumentUriTemplate using the document manager. Specify the extension suffix for the URIs created with this template. Do not include a "." separator. The following example creates a template that generates URIs ending with ".xml".
    DocumentUriTemplate template = docMgr.newDocumentUriTemplate("xml");
  4. Optionally, specify additional URI template attributes, such as a database directory prefix and document format. The following example specifies a directory prefix of "/my/docs/".
    template.setDirectory("/my/docs/");
  5. Get the document's content. For example, by using an InputStream.
    FileInputStream docStream = 
        new FileInputStream("data" + File.separator + filename);
  6. Create a handle associated with the input stream to supply the document's content. How you get content determines which handle you use. Pass the stream to the handle's constructor, as shown here, or use the handle's set() method to associate it with the desired stream.
    InputStreamHandle handle = new InputStreamHandle(docStream);
  7. Insert the document into the database by calling a create() method on the DocumentManager, passing in a URI template and the handle. Use the returned DocumentDescriptor to obtain the generated URI.
    DocumentDescriptor desc = docMgr.create(template, handle);
  8. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

Format-Specific Write Capabilities

When inserting or updating a binary document, you can request metadata extraction using BinaryDocumentManager.setMetadataExtraction. For an example, see Writing A Binary Document.

When inserting or updating an XML document, you can request XML repair using XMLDocumentManager.setDocumentRepair.

See the JavaDoc for details.

Document Deletion

To delete one or more documents, call DocumentManager.delete and pass in the URI(s) of the documents.

To delete documents, an application must authenticate as a user with at least one of the rest-writer or rest-admin roles (or as a user with the admin role).

The following example shows how to delete an XML document from the database.

  1. Create a com.marklogic.client.DatabaseClient for connecting to the database.
    DatabaseClient client = DatabaseClientFactory.newClient(
       host, port, user, password, authType);
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document format (XML, text, JSON, or binary).
    XMLDocumentManager docMgr = client.newXMLDocumentManager();
  3. Delete the document(s). For example, the following statement deletes 2 documents:
    docMgr.delete("/example/doc1.xml", "/example/doc2.json");
  4. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

Reading Document Content

Reading requires a handle to access document content.

Note that no changes you make to a document or its metadata persist until you write the document out to the database. Within your application, you are only manipulating it on the client, and those changes will vanish when the application ends. The database content is persistent until and unless a write or delete operation changes it.

If you read content with a stream, you must close the stream when done. If you do not close the stream, HTTP clients do not know that you are finished and there are fewer connections available in the connection pool.
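
The close-when-done advice above can be sketched with try-with-resources, using only the JDK. This is a minimal illustration, not part of the Java API: the class name StreamCloseSketch and the helper readAndClose are hypothetical, and a ByteArrayInputStream stands in for a stream obtained from a read so that the sketch runs without a server.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class StreamCloseSketch {
    // Hypothetical helper: drains the stream and always closes it,
    // even if reading throws, by using try-with-resources.
    static byte[] readAndClose(InputStream in) throws IOException {
        try (InputStream stream = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = stream.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a stream returned by a document read (assumption
        // made for illustration; no database is contacted here).
        InputStream fromDatabase =
            new ByteArrayInputStream("<doc/>".getBytes(StandardCharsets.UTF_8));
        byte[] content = readAndClose(fromDatabase);
        System.out.println(content.length + " bytes read; stream closed");
    }
}
```

Wrapping the stream in the try header means the close happens on every exit path, which is what lets the underlying connection go back to the pool.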

The basic steps to read a document from the database are:

  1. Create a com.marklogic.client.DatabaseClient for connecting to the database.
    DatabaseClient client = DatabaseClientFactory.newClient(
       host, port, user, password, authType);
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document format (XML, text, JSON, or binary).
    XMLDocumentManager docMgr = client.newXMLDocumentManager();
  3. Create a handle to receive the document's content. For information on handles and the wide variety of handle types, see Handles. This example uses a com.marklogic.client.io.DOMHandle object.
    DOMHandle handle = new DOMHandle();
  4. Read the document's content by calling a read() method on the DocumentManager, with arguments of the document's URI and the handle. Here, assume docId contains the document's URI.
    docMgr.read(docId, handle);
  5. Access the content by calling a get() method on the handle. For example, DOMHandle.get returns a W3C Document object. There are many alternatives.
    Document document = handle.get();
  6. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

Writing A Binary Document

To insert or update a binary document, use a handle containing your binary content with com.marklogic.client.document.BinaryDocumentManager. You can use any handle that implements BinaryWriteHandle, such as BytesHandle or FileHandle.

No metadata extraction is performed by default. You can request metadata extraction and specify how it is saved by calling BinaryDocumentManager.setMetadataExtraction().

The following example reads a PNG image from a file named my.png and inserts it into the database as a binary document with URI /example/my.png. During insertion, metadata is extracted from the binary content and saved as document properties.

String docId = "/example/my.png";
String mimetype = "image/png";

BinaryDocumentManager docMgr = client.newBinaryDocumentManager();
docMgr.setMetadataExtraction(MetadataExtraction.PROPERTIES);

docMgr.write(
    docId, 
    new FileHandle().with(new File("my.png")).withMimetype(mimetype)
  );

Reading Content From A Binary Document

There are several ways to read content from a binary document.

To stream binary content, use InputStream as follows:

InputStream byteStream = 
    docMgr.read(docId, new InputStreamHandle()).get();

To buffer the binary content, use a com.marklogic.client.io.BytesHandle object as follows:

byte[] buf = docMgr.read(docId, new BytesHandle()).get();

Or you can read only part of the content by passing a starting byte offset and a length:

BytesHandle handle = new BytesHandle();
buf = docMgr.read(docId, handle, 9, 10).get();

Reading, Modifying, and Writing Metadata

Reading and writing document metadata from and to the database are very similar operations to reading and writing document content. Each requires calling methods on com.marklogic.client.document.DocumentManager. The handle for metadata can be a DocumentMetadataHandle to modify metadata in a POJO, or it can be raw XML or JSON.

You can perform operations on the metadata associated with documents, such as collections, permissions, properties, and quality. This section describes those metadata operations.

Document Metadata

The following are the metadata types in the Java API:

  • COLLECTIONS: Document collections, a non-hierarchical way of organizing documents in the database. For details, see Collections Metadata.
  • PERMISSIONS: Document permissions. For details, see Permissions Metadata.
  • PROPERTIES: Document properties. Property-value pairs associated with the document. For details, see Properties Metadata.
  • QUALITY: Document search quality. Helps determine which documents are of the best quality. For details, see Quality Metadata.

The enum DocumentManager.Metadata enumerates the metadata categories (including ALL). They are described in detail later in this chapter.

Reading Document Metadata

The basic steps needed to read a document's metadata are:

  1. Create a com.marklogic.client.DatabaseClient for connecting to the database.
    DatabaseClient client = DatabaseClientFactory.newClient(
       host, port, user, password, authType);
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document format (XML, text, JSON, or binary).
    XMLDocumentManager docMgr = client.newXMLDocumentManager();
  3. Create a com.marklogic.client.io.DocumentMetadataHandle object, which will receive the document's metadata. Alternately, you can create raw XML or JSON.
    DocumentMetadataHandle metadataHandle = new DocumentMetadataHandle();
  4. If you also want to get the document's content, create a handle to receive it. Note that you need separate handles for a document's content and metadata.
    DOMHandle docHandle = new DOMHandle();
  5. Read the document's metadata by calling a readMetadata() method on the DocumentManager, with an argument of the metadata handle. Note that you can also call read() with an additional argument of a content handle so that it will read the metadata into the metadata handle and the content into the content handle in a single operation. To call read(), an application must authenticate as rest-reader, rest-writer, or rest-admin. Below, docId is a variable containing a document URI.
    //read only the metadata into a handle
    docMgr.readMetadata(docId, metadataHandle); 
    //read metadata and content
    docMgr.read(docId, metadataHandle, docHandle); 
  6. Access the metadata by calling get() methods on the metadata handle. Later sections in this chapter show how to access the other types of metadata.
    DocumentCollections collections = metadataHandle.getCollections();
    Document document = docHandle.get();
  7. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

By default, DocumentManager reads and writes all categories of metadata. To read or write a subset of the metadata categories, configure DocumentManager by calling setMetadataCategories().

Collections Metadata

Collections are a way to organize documents in a database. A collection defines a set of documents in the database. You can set documents to be in any number of collections either at the time the document is created or by updating a document. Searches against collections are both efficient and convenient. For more details on collections, see Collections in the Search Developer's Guide.

The Java API allows you to read and manipulate collections metadata using the com.marklogic.client.io.DocumentMetadataHandle.DocumentCollections. Collections are named by specifying a URI. A collection URI serves as an identifier, and it can be any valid URI.

The code in this section assumes that a DocumentManager object of an appropriate type for the document, docMgr, and a string containing a document URI, docId, have been created, and that the document's metadata has been read into a DocumentMetadataHandle, metadataHandle.

To get the set of collections a document belongs to, do the following:

//Get the set of collections the document belongs to.
DocumentCollections collections = metadataHandle.getCollections();

To check if a collection URI exists in a document's set of collections, do the following:

collections.contains("/collection_name/collection_name2");

To add a document to one or more collections, do the following:

collections.addAll("/shakespeare/sonnets", "/shakespeare/plays");

To remove a document from a collection, do the following:

collections.remove("/shakespeare/sonnets");

To remove a document from all its collections, do the following:

collections.clear();

Properties Metadata

Manipulate properties metadata using the com.marklogic.client.io.DocumentMetadataHandle.DocumentProperties class.

The code in this section assumes a DocumentManager object, docMgr, and a string containing a document's URI, docId, have been created.

To get all of a document's properties metadata, do the following:

DocumentProperties properties = metadataHandle.getProperties();

DocumentProperties objects represent a document's properties as a map.

To check if a document's properties contain a specific property name, do the following:

exists = properties.containsKey("name");

To get a specific property's value, do the following:

value = metadataHandle.getProperties().get("name");

You can add any property names and values you want to a document. To add a new property or change the value of an existing property in a document's metadata, do the following:

metadataHandle.getProperties().put("name", "value");

Quality Metadata

The code in this section assumes a com.marklogic.client.document.DocumentManager object, docMgr, and a string containing a document's URI, docId, have been created.

Quality metadata affects how documents are ranked in searches by providing a multiplier used in calculating each document's score. The default value for quality in the Java API is 0.

To get a document's search quality metadata value do the following:

int quality = metadataHandle.getQuality();

To set a document's search quality value do the following:

metadataHandle.setQuality(3);

Permissions Metadata

Permissions on documents control who can access a document for the capabilities of read, update, insert, and execute. To perform one of these operations on a document, a user must have a role corresponding to the permission for each capability needed. For details on permissions and on the security model in MarkLogic Server, see the Understanding and Using Security Guide.

The code in this section assumes a DocumentManager object, docMgr, and a string containing a document's URI, docId, have been created. Manipulate document permissions using the class com.marklogic.client.io.DocumentMetadataHandle.DocumentPermissions.

MarkLogic Server defines permissions using roles and capabilities.

The allowed values for capabilities are those in the enum com.marklogic.client.io.DocumentMetadataHandle.Capability:

  • EXECUTE - Permission to execute the document.
  • INSERT - Permission to create but not modify or delete the document.
  • READ - Permission to read the document but not modify it.
  • UPDATE - Permission to create, modify, or delete the document, but not to read it.

Roles are assigned to users via the Admin Interface or through other administrative tools, and cannot be assigned via the Java API. You can, however, control permissions on documents as part of their metadata.

To get permissions metadata for a document, do the following:

DocumentPermissions permissions = metadataHandle.getPermissions();

To add a role and its capabilities to a document's permissions, do the following:

metadataHandle.getPermissions().add("app-user", 
   Capability.UPDATE, Capability.READ);

Manipulating Document Metadata In Your Application

A DocumentMetadataHandle represents metadata as a POJO. A DocumentMetadataHandle has several methods for manipulating a document's metadata. That may not be how you want to work with the metadata, however. If you would prefer to work with it as XML, then read it with an XML handle. If you would prefer to work with it as JSON, read it with a JSON handle. A StringHandle can use either XML or JSON, defaulting to XML.

To specify the format in which a StringHandle reads metadata, use setFormat(), as in the following example:

StringHandle metadataHandle = new StringHandle();
metadataHandle.setFormat(Format.JSON);

Writing Metadata

When you are finished modifying metadata, you must write it to the database to persist it. The operations above change only the metadata held on the client; they do not change the metadata for the document in the database. To write the metadata changes to the database, along with the document content, do the following:

InputStreamHandle handle = new InputStreamHandle(docStream);
docMgr.write(docId, metadataHandle, handle);

Working with Temporal Documents

Most document write operations on JSON and XML documents enable you to work with temporal documents. Temporal-aware document inserts and updates are made available through the com.marklogic.client.document.TemporalDocumentManager interface. JSONDocumentManager and XMLDocumentManager implement TemporalDocumentManager.

The TemporalDocumentManager interface exposes methods for creating, updating, and deleting documents in temporal collections.

For more details, see the Temporal Developer's Guide and the JavaDoc in the Java Client API Documentation.

Conversion of Document Encoding

The Java API handles encoding conversions for you, but you have to:

  • know the encoding
  • use the appropriate handle

If you specify the encoding and it turns out to be the wrong encoding, then the conversion will likely not turn out as you expect.
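
To make the effect of a wrong encoding concrete, here is a minimal JDK-only sketch. It does not use the Java API; the class name WrongEncodingSketch and the helper decodeAsLatin1 are hypothetical, and the sample text is an assumption chosen to contain one non-ASCII character.

```java
import java.nio.charset.StandardCharsets;

final class WrongEncodingSketch {
    // Decodes bytes as ISO-8859-1 regardless of how they were really encoded.
    static String decodeAsLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String original = "caf\u00e9"; // "cafe" with an acute accent on the e
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Wrong encoding: the two UTF-8 bytes of the accented letter
        // are decoded as two separate characters.
        String wrong = decodeAsLatin1(utf8);
        // Right encoding: the original text comes back intact.
        String right = new String(utf8, StandardCharsets.UTF_8);

        System.out.println(wrong.equals(original)); // false
        System.out.println(right.equals(original)); // true
    }
}
```

No error is raised in the wrong case; the text is silently corrupted, which is why the encoding has to be known rather than guessed.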

MarkLogic Server stores text, XML, and JSON as UTF-8. In Java, characters in memory and reading streams are UTF-16. The Java API converts characters to and from UTF-8 automatically.

When writing documents to the server, you need to know if they are already UTF-8 encoded. If a document is not UTF-8, you must specify its encoding or you are likely to end up with data that has incorrect characters due to the incorrect encoding. If you specify a non-UTF-8 encoding, the Java API will automatically convert the encoding to UTF-8 when writing to MarkLogic.

When writing characters to or reading characters from a file, Java defaults to the platform's standard encoding. For example, the platform encoding on Linux can differ from the platform encoding on Windows.
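
One way to avoid depending on the platform default is to name the encoding explicitly when opening a file. The following is a minimal JDK-only sketch; the class name ExplicitCharsetSketch, the helper firstLine, and the temporary file are assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class ExplicitCharsetSketch {
    // Reads the first line of a file using a caller-supplied encoding
    // instead of relying on the platform default.
    static String firstLine(Path file, Charset encoding) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, encoding)) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // The default varies by platform and JVM configuration.
        System.out.println("Platform default: " + Charset.defaultCharset());

        Path file = Files.createTempFile("sketch", ".txt");
        try {
            // Write a file that is known to be ISO-8859-1 encoded.
            Files.write(file, "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1));
            // Read it back with the same, explicitly named encoding.
            String line = firstLine(file, StandardCharsets.ISO_8859_1);
            System.out.println(line.equals("caf\u00e9")); // true
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```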

XML supports multiple encodings as defined by the header (called an XML declaration):

<?xml version="1.0" encoding="utf-8"?>

The XML declaration declares a file's encoding. XML parsing tools, including handles, can determine the encoding from this declaration and do the conversion for you.
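
As an illustration of a parser honoring the declared encoding, the following JDK-only sketch parses raw bytes with the standard DocumentBuilder. The class name XmlDeclarationSketch, the helper rootText, and the sample document are assumptions; the sketch shows the general XML parsing behavior, not the internals of any particular handle.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class XmlDeclarationSketch {
    // Parses raw bytes; the parser chooses the charset from the XML declaration.
    static String rootText(byte[] xmlBytes) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
            + "<name>caf\u00e9</name>";
        // The bytes really are ISO-8859-1, matching the declaration.
        byte[] latin1Bytes = xml.getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(latin1Bytes).equals("caf\u00e9")); // true
    }
}
```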

When writing character data to the database, you need to pick an appropriate handle type, depending on your intent and circumstances.

Depending on your application, you may need to be aware that MarkLogic Server normalizes text to precomposed Unicode characters for efficiency. Unicode abstract characters can either be precomposed (one character) or decomposed (two characters). If you write a decomposed Unicode document to MarkLogic Server and then read it back, you will get back precomposed Unicode. Usually, you do not need to care if characters are precomposed or decomposed. This Unicode issue only affects some characters, and many APIs abstract away the difference. For instance, the Java collator treats the precomposed and decomposed forms of a character as the same character. If your application needs to compensate for this difference, you can use java.text.Normalizer; for details, see:

http://docs.oracle.com/javase/6/docs/api/java/text/Normalizer.html
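
A minimal sketch of compensating with java.text.Normalizer follows. It assumes that NFC is the relevant precomposed form (the text above says "precomposed" without naming a normalization form); the class name NormalizerSketch and the helper precompose are hypothetical.

```java
import java.text.Normalizer;

final class NormalizerSketch {
    // Converts text to precomposed form; NFC is assumed here.
    static String precompose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String decomposed = "e\u0301";  // 'e' followed by a combining acute accent
        String precomposed = "\u00e9";  // the same letter as a single character

        // A plain string comparison treats the two forms as different.
        System.out.println(decomposed.equals(precomposed));             // false
        // After normalization they compare equal.
        System.out.println(precompose(decomposed).equals(precomposed)); // true
    }
}
```

Normalizing both sides before comparing is typically enough when an application compares text it wrote with text it later reads back.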

The following table describes possible cases for reading character data with recommended handles to use in each case.

Read Condition: Recommended Handle(s)
If reading binary data: Use BytesHandle, FileHandle, or InputStreamHandle.
If reading character data from the database: BytesHandle, FileHandle, InputStreamHandle, and the XML handles are encoded as UTF-8. StringHandle and ReaderHandle convert to UTF-16.

The following table describes possible cases for writing character data with recommended handles to use in each case.

Write Condition: Recommended Handle(s)
If the data you are writing is a Java string: Use StringHandle; it converts on write from UTF-16 to UTF-8.
If writing binary data: Use BytesHandle, FileHandle, or InputStreamHandle.
If the data you are writing is encoded as UTF-8 and you do not need to modify the data: Use BytesHandle, FileHandle, or InputStreamHandle.
If it is XML that declares an encoding other than UTF-8 in the XML declaration and you do not need to modify the data: Use InputSourceHandle, XMLEventReaderHandle, or XMLStreamReaderHandle; these convert to UTF-8.
If the character data to write is XML that declares the encoding in a prolog and you need to modify the data: Use DOMHandle, SourceHandle, or create a handle class on an open source DOM. For examples of the latter, see JDOMHandle, XOMHandle, or DOM4JHandle in the package com.marklogic.client.extra. All these classes convert to UTF-8.
If the character data to write has a known encoding other than UTF-8 and you don't need to modify the data: Use ReaderHandle and specify the encoding when creating the Reader (as usual in Java); these convert to UTF-8.
If the character data to write is XML with a known but undeclared encoding and you need to modify the data: Use DOMHandle with a DocumentBuilder parsing an InputSource with a specified encoding, as in:

DOMHandle handle = new DOMHandle();
handle.set(
    handle.getFactory().newDocumentBuilder()
        .parse(new InputSource(...reader
            specifying charset ...)));

Or use SourceHandle with a StreamSource on a Reader with a specified encoding, as in:

SourceHandle handle = new SourceHandle();
handle.set(new StreamSource(...
           reader specifying charset
          ...));
If the character data to write is JSON and you need to modify the data: Consider using a JSON library such as Jackson or GSON. See com.marklogic.client.extra.JacksonHandle for an example.
If the character data to write is text other than JSON or XML and you need to modify the data: Consider using a StreamTokenizer with a Reader, or Pattern with a String.
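
The "known but undeclared encoding" case in the table can be sketched with the JDK alone: wrap the bytes in a Reader that names the charset, then parse through an InputSource. The class name UndeclaredEncodingSketch and the helper parse are hypothetical, and the sample document is an assumption. The resulting Document is the kind of object the table suggests passing to a DOMHandle.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class UndeclaredEncodingSketch {
    // Parses XML whose bytes carry no encoding declaration by decoding
    // them with an encoding the caller already knows.
    static Document parse(byte[] xmlBytes, Charset knownEncoding) throws Exception {
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(xmlBytes), knownEncoding)) {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(reader));
        }
    }

    public static void main(String[] args) throws Exception {
        // No XML declaration, and the bytes are ISO-8859-1 rather than UTF-8.
        byte[] bytes = "<name>caf\u00e9</name>".getBytes(StandardCharsets.ISO_8859_1);
        Document doc = parse(bytes, StandardCharsets.ISO_8859_1);
        String text = doc.getDocumentElement().getTextContent();
        System.out.println(text.equals("caf\u00e9")); // true
    }
}
```

Because the parser receives characters rather than bytes, it does not need to detect the encoding itself.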

Partially Updating Document Content and Metadata

The interface com.marklogic.client.document.DocumentPatchBuilder enables you to update a portion of an existing document or its metadata.

Introduction to Content and Metadata Patching

A partial update is an update you apply to a portion of a document or metadata, rather than replacing an entire document or all of the metadata. Examples include inserting an XML element or attribute, or changing the value associated with a JSON property. You can only apply partial content updates to XML and JSON documents. You can apply partial metadata updates to any document type.

Use a partial update to do the following operations:

  • Add, replace, or delete an XML element, XML attribute, or JSON object or array item of an existing document.
  • Add, replace, or delete a subset of the metadata of an existing document. For example, modify a permission or insert a property.
  • Dynamically generate replacement content or metadata on MarkLogic Server using builtin or user-defined functions. For details, see Construct Replacement Data on the Server.

You can apply multiple updates in a single patch, and you can update both content and metadata in the same patch.

A patch is a partial update descriptor, expressed in XML or JSON, that tells MarkLogic Server where to apply an update and what update to apply. Four operations are available in a patch: insert, replace, replace-insert, and delete. (A replace-insert operation functions as a replace, as long as at least one match exists for the target content; if there are no matches, then the operation functions as an insert.)

Patch operations can target XML elements and attributes, JSON property values and array items, and data values. You identify the target of an operation using XPath and JSONPath expressions. When inserting new content or metadata, the insertion point is further defined by specifying the position; for details, see How Position Affects the Insertion Point in the REST Application Developer's Guide. Note that you cannot patch unnamed JSON entities; for details, see Limitations of JSON Path Expressions in the REST Application Developer's Guide.

When applying a patch to document content, the patch format must match the document format: An XML patch for an XML document, a JSON patch for a JSON document. You cannot patch the content of other document types. You can patch metadata for all document types. A metadata-only patch can be in either XML or JSON. A patch that modifies both content and metadata must match the document content type.

You can construct a patch from raw JSON or XML, or using one of the following builder interfaces:

  • com.marklogic.client.document.DocumentPatchBuilder
  • com.marklogic.client.document.DocumentMetadataPatchBuilder

The patch builder interface contains value and fragment oriented methods, such as replaceValue and replaceFragment. You can use the *Value methods when the new value is an atomic value, such as a string, number, or boolean. Use the *Fragment methods when the new value is a complex structure, such as an XML element or JSON object or array.

Apply a patch by passing a handle to it to the patch() method of a DocumentManager. The following example sketches construction of a patch using a builder, and then applying the patch to an XML document. The patch inserts a <child/> element as the last child element of the node addressed by the XPath expression /data.

DocumentPatchBuilder xmlPatchBldr = xmlDocMgr.newPatchBuilder();
DocumentPatchHandle xmlPatch = 
    xmlPatchBldr.insertFragment(
        "/data", 
        Position.LAST_CHILD,
        "<child>the last one</child>")
      .build();
xmlDocMgr.patch(docId, xmlPatch);

For detailed instructions, see Basic Steps for Patching Documents and Metadata.

If a patch contains multiple operations, they are applied independently to the target document. That is, within the same patch, one operation does not affect the context path or select path results or the content changes of another. Each operation in a patch is applied independently to every matched node. If any operation in a patch fails with an error, the entire patch fails.

Content transformations are not directly supported in a partial update. However, you can implement a custom replacement content generation function to achieve the same effect. For details, see Construct Replacement Data on the Server.

Basic Steps for Patching Documents and Metadata

Follow this procedure to use a builder to create and apply a patch to the contents of an XML or JSON document, or to the metadata of any type of document. To construct a patch without using a builder, see Construct a Patch From Raw XML or JSON.

You can combine content and metadata updates in the same patch. When you do so, the patch format must match the content type of the documents. When you construct a patch that only modifies metadata, you can use either XML or JSON as the format.

  1. If you have not already done so, connect to the database, storing the connection in a com.marklogic.client.DatabaseClient object.
    DatabaseClient client = DatabaseClientFactory.newClient(
      host, port, user, password, authType);
  2. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document content you want to access (XML, JSON, binary, or text). This example uses an XMLDocumentManager.
    XMLDocumentManager docMgr = client.newXMLDocumentManager();

    You can only apply content patches to XML and JSON documents.

  3. Create a com.marklogic.client.document.DocumentPatchBuilder using the document manager. If you are patching content, you must specify a Format corresponding to the target document type, either XML or JSON. For example:
    DocumentPatchBuilder builder = docMgr.newPatchBuilder(Format.XML);
  4. Call the patch builder methods to define insert, replace, replace-insert, and delete operations for the patch. The following example adds an element insertion operation:
    builder.insertFragment("/data", Position.LAST_CHILD,
        "<child>the last one</child>");

    For details on identifying the target content for an operation, see Defining the Context for a Patch Operation.

  5. Create a handle associated with the patch using DocumentPatchBuilder.build(). For example:
    DocumentPatchHandle handle = builder.build();

    Once you call build(), the patch contents are fixed. Subsequent calls that define additional operations, such as calling insertFragment again, have no effect.

  6. Apply the patch by calling a patch() method on the DocumentManager, with arguments of the document's URI and the handle.
    docMgr.patch(docId, handle);
  7. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();

Construct a Patch From Raw XML or JSON

This section describes how to create and apply a patch that you construct directly using XML or JSON. To construct a patch using a Java builder, see Basic Steps for Patching Documents and Metadata.

When you construct a patch that modifies both content and metadata, the patch format must match the content type of the target XML or JSON document. When you construct a patch that only modifies metadata, the patch format can use either XML or JSON, and the patch can be applied to the metadata of any type of document (XML, JSON, text, or binary).

For examples of raw patches, see XML Examples of Partial Updates or JSON Examples of Partial Update in the REST Application Developer's Guide.

Follow this procedure to create and apply a raw XML or JSON patch to the contents of an XML or JSON document, or to the metadata of any type of document.

  1. Create a JSON or XML representation of the patch operations, using the tools or library of your choice. For syntax, see XML Patch Reference and JSON Patch Reference in the REST Application Developer's Guide. The following example uses a String representation of a patch that inserts an element in an XML document:
    String xmlPatch = 
        "<rapi:patch xmlns:rapi='http://marklogic.com/rest-api'>" +
          "<rapi:insert context='/data' position='last-child'>" +
            "<child>the last one</child>" +
          "</rapi:insert>" +
        "</rapi:patch>";
  2. If you have not already done so, connect to the database, storing the connection in a com.marklogic.client.DatabaseClient object.
    DatabaseClient client = DatabaseClientFactory.newClient(
      host, port, user, password, authType);
  3. If you have not already done so, use the DatabaseClient object to create a com.marklogic.client.document.DocumentManager object of the appropriate subclass for the document content you want to access (XML, JSON, binary, or text). This example uses an XMLDocumentManager.
    XMLDocumentManager docMgr = client.newXMLDocumentManager();

    You can only apply content patches to XML and JSON documents.

  4. If you represented your patch as XML, create a handle that implements DocumentPatchHandle and associate your patch with the handle. For example:
    DocumentPatchHandle handle = new StringHandle(xmlPatch);
  5. If you represented your patch as JSON, create a handle that implements DocumentPatchHandle, set the handle content format to JSON, and associate your patch with the handle. For example:
    DocumentPatchHandle handle =
        new StringHandle(jsonPatch).withFormat(Format.JSON);
  6. Apply the patch by calling a patch() method on the DocumentManager, with arguments of the document's URI and the handle.
    docMgr.patch(docId, handle);
  7. When finished with the database, release the connection resources by calling the DatabaseClient object's release() method.
    client.release();
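
A raw patch is ordinary XML held in a String, so a malformed patch is otherwise only detected when the server rejects it. The following is a minimal sketch of a local check that runs before the patch is wrapped in a handle. It uses only the JDK XML parser and is not part of the MarkLogic Java API. The class name RawPatchCheck and the method parsePatch are hypothetical helpers introduced for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of the MarkLogic API): checks locally that a
// raw XML patch is well-formed and rooted at a rest-api <patch> element.
final class RawPatchCheck {
    static final String REST_API_NS = "http://marklogic.com/rest-api";

    // The same patch string used in step 1 of the procedure above.
    static final String XML_PATCH =
        "<rapi:patch xmlns:rapi='http://marklogic.com/rest-api'>" +
          "<rapi:insert context='/data' position='last-child'>" +
            "<child>the last one</child>" +
          "</rapi:insert>" +
        "</rapi:patch>";

    // Parses the patch with a namespace-aware parser. Throws
    // IllegalArgumentException if the text is not well-formed XML or if the
    // root element is not <patch> in the rest-api namespace.
    static Document parsePatch(String patch) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(patch)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Patch is not well-formed XML", e);
        }
        Element root = doc.getDocumentElement();
        if (!REST_API_NS.equals(root.getNamespaceURI())
                || !"patch".equals(root.getLocalName())) {
            throw new IllegalArgumentException("Root element is not rest-api patch");
        }
        return doc;
    }

    public static void main(String[] args) {
        Document doc = parsePatch(XML_PATCH);
        // The patch string contains no whitespace between elements, so the
        // first child of the root is the insert operation.
        Element insert = (Element) doc.getDocumentElement().getFirstChild();
        System.out.println(insert.getLocalName() + " at " + insert.getAttribute("context"));
    }
}
```

A check like this only confirms that the patch is well-formed XML with the expected root element. It does not validate the patch operations themselves, which the server still checks when the patch is applied.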

Defining the Context for a Patch Operation

When you insert, replace, or delete content or metadata, the patch definition must include enough context to tell MarkLogic Server what XML or JSON components to operate on. For example, which XML element or JSON property to modify, where to insert a new element or object, or which element, object, or value to replace.

When you create a patch using a builder, you specify the context through the contextPath and selectPath parameters of builder methods such as DocumentPatchBuilder.insertFragment() or DocumentPatchBuilder.replaceValue(). When you create a patch from raw XML or JSON, you specify the operation context through the context and select XML attributes or JSON properties.

For XML documents, you specify the context using an XPath (XML) expression. The XPath you can use is limited to the subset that can be used to define a path range index. For details, see Path Expressions Usable in Index Definitions in the REST Application Developer's Guide.

For JSON documents, use JSONPath (JSON). The JSONPath you can use is subject to the same limitations that apply to XPath. For details, see Introduction to JSONPath and Path Expressions Usable in Index Definitions in the REST Application Developer's Guide.
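
When developing a context or select path for an XML document, it can help to preview locally how many nodes the path matches, because each operation applies to every matched node. The following is a rough sketch that uses the JDK XPath engine. It is not part of the MarkLogic Java API, and the class name ContextPathPreview and the method countMatches are hypothetical. The JDK engine accepts full XPath 1.0, while MarkLogic accepts only the restricted subset described above, so a path that evaluates locally may still be rejected by the server.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper (not part of the MarkLogic API): evaluates a candidate
// context or select path against a local copy of a document.
final class ContextPathPreview {

    // Returns the number of nodes in the given XML text that the path selects.
    static int countMatches(String xml, String path) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                    path,
                    new InputSource(new StringReader(xml)),
                    XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate path: " + path, e);
        }
    }

    public static void main(String[] args) {
        String doc = "<data><child>first</child><child>second</child></data>";
        // A context of /data gives an insert operation a single target.
        System.out.println("/data matches " + countMatches(doc, "/data"));
        // A select path of /data/child would apply a replace to both children.
        System.out.println("/data/child matches " + countMatches(doc, "/data/child"));
    }
}
```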

Example: Replacing Parts of a JSON Document

This example uses patch operations to perform the document transformation shown in the table below. The patch replaces one JSON property with another, replaces the simple value of a property, and replaces the array value of a property.

Before Update:

{ "parent": {
    "child1": {
      "grandchild": "value"
    },
    "child2": "simple",
    "child3": [ "av1", "av2" ]
} }

After Update:

{ "parent": {
    "child1": {
      "REPLACE1": "REPLACED1"
    },
    "child2": "REPLACED2",
    "child3": [
      "REPLACED3a",
      "REPLACED3b"
    ]
} }

The raw patch that applies these changes is shown below.

{ "patch": [
    { "replace": {
        "select": "/parent/child1",
        "content": { "REPLACE1": "REPLACED1" }
    }},
    { "replace": {
        "select": "/parent/child2",
        "content": "REPLACED2"
    }},
    { "replace": {
        "select": "/parent/array-node('child3')",
        "content": [ "REPLACED3a", "REPLACED3b" ]
    }}
]}

The following code demonstrates how to use the DocumentPatchBuilder interface to create a patch equivalent to the raw patch above. A Jackson ObjectMapper is used to construct the complex replacement values (the object value of child1 and the array value of child3).

JSONDocumentManager jdm = client.newJSONDocumentManager();
DocumentPatchBuilder pb = jdm.newPatchBuilder();
pb.pathLanguage(DocumentPatchBuilder.PathLanguage.XPATH);
ObjectMapper mapper = new ObjectMapper();

pb.replaceFragment("/parent/child1", 
        mapper.createObjectNode().put("REPLACE1", "REPLACED1"));
pb.replaceValue("/parent/child2", "REPLACED2");
pb.replaceFragment("/parent/array-node('child3')", 
        mapper.createArrayNode().add("REPLACED3a").add("REPLACED3b"));
jdm.patch(URI, pb.build());

Managing XML Namespaces in a Patch

Namespaces potentially impact two parts of a patch operation:

  • The XPath expression(s) that define the context for an operation, such as which nodes to replace or where to insert new content.
  • New or replacement content.

Your patch must include definitions of any namespaces used in these contexts. The way you do so varies, depending on whether or not you use a builder to construct your patch. This section covers the following topics:

Defining Namespaces With a Builder

When you construct a patch with DocumentPatchBuilder, define any namespaces used in XPath context or select expressions by calling DocumentPatchBuilder.setNamespaces(). Such namespace definitions are patch-wide. That is, they apply to all operations in the patch.

Namespaces used in insertion or replacement content can either be patch-wide, as with XPath expressions, or defined inline on content elements.

The patch generated by the builder pre-defines the following namespace aliases for you:

  • xmlns:rapi="http://marklogic.com/rest-api"
  • xmlns:prop="http://marklogic.com/xdmp/property"
  • xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  • xmlns:xs="http://www.w3.org/2001/XMLSchema"

The following example defines three namespace aliases (r, t, and n) and uses them in defining the insertion context and the content to be inserted.

import com.marklogic.client.util.EditableNamespaceContext;
...
// construct a list of namespace definitions
EditableNamespaceContext namespaces = new EditableNamespaceContext();
namespaces.put("r", "http://root.org");
namespaces.put("t", "http://target.org");
namespaces.put("n", "http://new.org");

// add the namespace definitions to the patch
DocumentPatchBuilder builder = docMgr.newPatchBuilder();
builder.setNamespaces(namespaces);

// use the namespace aliases when defining operations
String newElem = "<n:new/>";
builder.insertFragment(
    "/r:root/t:target", Position.LAST_CHILD, newElem);

You can also define the namespace alias n inline on the content element, as shown in the following example:

String newElem = "<n:new xmlns:n=\"http://new.org\"/>";

Defining Namespaces in Raw XML

When you construct a patch directly in XML, define any namespaces used in XPath context or select expressions on the root <patch/> element. Namespace definitions are patch-wide and apply to both XPath expressions and insertion or replacement content.

The <patch /> element must be defined in the namespace http://marklogic.com/rest-api. It is recommended that you use a namespace alias for this namespace so that element and attribute references in your patch that are not namespace qualified do not end up in the http://marklogic.com/rest-api namespace.

The following example defines four namespace aliases, one for the patch (rapi) and three content-specific aliases (r, n, and t). The content-specific aliases are used in defining the insertion context and the content to be inserted.

<rapi:patch xmlns:rapi="http://marklogic.com/rest-api"
    xmlns:r="http://root.org" xmlns:t="http://target.org"
    xmlns:n="http://new.org">
  <rapi:insert context="/r:root/t:target" position="last-child">
    <n:new />
  </rapi:insert>
</rapi:patch>

For more details, see Managing XML Namespaces in a Patch in the REST Application Developer's Guide.
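
The recommendation to use an alias for the rest-api namespace follows from standard XML namespace scoping: a default namespace declared on the root element is inherited by unqualified descendant elements, including the content you insert. The following sketch illustrates that effect using only the JDK XML parser. The class name PatchNamespaceDemo, the method contentNamespace, and the two patch strings are hypothetical and shown only for comparison.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo (not part of the MarkLogic API): shows which namespace an
// unqualified <child> content element ends up in for two styles of patch.
final class PatchNamespaceDemo {

    // The rest-api namespace is bound to the alias rapi.
    static final String ALIASED =
        "<rapi:patch xmlns:rapi='http://marklogic.com/rest-api'>" +
          "<rapi:insert context='/data' position='last-child'>" +
            "<child/>" +
          "</rapi:insert>" +
        "</rapi:patch>";

    // The rest-api namespace is declared as the default namespace.
    static final String DEFAULTED =
        "<patch xmlns='http://marklogic.com/rest-api'>" +
          "<insert context='/data' position='last-child'>" +
            "<child/>" +
          "</insert>" +
        "</patch>";

    // Returns the namespace URI of the <child> content element in the patch,
    // or null if the element is in no namespace.
    static String contentNamespace(String patch) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(patch)));
            Node child = (Node) XPathFactory.newInstance().newXPath()
                    .evaluate("//*[local-name()='child']", doc, XPathConstants.NODE);
            return child.getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot inspect patch", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("aliased: " + contentNamespace(ALIASED));
        System.out.println("default: " + contentNamespace(DEFAULTED));
    }
}
```

With the alias, the content element stays in no namespace. With the default namespace declaration, the same element is placed in http://marklogic.com/rest-api, which is usually not what the inserted content should carry.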

Construct Replacement Data on the Server

This section describes using builtin or user-defined XQuery replacement functions to generate the content for a partial update replace or replace-insert operation dynamically on MarkLogic Server.

The builtin functions support simple arithmetic and string manipulation. For example, you can use a builtin function to increment the current value of numeric data or concatenate strings. For more complex operations, create and install a user-defined function.

To create a user-defined replacement function, see Writing a User-Defined Replacement Constructor in the REST Application Developer's Guide. Install your implementation into the modules database associated with your REST Server; for details, see Managing Dependent Libraries and Other Assets.

To apply a builtin or user-defined server-side function to a patch operation when you create a patch with a patch builder, use a DocumentMetadataPatchBuilder.CallBuilder, obtained by calling DocumentMetadataPatchBuilder.call(). The builtin functions are exposed as methods of CallBuilder. The following example adds a replace operation to a patch that multiplies the current data value in child elements by 3.

DocumentPatchBuilder builder = docMgr.newPatchBuilder();
builder.replaceApply("child", builder.call().multiply(3));

To apply the same operation to a raw XML or JSON patch, use the apply XML attribute or JSON property of the operation. The following raw patches are equivalent to the patch produced by the above builder example. For details, see Constructing Replacement Data on the Server in the REST Application Developer's Guide.

XML:

<rapi:patch
    xmlns:rapi="http://marklogic.com/rest-api">
  <rapi:replace 
    select="child"
    apply="ml.multiply">3</rapi:replace>
</rapi:patch>

JSON:

{"patch": [
  {"replace": {
    "select": "child",
    "apply": "ml.multiply",
    "content": 3
  } }
] }

To apply a user-defined replacement function using a patch builder, first associate the module containing the function with the patch by calling DocumentPatchBuilder.library(), and then apply the function to an operation using one of the CallBuilder.applyLibrary* methods. The following example applies the function my-func in the module namespace http://my/ns, implemented in the XQuery library module installed in the modules database at /my.domain/my-lib.xqy.

DocumentPatchBuilder builder = docMgr.newPatchBuilder();

builder.library("http://my/ns", "/my.domain/my-lib.xqy");
builder.replaceApply("child", builder.call().applyLibrary("my-func"));

When you construct a raw XML or JSON patch, associate the containing library module with the patch using the replace-library patch component, then apply the function to a replace or replace-insert operation using the apply XML attribute or JSON property. The following examples are equivalent to the above builder code. For more details, see Using a Replacement Constructor Function in the REST Application Developer's Guide.

XML:

<rapi:patch
    xmlns:rapi="http://marklogic.com/rest-api">
  <rapi:replace-library
    at="/my.domain/my-lib.xqy" 
    ns="http://my/ns" />
  <rapi:replace select="child" apply="my-func"/>
</rapi:patch>

JSON:

{"patch": [
  {"replace-library": {
    "at": "/my.domain/my-lib.xqy",
    "ns": "http://my/ns"
  } },
  {"replace": {
    "select": "child",
    "apply": "my-func"
  } }
] }
