Application Developer's Guide — Chapter 14

Reusing Content With Modular Document Applications

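This chapter describes how to create applications that reuse content by using XML that includes other content. It contains the following sections:

  • Modular Documents
  • XInclude and XPointer
  • CPF XInclude Application and API
  • Creating XML for Use in a Modular Document Application
  • Setting Up a Modular Document Application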

Modular Documents

A modular document is an XML document that references other documents or parts of other documents for some or all of its content. If you fetch the referenced document parts and place their contents as child elements of the elements in which they are referenced, then that is called expanding the document. If you expand all references, including any references in expanded documents (recursively, until there is nothing left to expand), then the resulting document is called the expanded document. The expanded document can then be used for searching, allowing you to get relevance-ranked results where the relevance is based on the entire content in a single document. Modular documents use the XInclude W3C recommendation as a way to specify the referenced documents and document parts.

Modular documents allow you to manage and reuse content. MarkLogic Server includes a Content Processing Framework (CPF) application that expands the documents based on all of the XInclude references. The CPF application creates a new document for the expanded document, leaving the original documents untouched. If any of the parts are updated, the expanded document is recreated, automatically keeping the expanded document up to date.

The CPF application for modular documents takes care of all of the work involved in expanding the documents. All you need to do is add or update documents in the database that have XInclude references, and then anything under a CPF domain is automatically expanded. For details on CPF, see the Content Processing Framework Guide.

Content can be reused by referencing it in multiple documents. For example, imagine you are a book publisher and you have boilerplate passages such as legal disclaimers, company information, and so on, that you include in many different titles. Each book can then reference the boilerplate documents. If you are using the CPF application, then if the boilerplate is updated, all of the documents are automatically updated. If you are not using the CPF application, you can still update the documents with a simple API call.
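As a minimal sketch of re-expanding a document without the CPF application, the following query calls xinc:node-expand (described later in this chapter) and saves the result under a separate URI. Both URIs are hypothetical, and the query assumes that the expansion of a single document returns a single node:

```xquery
xquery version "1.0-ml";
import module namespace xinc="http://marklogic.com/xinclude"
     at "/MarkLogic/xinclude/xinclude.xqy";

(: Hypothetical URIs, chosen only for illustration. :)
let $source-uri := "/books/title1.xml"
let $target-uri := "/books-expanded/title1.xml"
(: Recursively expand every XInclude reference in the source document. :)
let $expanded := xinc:node-expand(fn:doc($source-uri))
(: Save the result under a separate URI so the source stays untouched. :)
return xdmp:document-insert($target-uri, $expanded)
```

Rerunning a query like this after the boilerplate changes would refresh the expanded copy. Writing to a separate URI mirrors the CPF application, which also leaves the original documents untouched.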

XInclude and XPointer

Modular documents use XInclude and XPointer technologies:

  • XInclude provides a syntax for including XML documents within other XML documents. It allows you to specify a relative or absolute URI for the document to include.
  • XPointer provides a syntax for specifying parts of an XML document. It allows you to specify a node in the document using a syntax based on (but not quite the same as) XPath.

MarkLogic Server supports the XPointer framework, the element() and xmlns() schemes of XPointer, and the xpath() scheme.

The xmlns() scheme is used for namespace prefix bindings in the XPointer framework. The element() scheme is one syntax for specifying which elements to select from the document named in the XInclude href attribute. The xpath() scheme is an alternate syntax (which looks much more like XPath than the element() scheme) for selecting elements from a document.

Each of these schemes is used within an attribute named xpointer. The xpointer attribute is an attribute of the <xi:include> element. If you specify a string corresponding to an idref, then it selects the element with that id attribute, as shown in Example: Simple id.

The examples that follow show XIncludes that use XPointer to select parts of documents:

Example: Simple id

Given a document /test2.xml with the following content:

<el-name>
  <p id="myID">This is the first para.</p>
  <p>This is the second para.</p>
</el-name>

The following selects the element with an id attribute with a value of myID from the /test2.xml document:

<xi:include href="/test2.xml" xpointer="myID" />

The expansion of this <xi:include> element is as follows:

<p id="myID" xml:base="/test2.xml">This is the first para.</p>

Example: xpath() Scheme

Given a document /test2.xml with the following content:

<el-name>
  <p id="myID">This is the first para.</p>
  <p>This is the second para.</p>
</el-name>

The following selects the second p element that is a child of the root element el-name from the /test2.xml document:

<xi:include href="/test2.xml" xpointer="xpath(/el-name/p[2])" />

The expansion of this <xi:include> element is as follows:

<p xml:base="/test2.xml">This is the second para.</p>

Example: element() Scheme

Given a document /test2.xml with the following content:

<el-name>
  <p id="myID">This is the first para.</p>
  <p>This is the second para.</p>
</el-name>

The following selects the second p element that is a child of the root element el-name from the /test2.xml document:

<xi:include href="/test2.xml" xpointer="element(/1/2)" />

The expansion of this <xi:include> element is as follows:

<p xml:base="/test2.xml">This is the second para.</p>

Example: xmlns() and xpath() Scheme

Given a document /test2.xml with the following content:

<pref:el-name xmlns:pref="pref-namespace">
  <pref:p id="myID">This is the first para.</pref:p>
  <pref:p>This is the second para.</pref:p>
</pref:el-name>

The following selects the first pref:p element that is a child of the root element pref:el-name from the /test2.xml document:

<xi:include href="/test2.xml" 
            xpointer="xmlns(pref=pref-namespace)
                      xpath(/pref:el-name/pref:p[1])" />

The expansion of this <xi:include> element is as follows:

<pref:p id="myID" xml:base="/test2.xml"
   xmlns:pref="pref-namespace">This is the first para.</pref:p>

Note that the namespace prefixes for the XPointer must be entered in an xmlns() scheme; it does not inherit the prefixes from the query context.
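To try one of these XPointer examples end to end, a sketch along the following lines could be used. It inserts the /test2.xml document from the xpath() example, inserts a referencing document at the hypothetical URI /main.xml, and then expands it with xinc:node-expand (described later in this chapter):

```xquery
xquery version "1.0-ml";

(: The referenced document, as in the xpath() example above. :)
xdmp:document-insert("/test2.xml",
  <el-name>
    <p id="myID">This is the first para.</p>
    <p>This is the second para.</p>
  </el-name>);

xquery version "1.0-ml";
declare namespace xi="http://www.w3.org/2001/XInclude";

(: "/main.xml" is a hypothetical URI for the referencing document. :)
xdmp:document-insert("/main.xml",
  <document>
    <xi:include href="/test2.xml" xpointer="xpath(/el-name/p[2])"/>
  </document>);

xquery version "1.0-ml";
import module namespace xinc="http://marklogic.com/xinclude"
     at "/MarkLogic/xinclude/xinclude.xqy";

(: Expand the XInclude reference, resolving the XPointer. :)
xinc:node-expand(fn:doc("/main.xml"))
```

Going by the expansion shown in the xpath() example, the returned document element should contain the second p element of /test2.xml in place of the <xi:include> element.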

CPF XInclude Application and API

This section describes the XInclude CPF application code and includes the following parts:

  • XInclude Code and CPF Pipeline
  • Required Security Privileges--xinclude Role

XInclude Code and CPF Pipeline

You can either create your own modular documents application or use the XInclude pipeline in a CPF application. For details on CPF, see the Content Processing Framework Guide. The following are the XQuery libraries and CPF components used to create modular document applications:

  • The XQuery module library xinclude.xqy. The key function in this library is the xinc:node-expand function, which takes a node and recursively expands any XInclude references, returning the fully expanded node.
  • The XQuery module library xpointer.xqy.
  • The XInclude pipeline and its associated actions.
  • You can create custom pipelines based on the XInclude pipeline that use the following <options> to the XInclude pipeline. These options control the expansion of XInclude references for documents under the domain to which the pipeline is attached:
    • <destination-root> specifies the directory in which the expanded versions of documents are saved. This must be a directory path in the database. The expanded document is saved to the URI formed by concatenating this root and the base name of the unexpanded document. For example, if the URI of the unexpanded document is /mydocs/unexpanded/doc.xml, and the destination-root is set to /expanded-docs/, then this document is expanded into a document with the URI /expanded-docs/doc.xml.
    • <destination-collection> specifies the collection in which to put the expanded version. You can specify multiple collections by specifying multiple <destination-collection> elements in the pipeline.
    • <destination-quality> specifies the document quality for the expanded version. This must be an integer value, and higher positive numbers increase the relevance scores for matches against the document, while lower negative numbers decrease the relevance scores. The default quality on a document is 0, which does not change the relevance score.
    • The default is to use the same values as the unexpanded source.
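The URI rule described for <destination-root> can be illustrated with plain XQuery string functions. This is only a sketch of the rule as stated above, not the pipeline's actual code, and it assumes the base name is the part of the URI after the last slash:

```xquery
xquery version "1.0-ml";

(: Values taken from the <destination-root> example above. :)
let $destination-root := "/expanded-docs/"
let $source-uri := "/mydocs/unexpanded/doc.xml"
(: Assumed: the base name is the part of the URI after the last slash. :)
let $base-name := fn:tokenize($source-uri, "/")[fn:last()]
(: Concatenate the root and the base name to get the expanded URI. :)
return fn:concat($destination-root, $base-name)
(: returns "/expanded-docs/doc.xml" :)
```

Because only the base name is kept, two source documents with the same base name in different directories would map to the same expanded URI under this rule.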

Required Security Privileges--xinclude Role

The XInclude code requires the following privileges:

Therefore, any users who will be expanding documents require these privileges. There is a predefined role called xinclude that has the privileges needed to execute this code. To run the XInclude code used in the XInclude CPF application, users must either be assigned the xinclude role or have the above execute privileges.
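One way to assign the role is with the security library function sec:user-add-roles. The following is a minimal sketch in which the user name doc-editor is hypothetical. It assumes the query is evaluated against the Security database by a user with sufficient administrative privileges:

```xquery
xquery version "1.0-ml";
import module namespace sec="http://marklogic.com/xdmp/security"
     at "/MarkLogic/security.xqy";

(: Must be evaluated against the Security database. :)
(: "doc-editor" is a hypothetical user name. :)
sec:user-add-roles("doc-editor", "xinclude")
```

The same assignment can be made in the Admin Interface by editing the user and selecting the xinclude role.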

Creating XML for Use in a Modular Document Application

The basic syntax for using XInclude is relatively simple. For each referenced document, you include an <xi:include> element with an href attribute whose value is the URI of the referenced document. The URI is either relative to the document containing the <xi:include> element or an absolute URI of a document in the database. When the document is expanded, the referenced document replaces the <xi:include> element. This section includes the following parts:

  • <xi:include> Elements
  • <xi:fallback> Elements
  • Simple Examples

<xi:include> Elements

Elements that have references to content in other documents are <xi:include> elements, where xi is bound to the http://www.w3.org/2001/XInclude namespace. Each <xi:include> element has an href attribute, which holds the URI of the included document. The URI can be relative to the document containing the <xi:include> element or an absolute URI of a document in the database.

<xi:fallback> Elements

The XInclude specification has a mechanism to specify fallback content, which is content to use when expanding the document when the XInclude reference is not found. To specify fallback content, you add an <xi:fallback> element as a child of the <xi:include> element. Fallback content is optional, but it is good practice to specify it. As long as the xi:include href attributes resolve correctly, documents without <xi:fallback> elements will expand correctly. If an xi:include href attribute does not resolve correctly, however, and if there are no <xi:fallback> elements for the unresolved references, then the expansion will fail with an XI-BADFALLBACK exception.

The following is an example of an <xi:include> element with an <xi:fallback> element specified:

<xi:include href="/blahblah.xml">
  <xi:fallback><p>NOT FOUND</p></xi:fallback> 
</xi:include>

The <p>NOT FOUND</p> will be substituted when expanding the document with this <xi:include> element if the document with the URI /blahblah.xml is not found.

You can also put an <xi:include> element within the <xi:fallback> element to fall back to some content that is in the database, as follows:

<xi:include href="/blahblah.xml">
  <xi:fallback><xi:include href="/fallback.xml" /></xi:fallback> 
</xi:include>

The previous element says to include the document with the URI /blahblah.xml when expanding the document, and if that is not found, to use the content in /fallback.xml.
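If a document might contain unresolved references without fallback content, the caller can guard the expansion with a try/catch expression. The following is a sketch only. It assumes the exception surfaces XI-BADFALLBACK in either the error:code or the error:name element of the caught error, so it checks both. The replacement <p> element is illustrative:

```xquery
xquery version "1.0-ml";
import module namespace xinc="http://marklogic.com/xinclude"
     at "/MarkLogic/xinclude/xinclude.xqy";
declare namespace error="http://marklogic.com/xdmp/error";

try {
  xinc:node-expand(fn:doc("/test1.xml"))
} catch ($e) {
  (: Assumption: the code appears in error:code or error:name. :)
  if (($e/error:code, $e/error:name) = "XI-BADFALLBACK")
  then <p>Expansion failed: unresolved reference with no fallback.</p>
  (: Any other error is passed on unchanged. :)
  else xdmp:rethrow()
}
```

Rethrowing other errors keeps unrelated failures, such as a missing source document, visible to the caller.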

Simple Examples

The following is a simple example which creates two documents, then expands the one with the XInclude reference:

xquery version "1.0-ml";
declare namespace xi="http://www.w3.org/2001/XInclude";

xdmp:document-insert("/test1.xml", <document>
  <p>This is a sample document.</p>
  <xi:include href="test2.xml"/> 
</document>);

xquery version "1.0-ml";

xdmp:document-insert("/test2.xml", 
  <p>This document will get inserted where 
     the XInclude references it.</p>);

xquery version "1.0-ml";
import module namespace xinc="http://marklogic.com/xinclude" 
     at "/MarkLogic/xinclude/xinclude.xqy"; 

xinc:node-expand(fn:doc("/test1.xml"))

The following is the expanded document returned from the xinc:node-expand call:

<document>
  <p>This is a sample document.</p>
  <p xml:base="/test2.xml">This document will get inserted where 
     the XInclude references it.</p>
</document>

The base URI from the URI of the included content is added to the expanded node as an xml:base attribute.

You can include fallback content as shown in the following example:

xquery version "1.0-ml";
declare namespace xi="http://www.w3.org/2001/XInclude";

xdmp:document-insert("/test1.xml", <document>
  <p>This is a sample document.</p>
  <xi:include href="/blahblah.xml">
    <xi:fallback><p>NOT FOUND</p></xi:fallback>
  </xi:include>
</document>);

xquery version "1.0-ml";

xdmp:document-insert("/test2.xml", 
  <p>This document will get inserted where the XInclude references it.</p>);

xquery version "1.0-ml";

xdmp:document-insert("/fallback.xml", 
  <p>Sorry, no content found.</p>);

xquery version "1.0-ml";
import module namespace xinc="http://marklogic.com/xinclude" 
     at "/MarkLogic/xinclude/xinclude.xqy"; 

xinc:node-expand(fn:doc("/test1.xml"))

The following is the expanded document returned from the xinc:node-expand call:

<document>
  <p>This is a sample document.</p>
  <p xml:base="/test1.xml">NOT FOUND</p>
</document>

Setting Up a Modular Document Application

To set up a modular documents CPF application, you need to install CPF and create a domain under which documents with XInclude links will be expanded. For detailed information about the Content Processing Framework, including procedures for how to set it up and information about how it works, see the Content Processing Framework Guide.

To set up an XInclude modular document application, perform the following steps:

  1. Install Content Processing in your database, if it is not already installed. For example, if your database is named modular, in the Admin Interface click the Databases > modular > Content Processing link. If Content Processing is not installed, the Content Processing Summary page indicates this; in that case, click the Install tab and click Install (you can install it with or without enabling conversion).
  2. Click the Domains link from the left tree menu. Either create a new domain or modify an existing domain to encompass the scope of the documents you want processed with the XInclude processing. For details on domains, see the Content Processing Framework Guide.
  3. Under the domain you have chosen, click the Pipelines link from the left tree menu.
  4. Check the Status Change Handling and XInclude Processing pipelines. You can also attach or detach other pipelines, depending on whether they are needed for your application.

    If you want to change any of the <options> settings on the XInclude Processing pipeline, copy that pipeline to another file, make the changes (make sure to change the value of the <pipeline-name> element as well), and load the pipeline XML file. It will then be available to attach to a domain. For details on the options for the XInclude pipeline, see CPF XInclude Application and API.

  5. Click OK. The Domain Pipeline Configuration screen shows the attached pipelines.

Any documents with XIncludes that are inserted or updated under your domain will now be expanded. The expanded document will have a URI ending in _expanded.xml. For example, if you insert a document with the URI /test.xml, the expanded document will be created with a URI of /test_xml_expanded.xml (assuming you did not modify the XInclude pipeline options).

If there are existing XInclude documents in the scope of the domain, they will not be expanded until they are updated.
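To find existing documents that still need to be updated before they are expanded, a query along these lines could list the documents that contain XInclude references. The directory /mydocs/ is hypothetical and stands in for the scope of your domain:

```xquery
xquery version "1.0-ml";
declare namespace xi="http://www.w3.org/2001/XInclude";

(: "/mydocs/" is a hypothetical directory standing in for the domain scope. :)
for $doc in xdmp:directory("/mydocs/", "infinity")
(: Keep only documents that contain at least one XInclude reference. :)
where fn:exists($doc//xi:include)
return xdmp:node-uri($doc)
```

The query only reports URIs and changes nothing. Each listed document would then need to be updated for the pipeline to expand it.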
