XQuery is designed to work well with XML content: it offers many convenient ways to search through XML elements and attributes, and it makes it easy to output XML from an XQuery program. When working with XML, you must understand a little about the XML data model, and one fundamental aspect of the XML data model is namespaces. This chapter describes XML namespaces and why they are important in XQuery.
XML uses qualified names, also called QNames, to uniquely identify elements and attributes. A QName for an XML element or attribute has two parts: the namespace name and the local name. Together, the namespace name and local name uniquely define how the element or attribute is identified. The QName also retains its namespace prefix, if there is one. A namespace prefix is a string that is bound to a namespace URI.
In XML and XQuery, element and attribute nodes are always in a namespace, even if that namespace is the empty namespace (sometimes called no namespace). Each namespace has a uniform resource identifier (URI) associated with it. A URI is essentially a unique string that identifies the namespace. That string can be bound to a namespace prefix, which is a shorthand string used in place of the (usually longer) namespace name. When something is in the empty namespace, the namespace name is the empty string (`""`).
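As a rough illustration of the point that every element is in some namespace, the following sketch compares an element in no namespace with one in a namespace. The element name `note` and the URI `urn:example:notes` are made up for this example.

```xquery
xquery version "1.0-ml";
(: urn:example:notes is a hypothetical namespace URI used only for illustration :)
let $plain := <note>in no namespace</note>
let $named := <note xmlns="urn:example:notes">in a namespace</note>
return (
  (: the namespace name of an element in no namespace is the empty string :)
  fn:string(fn:namespace-uri($plain)) eq "",
  fn:string(fn:namespace-uri($named)) eq "urn:example:notes",
  (: same local name, different namespace name, so the QNames differ :)
  fn:node-name($plain) eq fn:node-name($named)
)
```

This should evaluate to `true`, `true`, `false`: the two elements share a local name but are not the same name in the data model.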
There can also be a default element namespace defined for the module, as described in Declaring a Default Element Namespace in XQuery. Because every element is in a namespace, and because an XPath expression that names a node that does not exist returns the empty sequence, a simple coding error (or even a typographical error) can leave you with a query that is a valid XPath expression but returns the empty sequence. For example, if you have a typographical error in a namespace declaration, then XPath expressions that you expect to return nodes might return the empty sequence instead. Consider the following query against a database with XHTML content:
    xquery version "1.0-ml";
    declare namespace xh="http://www.w3.org/1999/html";
    //xh:p
You might expect this to return all of the XHTML `p` elements in the database, but instead it returns nothing (the empty sequence). If you look closely, though, you will notice that the namespace URI is misspelled (it is missing the `x` in `xhtml`). If you keep in mind that everything is in a namespace, it can help you find many simple XQuery coding errors. The correct version of this query is as follows, and will return all of the XHTML `p` elements:
    xquery version "1.0-ml";
    declare namespace xh="http://www.w3.org/1999/xhtml";
    //xh:p
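When a path unexpectedly returns the empty sequence, one way to check for a namespace mismatch is to list the namespace URIs that the content actually uses and compare them with your declaration. The following is a minimal sketch of that idea; the in-memory `$doc` is a stand-in for a sample document from your database.

```xquery
xquery version "1.0-ml";
(: a small in-memory stand-in for stored XHTML content :)
let $doc :=
  <html xmlns="http://www.w3.org/1999/xhtml">
    <body><p>Hello</p></body>
  </html>
return
  (: collect the distinct namespace names of all elements in the sample :)
  fn:distinct-values(
    for $e in $doc/descendant-or-self::*
    return fn:string(fn:namespace-uri($e))
  )
```

For this sample the result should be the single string `http://www.w3.org/1999/xhtml`. Running a similar expression over an entire database could be expensive, so a single sample document is the safer target.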
This section highlights the difference between the XML data model, used to programmatically access XML content, and the serialized form of XML, used to display the XML in human-readable form.
When an XQuery program accesses XML, it accesses it through the XML data model. The XML data model accesses nodes via their QNames, which are pairs of namespace name and local name. The XML data model does not store namespace prefixes. You can use namespace prefixes to access XML if those prefixes are in scope in your XQuery (that is, if the prefixes are bound to a namespace). In-scope prefixes are a combination of any prefixes bound to a namespace in your query and the predefined namespace prefixes defined in Predefined Namespace Prefixes for Each Dialect.
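To see that node access depends on the namespace name and local name rather than on the prefix, consider the following sketch. The prefixes `a` and `b` and the URI `urn:example:ns` are hypothetical; both prefixes are bound to the same namespace.

```xquery
xquery version "1.0-ml";
(: two different prefixes bound to the same, made-up namespace URI :)
declare namespace a = "urn:example:ns";
declare namespace b = "urn:example:ns";

let $x := <a:root><a:child>1</a:child></a:root>
return (
  (: the b prefix finds the element that was constructed with the a prefix :)
  fn:count($x/b:child),
  (: QName comparison uses only the namespace name and the local name :)
  fn:node-name($x/a:child) eq fn:QName("urn:example:ns", "child")
)
```

This should return `1` followed by `true`.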
The XML data model is aware of XML Schema, and all XML nodes can optionally have XML types (for example, `xs:string`, `xs:dateTime`, `xs:integer`, and so on). When you are creating library functions that might be called from a number of contexts, knowing that XQuery accesses the XML data model can help you to make your code robust. For example, you might have code that explicitly (or implicitly, using the XQuery rules) casts nodes to a particular XML type, enforcing strong typing in your code.
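The following sketch shows one way a function might cast a node explicitly. The function name `local:year` and the element name `published` are illustrative assumptions, not part of any library.

```xquery
xquery version "1.0-ml";
(: hypothetical helper: accepts any node and casts its value to xs:date :)
declare function local:year($n as node()) as xs:integer
{
  (: the cast atomizes the node and raises an error if the value is not a valid date :)
  fn:year-from-date(xs:date($n))
};

local:year(<published>2008-10-01</published>)
```

This should return `2008`. Because the cast fails on content that is not a valid date, malformed input is reported at the point of the cast instead of propagating as an untyped string.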
When XML nodes are transformed from their internal, XML data model representation to a human-readable form, the process is known as XML serialization. A serialized XML node contains all of the namespace information, although some namespace prefixes may or may not be included in the serialization. Serialized XML does not generally contain the type information or the schema information; it is up to the XQuery program to specify a schema for a given XML representation.
When serializing XML, there are five XML reserved characters that are serialized with their corresponding XML entities. These characters cannot appear as content in a serialized XML text node. The following table shows these five characters:
| Character | XML Entity | Name of Character |
|---|---|---|
| `"` | `&quot;` | double quotation mark |
| `&` | `&amp;` | ampersand |
| `'` | `&apos;` | apostrophe |
| `<` | `&lt;` | less-than sign |
| `>` | `&gt;` | greater-than sign |
There are different ways to serialize the same XML content. The way XML content is serialized depends on how the content is constructed, the various namespace declarations in the query, and how the XML content was loaded into MarkLogic Server (for content loaded into a database). In particular, the ampersand character can be tricky to construct in an XQuery string, as it is an escape character to the XQuery parser. The ways to construct the ampersand character in XQuery are:
- Use the XML entity (`&amp;`).
- Use a CDATA section (`<![CDATA[element content here]]>`), which tells the XQuery parser to read the content as character data.

For example, consider the following query:
    xquery version "1.0-ml";
    declare default element namespace "my.namespace.hello";
    <some-element><![CDATA[element content with & goes here]]></some-element>
If you evaluate this query, it returns the following serialization of the specified element:
    <some-element xmlns="my.namespace.hello">element content with &amp; goes here</some-element>
Now consider a similar query that uses a namespace prefix binding instead of the default element namespace declaration:
    xquery version "1.0-ml";
    declare namespace hello="my.namespace.hello";
    <hello:some-element><![CDATA[element content with & goes here]]></hello:some-element>
If you evaluate this query, it returns the following serialization of the specified element:
    <hello:some-element xmlns:hello="my.namespace.hello">element content with &amp; goes here</hello:some-element>
Notice that in both cases, the `&` character is escaped as an XML entity, and in each case an `xmlns` namespace declaration is added to the serialization. In the first example, there is no prefix bound to the namespace, but in the second one there is (because it is declared in the query). Both serializations represent exactly the same XML data model.
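The entity syntax works the same way inside an XQuery string literal. As a small sketch (the `company` element and its content are made up for illustration):

```xquery
xquery version "1.0-ml";
(: in a string literal, the entity reference stands for a single & character :)
let $s := "AT&amp;T"
return (
  fn:string-length($s),
  <company>{$s}</company>
)
```

The string length should be `4`, because the data model holds one `&` character, and the element should serialize as `<company>AT&amp;T</company>`, because serialization escapes that character again.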
To construct the double quotation mark and apostrophe characters within a string quoted with one of these characters (`'` or `"`), you can use the character to escape itself, or you can quote the string with the other quote character, as follows:
    """"   (: returns a single character: " :)
    '"'    (: returns a single character: " :)
    ''''   (: returns a single character: ' :)
    "'"    (: returns a single character: ' :)
As seen in the previous example, XML has a namespace declaration called `xmlns`, which is used to specify namespaces in XML. An `xmlns` namespace declaration looks like an attribute (although it is not actually an attribute). It can either stand by itself or have a prefix appended to it, separated by a colon (`:`) character. Any `xmlns` namespace declaration is inherited by all of its child elements, and if it has a prefix appended to it, the children also inherit the namespace prefix binding.
For example, the following XML serialization specifies that the XHTML namespace is inherited from the root element:
    <html xmlns="http://www.w3.org/1999/xhtml">
      <body><p>This is in the XHTML namespace</p></body>
    </html>
Each of the elements (`html`, `body`, and `p` in this example) is in the XHTML namespace.
Similarly, an `xmlns` namespace declaration with a prefix appended specifies that the prefix binding is inherited by the element's children, as in the following example:
    <html xmlns="http://www.w3.org/1999/xhtml" xmlns:my="my.namespace">
      <body>
        <p>This is in the XHTML namespace</p>
        <my:p>This element is in my.namespace</my:p>
      </body>
    </html>
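To check which namespace each element ends up in, you could construct the same structure in a query and report the namespace name of every element whose local name is `p`. This is only a sketch; the `*:p` name test matches the local name `p` in any namespace.

```xquery
xquery version "1.0-ml";
let $doc :=
  <html xmlns="http://www.w3.org/1999/xhtml" xmlns:my="my.namespace">
    <body>
      <p>This is in the XHTML namespace</p>
      <my:p>This element is in my.namespace</my:p>
    </body>
  </html>
(: *:p matches elements with local name "p" regardless of namespace :)
for $e in $doc//*:p
return fn:concat(fn:local-name($e), " is in ", fn:string(fn:namespace-uri($e)))
```

This should return two strings, `p is in http://www.w3.org/1999/xhtml` and `p is in my.namespace`, showing that the unprefixed element inherited the default namespace and the prefixed one used the inherited prefix binding.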
One other subtlety about default namespaces declared with `xmlns` in constructed elements is that any XPath expression written inside an element constructor that declares a default namespace resolves its unprefixed element names in that default namespace. This can be unexpected if you are trying to write an XPath expression using QNames in no namespace. The following code sample demonstrates how this namespace inheritance in XPath expressions works.
    xquery version "1.0-ml";
    declare namespace foo="foo";
    (: notice the element constructed in $x is in no namespace :)
    let $x := <a><b>hello</b></a>
    return
    (
      <blah xmlns="foo">{$x/b}</blah>,
      <foo:blah>{$x/b}</foo:blah>
    )
    (: Returns:
       <blah xmlns="foo"/>
       <foo:blah xmlns:foo="foo"><b>hello</b></foo:blah>

       Notice how in the first part of the return, the "b" in $x/b
       inherits the namespace from the parent element, which is
       constructed with a default namespace (xmlns="foo"), so it
       returns empty. In the second $x/b, the "b" is in no namespace.
    :)
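If you do want a default namespace on the constructed element, one possible workaround is to evaluate the path outside the constructor and refer to the result by variable. This is a sketch of that approach, not the only option; the variable name `$b` is arbitrary.

```xquery
xquery version "1.0-ml";
let $x := <a><b>hello</b></a>
(: evaluated outside the constructor, so the unprefixed "b" is in no namespace :)
let $b := $x/b
return <blah xmlns="foo">{$b}</blah>
```

Here the `blah` element should contain the `b` element, and `b` should remain in no namespace even though its new parent is in the `foo` namespace.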
There are some other subtleties of namespace inheritance in XML. For more details, see the XML Schema specification (http://www.w3.org/XML/Schema).
An XQuery program can declare a namespace as the default element namespace for any elements that do not have a namespace. By default, the default element namespace is no namespace, which is denoted by the empty string URI (`""`). If you want to define a default element namespace for a query, add a declaration to the prolog similar to the following, which declares the XHTML namespace (`http://www.w3.org/1999/xhtml`) as the default element namespace:
declare default element namespace "http://www.w3.org/1999/xhtml";
An XQuery program that has this prolog declaration will use the XHTML namespace for all elements where a namespace is not explicitly defined (for example, with a namespace prefix).
Declaring a default element namespace is a convenience and a style that some programmers find useful. While it is sometimes convenient (so you do not have to prefix element names, for example), it can also cause confusion in larger programs that use multiple namespaces, so for more complex programming efforts, explicitly defining namespaces is usually clearer.
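As a minimal sketch of the effect, the declaration applies both to unprefixed names in element constructors and to unprefixed element names in path expressions. The `$page` content is made up for this example.

```xquery
xquery version "1.0-ml";
declare default element namespace "http://www.w3.org/1999/xhtml";

(: unprefixed constructors below create elements in the XHTML namespace :)
let $page := <html><body><p>one</p><p>two</p></body></html>
return (
  (: the unprefixed step "p" also resolves to the XHTML namespace :)
  fn:count($page//p),
  fn:string(fn:namespace-uri($page))
)
```

This should return `2` followed by `http://www.w3.org/1999/xhtml`.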
In XML, elements and attributes are uniquely identified by qualified names (QNames, as described in XML QNames, Local Names, and Namespaces). A QName is a pairing of a namespace name and a local name, and it uniquely describes an element or attribute name. XQuery also uses QNames to uniquely identify function names, variable names, and type names.
There are many functions that use QNames in XQuery, and all of the rules for in-scope namespaces apply to constructing those QNames. For example, if the namespace prefix `my` is bound to the namespace URI `my.namespace` in the scope of a query, then the following would construct a QName in that namespace with the local name `some-element`:
xs:QName("my:some-element")
Similarly, you can construct this QName using the `fn:QName` function as follows:
fn:QName("my.namespace", "some-element")
Because a prefix is not specified in the second parameter to the above function, the QName is defined to have a prefix of the empty string (`""`).
Similarly, you can construct this QName with the prefix `my` by using the `fn:QName` function as follows:
fn:QName("my.namespace", "my:some-element")
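The two `fn:QName` calls above differ only in the prefix they carry. The following sketch compares them; the variable names are arbitrary.

```xquery
xquery version "1.0-ml";
let $unprefixed := fn:QName("my.namespace", "some-element")
let $prefixed   := fn:QName("my.namespace", "my:some-element")
return (
  (: equality considers the namespace name and local name, not the prefix :)
  $unprefixed eq $prefixed,
  (: the prefix is still retained on the value that was built with one :)
  fn:prefix-from-QName($prefixed),
  fn:local-name-from-QName($prefixed)
)
```

This should return `true`, `my`, and `some-element`.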
XQuery functions and other language constructs that take a QName can use any in-scope namespace prefixes. For example, the following will construct an `html` element in the XHTML namespace:
    xquery version "1.0-ml";
    declare namespace xh="http://www.w3.org/1999/xhtml";
    element xh:html { "This is in the xhtml namespace." }
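A computed element constructor can also take its name from an expression, which is useful when the name is only known at run time. The following sketch builds the name with `fn:QName`, so no `declare namespace` statement is needed; the local name `title` and the text content are arbitrary choices for illustration.

```xquery
xquery version "1.0-ml";
(: the QName value supplies the namespace name, local name, and prefix :)
let $name := fn:QName("http://www.w3.org/1999/xhtml", "xh:title")
return element { $name } { "Built from a computed QName" }
```

The resulting element should be in the XHTML namespace with the local name `title`.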
This section lists the namespaces that are predefined for each of the dialects supported in MarkLogic Server. When a prefix is predefined, you can use it in your XQuery without the need to define it in a `declare namespace` prolog statement.
The following table lists the namespace prefixes and the corresponding URIs to which they are bound that are predefined in the 1.0-ml XQuery dialect.
The following table lists the namespace prefixes and the corresponding URIs to which they are bound that are predefined in the 1.0 XQuery dialect (strict XQuery 1.0).