MarkLogic Server evaluates XQuery programs against App Servers. This chapter describes ways of controlling the output, both by App Server configuration and with XQuery built-in functions. Primarily, the features described in this chapter apply to HTTP App Servers, although some of them are also valid with XDBC Servers and with the Task Server. This chapter contains the following sections:
A custom HTTP Server error page is a way to redirect application exceptions to an XQuery program. When any 400 or 500 HTTP exception is thrown (except for a 503 error), an XQuery module is evaluated and the results are returned to the client. Custom error pages typically provide more user-friendly messages to the end-user, but because the error page is an XQuery module, you can make it perform arbitrary work.
The XQuery module can get the HTTP error code and the contents of the HTTP response using the xdmp:get-response-code API. The XQuery module for the error handler also has access to the XQuery stack trace, if there is one; the XQuery stack trace is passed to the module as an external variable with the name $error:errors in the XQuery 1.0-ml dialect and as $err:errors in the XQuery 0.9-ml dialect (they are both bound to the same namespace, but the err prefix is predefined in 0.9-ml and the error prefix is predefined in 1.0-ml).
If the error page itself throws an exception, that exception is passed to the client with the error code from the error page. It will also include a stack trace that includes the original error code and exception.
Error messages are thrown with an XML error stack trace that uses the error.xsd schema. The stack trace includes any exceptions thrown, line numbers, and the XQuery version. The stack trace is accessible from custom error pages through the $error:errors external variable. The following is a sample error XML output for an XQuery module that raises an XDMP-CONTEXT error:
<error:error xsi:schemaLocation="http://marklogic.com/xdmp/error error.xsd"
             xmlns:error="http://marklogic.com/xdmp/error"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <error:code>XDMP-CONTEXT</error:code>
  <error:name>err:XPDY0002</error:name>
  <error:xquery-version>1.0-ml</error:xquery-version>
  <error:message>Expression depends on the context where none is defined</error:message>
  <error:format-string>XDMP-CONTEXT: (err:XPDY0002) Expression depends on the context where none is defined</error:format-string>
  <error:retryable>false</error:retryable>
  <error:expr/>
  <error:data/>
  <error:stack>
    <error:frame>
      <error:uri>/blaz.xqy</error:uri>
      <error:line>1</error:line>
      <error:xquery-version>1.0-ml</error:xquery-version>
    </error:frame>
  </error:stack>
</error:error>
To configure a custom error page for an HTTP App Server, enter the name of the XQuery module in the Error Handler field of an HTTP Server. If the path does not start with a slash (/), then it is relative to the App Server root. If it does start with a slash (/), then it follows the import rules described in Importing XQuery Modules, XSLT Stylesheets, and Resolving Paths.
If your App Server is configured to use a modules database (that is, it stores and executes its XQuery code in a database), then you should put an execute permission on the error handler module document. The execute permission is paired with a role, and all users of the App Server must have that role in order to execute the error handler. A user who does not have the role cannot execute the error handler module and gets a 401 (unauthorized) error instead of having the error caught and handled by the error handler.
As a consequence of needing the execute permission on the error handler, if a user who is actually not authorized to run the error handler attempts to access the App Server, that user runs as the default user configured for the App Server until authentication. If authentication fails, then the error handler is called as the default user, but because that default user does not have permission to execute the error handler, the user is not able to find the error handler and a 404 error (not found) is returned. Therefore, if you want all users (including unauthorized users) to have permission to run the error handler, you should give the default user a role (it does not need to have any privileges on it) and assign an execute permission to the error handler paired with that role.
xquery version "1.0-ml";

declare variable $error:errors as node()* external;

xdmp:set-response-content-type("text/plain"),
xdmp:get-response-code(),
$error:errors
This simply returns all of the information from the page that throws the exception. In a typical error page, you would use some or all of the information and make a user-friendly representation of it to display to the users. Because you can write arbitrary XQuery in the error page, you can do a wide variety of things, including sending an email to the application administrator, redirecting the user to a different page, and so on.
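As an illustration of a more user-friendly page, the following is a minimal sketch of an error handler that returns a short HTML page instead of the raw stack trace. It assumes that xdmp:get-response-code returns the status code followed by the status message, and that the error XML has the shape shown in the sample above; the HTML layout is purely illustrative.

```xquery
xquery version "1.0-ml";

(: The stack trace is passed in by the App Server as an external variable. :)
declare variable $error:errors as node()* external;

(: Assumption: the first item is the HTTP status code, the second the message. :)
let $response := xdmp:get-response-code()
let $code := $response[1]
let $message := $response[2]
return (
  xdmp:set-response-content-type("text/html"),
  <html>
    <head><title>Error {$code}</title></head>
    <body>
      <h1>{$code}: {$message}</h1>
      <!-- Show only the one-line summary, not the whole stack trace. -->
      <p>{fn:string(($error:errors//error:format-string)[1])}</p>
    </body>
  </html>
)
```

If there is no stack trace, the summary paragraph is simply empty, so the page still renders.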
This section describes how to use the HTTP Server URL Rewriter feature. For additional information on URL rewriting, see Creating an Interpretive XQuery Rewriter to Support REST Web Services.
You can access any MarkLogic Server resource with a URL, which is a fundamental characteristic of Representational State Transfer (REST) services. In its raw form, the URL must either reflect the physical location of the resource (if a document in the database), or it must be of the form:
Users of web applications typically prefer short, neat URLs to raw query string parameters. A concise URL, also referred to as a 'clean URL,' is easy to remember and less time-consuming to type in. If the URL can be made to relate clearly to the content of the page, then errors are less likely to happen. Also, crawlers and search engines often use the URL of a web page to determine whether or not to index the URL and the ranking it receives. For example, a search engine may give a better ranking to a well-structured URL such as:
In a 'RESTful' environment, URLs should be well-structured, predictable, and decoupled from the physical location of a document or program. When an HTTP server receives an HTTP request with a well-structured, external URL, it must be able to transparently map that to the internal URL of a document or program.
The URL Rewriter feature allows you to configure your HTTP App Server to enable the rewriting of external URLs to internal URLs, giving you the flexibility to use any URL to point to any resource (web page, document, XQuery program and arguments). The URL Rewriter implemented by MarkLogic Server operates similarly to the Apache mod_rewrite module, except that you write an XQuery program to perform the rewrite operation.
The URL rewriting happens through an internal redirect mechanism so the client is not aware of how the URL was rewritten. This makes the inner workings of a web site's address opaque to visitors. The internal URLs can also be blocked or made inaccessible directly if desired by rewriting them to non-existent URLs, as described in Prohibiting Access to Internal URLs.
For information about creating a URL rewriter to directly invoke XSLT stylesheets, see Invoking Stylesheets Directly Using the XSLT Rewriter in the XQuery and XSLT Reference Guide.
If your application code is in a modules database, the URL rewriter needs to have permissions for the default App Server user (nobody by default) to execute the module. This is the same as with an error handler that is stored in the database, as described in Execute Permissions Are Needed On Error Handler Document for Modules Databases.
The examples in this chapter assume you have the Shakespeare plays in the form of XML files loaded into a database. The easiest way to load the XML content into the Documents database is to do the following:
Click Run to run the query.
xquery version "1.0-ml";

import module namespace ooxml = "http://marklogic.com/openxml"
  at "/MarkLogic/openxml/package.xqy";

xdmp:set-response-content-type("text/plain"),
let $zip-file :=
  xdmp:document-get("http://www.ibiblio.org/bosak/xml/eg/shaks200.zip")
return
  for $play in ooxml:package-uris($zip-file)
  where fn:contains($play, ".xml")
  return (
    let $node := xdmp:zip-get($zip-file, $play)
    return xdmp:document-insert($play, $node)
  )
bill, assign it port
bill as the root directory, and Documents as the database.
mac.xqy, that uses the fn:doc function to call the macbeth.xml file in the database:
macbeth.xml (in raw XML format):
url_rewrite.xqy that uses the xdmp:get-request-url function to read the URL given by the user and the fn:replace function to convert the /macbeth portion of the URL to
url_rewrite.xqy script in the bill App Server and specify
Though the URL is converted by the fn:replace function to
/macbeth is displayed in the browser's URL field after the page is opened.
The xdmp:get-request-url function returns the portion of the URL following the scheme and network location (domain name or host_name:port_number). In the above example, xdmp:get-request-url
You can create more elaborate URL rewrite modules, as described in Creating URL Rewrite Modules and Creating an Interpretive XQuery Rewriter to Support REST Web Services.
This section describes how to create simple URL rewrite modules. For more robust URL rewriting solutions, see Creating an Interpretive XQuery Rewriter to Support REST Web Services.
You can use the pattern matching features in regular expressions to create flexible URL rewrite modules. For example, suppose you want the user to only have to enter / after the scheme and network location portions of the URL (for example, http://localhost:8060/) and have it rewritten as /mac.xqy:
xquery version "1.0-ml";

let $url := xdmp:get-request-url()
return fn:replace($url, "^/$", "/mac.xqy")

let $url := xdmp:get-request-url()
return fn:replace($url, "^/product-([0-9]+)\.html$", "/product.xqy?id=$1")

let $url := xdmp:get-request-url()
return fn:replace($url, "^/product/([a-zA-Z0-9_-]+)/([0-9]+)\.html$", "/product.xqy?id=$2")
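To see how such patterns behave, it can help to apply fn:replace to a few literal URLs rather than to the live request. The following sketch uses hypothetical sample URLs (they are not part of the examples above) with the two product patterns. A URL that does not match the pattern is returned unchanged, which is why a rewriter written this way passes unrelated requests through untouched.

```xquery
xquery version "1.0-ml";

(: Hypothetical sample URLs, used only to exercise the patterns. :)
let $by-id   := "^/product-([0-9]+)\.html$"
let $by-name := "^/product/([a-zA-Z0-9_-]+)/([0-9]+)\.html$"
return (
  (: $1 is the numeric id captured by the first group :)
  fn:replace("/product-123.html", $by-id, "/product.xqy?id=$1"),
  (: no match, because "abc" is not numeric: the input comes back as-is :)
  fn:replace("/product-abc.html", $by-id, "/product.xqy?id=$1"),
  (: $2 is the second group; the product name in $1 is discarded :)
  fn:replace("/product/widgets/42.html", $by-name, "/product.xqy?id=$2")
)
(: returns "/product.xqy?id=123", "/product-abc.html", "/product.xqy?id=42" :)
```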
The URL Rewriter feature also enables you to block users from accessing internal URLs. For example, to prohibit direct access to customer_list.html, your URL rewrite script might look like the following:
let $url := xdmp:get-request-url()
return
  (: The dots are escaped so that they match only a literal "." :)
  if (fn:matches($url, "^/customer_list\.html$"))
  then "/nowhere.html"
  else fn:replace($url, "^/price_list\.html$", "/prices.html")
Here, /nowhere.html is a non-existent page, so the server returns a '404 Not Found' error for it. Alternatively, you could redirect to a URL consisting of a random number generated using xdmp:random or some other scheme that is guaranteed to generate non-existent URLs.
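The random-number alternative could be sketched as follows. This is only an illustration: the blocked path is the one from the example above, and a random number makes a collision with a real URL unlikely rather than impossible, so it assumes that no document or module in the application has a purely numeric path.

```xquery
xquery version "1.0-ml";

let $url := xdmp:get-request-url()
return
  if (fn:matches($url, "^/customer_list\.html$"))
  (: Rewrite to "/" followed by a random number, assumed not to exist. :)
  then fn:concat("/", xdmp:random())
  (: Leave every other URL unchanged. :)
  else $url
```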
You may encounter problems when rewriting a URL to a page that makes use of page-relative URLs because relative URLs are resolved by the client. If the directory path of the external URL used by the client differs from the internal URL at the server, then the page-relative links are incorrectly resolved.
If you are going to rewrite a URL to a page that uses page-relative URLs, convert the page-relative URLs to server-relative or canonical URLs. For example, if your application is located in C:\Program Files\MarkLogic\myapp and the page builds a frameset with page-relative URLs, like:
You can use the URL Rewrite trace event to help you debug your URL rewrite modules. To use the URL Rewrite trace event, you must enable tracing (at the group level) for your configuration and set the event:
trace events activated.
ErrorLog.txt file, indicating the URL received from the client and the converted URL from the URL rewriter:
The trace events are designed as development and debugging tools, and they might slow the overall performance of MarkLogic Server. Also, enabling many trace events will produce a large quantity of messages, especially if you are processing a high volume of documents. When you are not debugging, disable the trace event for maximum performance.
An SGML character entity is a name delimited by an ampersand (&) character at the beginning and a semi-colon (;) character at the end. The entity maps to a particular character. This markup is used in SGML, and sometimes is carried over to XML. MarkLogic Server allows you to control whether SGML character entities are output upon serialization of XML, either at the App Server level using the Output SGML Character Entities drop-down list or using the <output-sgml-character-entities> option to the built-in functions xdmp:quote or xdmp:save. When SGML characters are mapped (for an App Server or with the built-in functions), any Unicode characters that have an SGML mapping are output as the corresponding SGML entity. The default is none, which does not output any characters as SGML entities.
isoamsa is an example).
gcedil set is also included (it is not included in the specification).
| SGML Character Mapping Setting | Description |
|---|---|
| | The default. No SGML entity mapping is performed on the output. |
| | Converts Unicode codepoints to SGML entities on output. The conversions are made in the default order. The only difference between |
| | Converts Unicode codepoints to SGML entities on output. The conversions are made in an order that favors math-related entities. The only difference between |
| | Converts Unicode codepoints to SGML entities on output. The conversions are made in an order favoring entities commonly used by publishers. The only difference between |
In general, the <repair>full</repair> option on xdmp:document-load and the "repair-full" option on xdmp:unquote do the opposite of the Output SGML Character Entities settings, as the ingestion APIs map SGML entities to their codepoint equivalents (one or more codepoints). The difference with the output options is that the output options perform only single-codepoint to entity mapping, not multiple-codepoint to entity mapping.
For details, see the MarkLogic XQuery and XSLT Function Reference for these functions.
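As a rough illustration of the per-call option, the following sketch serializes a small element with xdmp:quote and asks for entity mapping. The option value normal and the options namespace xdmp:quote are taken from the function reference rather than from this chapter, so treat them as assumptions to verify against your server version; the sample element is hypothetical.

```xquery
xquery version "1.0-ml";

(: &#233; is the Unicode codepoint for a lowercase e with an acute accent. :)
xdmp:quote(
  <p>caf&#233;</p>,
  <options xmlns="xdmp:quote">
    <!-- Assumed value: request the default-order entity mapping. -->
    <output-sgml-character-entities>normal</output-sgml-character-entities>
  </options>
)
```

With the mapping enabled, the accented character would be expected to serialize as a named entity instead of the literal character; with the default of none it is output as is.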
By default, MarkLogic Server outputs content in utf-8. You can specify a different output encoding, both on an App Server basis and on a per-query basis. This section describes those techniques, and includes the following parts:
You can set the output encoding for an App Server using the Admin Interface or with the Admin API. You can set it to any supported character set (see Collations and Character Sets By Language in the Encodings and Collations chapter of the Search Developer's Guide).
For details, see the MarkLogic XQuery and XSLT Function Reference for these functions.
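For the per-query case, a minimal sketch might look like the following. It assumes the xdmp:set-response-encoding built-in and uses ISO-8859-1 purely as an illustrative encoding name; any value you use must be a character set that the server supports.

```xquery
xquery version "1.0-ml";

(: Assumption: overrides the App Server output encoding for this response only. :)
xdmp:set-response-encoding("ISO-8859-1"),
xdmp:set-response-content-type("text/html"),
<html>
  <body><p>caf&#233;</p></body>
</html>
```

Setting the encoding in the module keeps the App Server default untouched, which is useful when only a few responses need a non-default character set.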
This configuration page allows you to specify defaults that correspond to the XSLT output options (http://www.w3.org/TR/xslt20#serialization) as well as some MarkLogic-specific options. For details on these options, see xdmp:output in the XQuery and XSLT Reference Guide. For details on configuring default options for an App Server, see Setting Output Options for an HTTP Server in the Administrator's Guide.