
MarkLogic Server 11.0 Product Documentation
Application Developer's Guide
— Chapter 17

Creating a Declarative XML Rewriter to Support REST Web Services

The Declarative XML Rewriter serves the same purpose as the Interpretive XQuery Rewriter described in Creating an Interpretive XQuery Rewriter to Support REST Web Services. The XML rewriter has many more options for affecting the request environment than the XQuery rewriter. However, because it is designed for efficiency, the XML rewriter does not have the expressive power of the XQuery rewriter or access to the system function calls. Instead, a select set of match and evaluation rules is available to support a large set of common cases.

Overview of the XML Rewriter

The XML rewriter is an XML file that contains rules for matching request values and preparing an environment for the request. If all the requested updates are accepted, then the request proceeds with the updated environment; otherwise, an error or warning is logged. The XQuery rewriter can only affect the request URI (path and query parameters). The XML rewriter, on the other hand, can change the content database, modules database, transaction ID, and other settings that would normally require an eval-into in an XQuery application. In some cases (such as requests for static content), the need for XQuery code can be eliminated entirely for that request while still intercepting requests for dynamic content.

The XML rewriter enables XCC clients to communicate on the same port as REST and HTTP clients. You can also execute requests with the same features as XCC but without using the XCC library.

Configuring an App Server to use the XML Rewriter

To use an XML rewriter, specify the XML rewriter (a file with an .xml extension) in the rewriter field of the server configuration of any HTTP server.

For example, the XML rewriter for the App-Services server at port 8000 is located in:

<marklogic-dir>/Modules/MarkLogic/rest-api/8000-rewriter.xml

Input and Output Contexts

The rewriter is invoked with a defined input context. A predefined set of modifications to the context is applied to the output context. These modifications are returned to the request handler for validation and application. The rewriter itself does not directly implement any changes to the input or output contexts.

Input Context

The rewriter input context consists of a combination of matching properties accessible by the match rules described in Match Rules, or global system variables described in System Variables. When a matching rule for a property is evaluated, it produces locally scoped variables for the results of the match, which can be used by child rules.

The properties listed in the table below are available as the context of a match rule. Where "regex" is indicated, the match is done by a regular expression. Otherwise, the match is an equality test on one or more components of the property.

| Property / Description | Will Support Match by |
| --- | --- |
| path | regex |
| param | name [value] |
| HTTP header | name [value] |
| HTTP method | name in list |
| user | name or id |
| default user | is default user |
| execute privilege | in list |

Output Context

The output context consists of values and actions that the rewriter is able (and allowed) to perform. These can be expressed as a set of context values and rewriter commands allowed on those values. Any of the output context properties can be omitted, in which case the corresponding input context is not modified. The simple case is no output from the rewriter and no changes to the input context. For example, if the output specifies a new database ID but it is the same as the current database, then no changes are required. The rewriter will not generate any conflicting output context, but it is ultimately up to the request handler to validate the changes for consistency as well as any other constraints, such as permissions. If the rewriter results in actions that are disallowed or invalid, such as setting to a nonexistent database or rewriting to an endpoint to which the user does not have permissions, then error handling is performed.

The input context properties, external path and external query, can be modified in the output context. There are other properties that can be added to the output context, such as to direct the request to a particular database or to set up a particular transaction, as shown in the table below.

| Property | Description |
| --- | --- |
| path* | Rewritten path component of the URI |
| query* | Rewritten query parameters |
| module-database | Modules database |
| root | Modules root path |
| database | Database |
| eval | True to evaluate the path; false for direct access |
| transaction | Transaction ID |
| transaction mode | Specifies a query or update transaction mode |
| error format | Specifies the error format for server generated errors |

* These are modified from the input context.
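
As a minimal sketch of how several of these output context properties might be requested together, the fragment below combines evaluation rules described later in this chapter. The path prefix, database names, and module path are hypothetical, and the "query" value for the transaction mode is an assumption based on the description in the table above.

```xml
<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
  <!-- Hypothetical route: requests under /reports/ use a different database -->
  <match-path prefix="/reports/">
    <!-- database: content database for the request (assumed name) -->
    <set-database>Reports</set-database>
    <!-- module-database and root: where the endpoint module is resolved -->
    <set-modules-database>Reports-Modules</set-modules-database>
    <set-modules-root>/</set-modules-root>
    <!-- transaction mode: assumed to accept "query" or "update" -->
    <set-transaction-mode>query</set-transaction-mode>
    <!-- path: rewritten path component of the URI (assumed module) -->
    <set-path>/endpoints/reports.xqy</set-path>
    <dispatch/>
  </match-path>
</rewriter>
```

Any property that is not set here keeps its value from the input context, and the request handler still validates the requested changes.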

Regular Expressions (Regex)

A common use case for paths in particular is the concept of "Match and Extract" (or "Match / Capture") using a regular expression.

As is the case with the regular expression rules for the fn:replace XQuery function, only the first (non-overlapping) match in the string is processed and the rest are ignored.

For example, given the path shown below, you may want to match the general form and extract components at the same time, in one expression.

/admin/v2/meters/databases/12345/total/file.xqy

The following path match rule regex matches the above path and also extracts the desired components ("match groups") and sets them into the local context as numbered variables, as shown in the table below.

<match-path matches="/admin/v(.)/([a-z]+)/([a-z]+)/([0-9]+)/([a-z]+)/.+\.xqy">

| Variable | Value |
| --- | --- |
| $0 | /admin/v2/meters/databases/12345/total/file.xqy |
| $1 | 2 |
| $2 | meters |
| $3 | databases |
| $4 | 12345 |
| $5 | total |

The extracted values could then be used to construct output values such as additional query parameters.

No anchors (^ ... $) are used in this example, so the expression could also match a string such as the one below and provide the same results.

somestuff/admin/v2/meters/databases/12345/total/file.xqy/morestuff 

Wherever a rule matches by regex (indicated by the matches attribute), a flags attribute is allowed. Only the "i" flag (case insensitive) is currently supported.
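
As an illustration of the flags attribute, the sketch below applies "i" to an anchored variant of the path rule above, so that /Admin/V2/Meters/... would match as well as /admin/v2/meters/.... The query parameter names are hypothetical.

```xml
<!-- flags="i" makes the regex match case insensitively -->
<match-path matches="^/admin/v(.)/([a-z]+)/" flags="i">
  <!-- $1 and $2 are the first two match groups -->
  <add-query-param name="version">$1</add-query-param>
  <add-query-param name="area">$2</add-query-param>
</match-path>
```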

Match Rules

Match rules control the evaluator execution flow. They are evaluated in several steps:

  1. An Eval is performed on the rule to determine if it is a match
  2. If it is a match, then the rule may produce zero or more "Eval Expressions" (local variables $*,$0 ... $n)
  3. If it is a match then the evaluator descends into the match rule, otherwise the match is considered "not match" and the evaluator continues onto the next sibling.
  4. If this is the last sibling then the evaluator "ascends" to the parent

Descending: When descending a match rule on match the following steps occur:

  1. If "scoped" (attribute scoped=true) the current context (all in-scope user-defined variables and all currently active modification requests) is pushed.
  2. Any Eval Expressions from the parent are cleared ($*,$0..$n) and replaced with the Eval Expressions produced by the matching node.
  3. Evaluation proceeds at the first child node.

Ascending: When Ascending (after evaluating the last of the siblings) the evaluator Ascends to the parent node. The following steps occur:

  1. If the parent was scoped (attribute scoped=true) then the current context is popped and replaced by the context of the parent node. Otherwise the context is left unchanged.
  2. Eval Expressions ($*,$0...) are popped and replaced with the parent's in-scope Eval Expressions.

    (This is unaffected by the scoped attribute. Eval Expressions are always scoped to only the immediate children of the match node that produced them.)

  3. Evaluation proceeds at the next sibling of the parent node.

    Ascending is a rare case and must be avoided, if possible.
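
The descending and ascending steps above can be sketched with a scoped rule. This is an illustrative fragment only: the path, database name, and module names are hypothetical, and it assumes that a dispatch ends evaluation with the modification requests in effect at that point.

```xml
<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
  <!-- scoped="true": the current context is pushed on descent -->
  <match-path prefix="/reports/" scoped="true">
    <!-- Modification request made inside the scope (assumed database name) -->
    <set-database>Reports</set-database>
    <match-method any-of="GET">
      <!-- Assumed to end evaluation while the set-database request is active -->
      <dispatch>/reports/view.xqy</dispatch>
    </match-method>
    <!-- For other methods nothing dispatches: the evaluator ascends, the
         pushed context is popped, and the set-database request is discarded -->
  </match-path>
  <!-- Reached for every request that was not dispatched above -->
  <dispatch>/default.xqy</dispatch>
</rewriter>
```

Without scoped="true", the set-database request would remain in the context after ascending and would also apply to the final dispatch.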

The table below summarizes the match rules. A detailed description of each rule follows.

| Element | Description |
| --- | --- |
| rewriter | Root element of the rewriter rule tree |
| match-accept | Matches on an HTTP Accept header |
| match-content-type | Matches on an HTTP Content-Type header |
| match-cookie | Matches on a cookie |
| match-execute-privilege | Matches on the user's execute privileges |
| match-header | Matches on an HTTP header |
| match-method | Matches on the HTTP method |
| match-path | Matches on the request path |
| match-role | Matches on the user's assigned roles |
| match-string | Matches a string value against a regular expression |
| match-query-param | Matches on a URI parameter (query parameter) |
| match-user | Matches on a user name, id, or default user |

rewriter

Root element of the rewriter rule tree.

Attributes: none

Example:

A simple rewriter that dispatches any request under /home/ to the module gohome.xqy and otherwise passes the request through:

<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
     <match-path prefix="/home/">
           <dispatch>gohome.xqy</dispatch>
     </match-path>
</rewriter>  

match-accept

Matches on the Accept HTTP Header.

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @any-of | list of strings | yes | Matches if the Accept header contains any of the media types specified. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |
| @repeated | boolean | no (default false) | If false, then repeated matches are an immediate error. |

Child Context Modifications

| Variable | Type | Value |
| --- | --- | --- |
| $0 | string | The media types matched, as a string |
| $* | list of strings | The media types matched, as a list of strings |

The match is performed as a case sensitive match against the literal strings of the type/subtype. No evaluation or expansion of subtypes, media ranges, or quality factors is performed.

Example:

Dispatch to /handle-text.xqy if the media types application/xml or text/html are specified in the Accept header.

<match-accept any-of="application/xml text/html">
      <dispatch>/handle-text.xqy</dispatch>
</match-accept>

match-content-type

Matches on the Content-Type HTTP Header.

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @any-of | list of strings | yes | Matches if the Content-Type header contains any of the types specified. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |

Child Context Modifications

| Variable | Type | Value |
| --- | --- | --- |
| $0 | string | The first type matched, as a string |

The match is performed as a case sensitive match against the literal strings of the type/subtype. No evaluation or expansion of subtypes, media ranges, or quality factors is performed.

Example:

Dispatch to /handle-text.xqy if the media types application/xml or text/html are specified in the Content-Type header.

<match-content-type any-of="application/xml text/html">
      <dispatch>/handle-text.xqy</dispatch>
</match-content-type>

match-cookie

Matches on a cookie by name. Cookies are an HTTP Header with a well-known structured format.

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @name | string | yes | Matches if the cookie of the specified name exists. Cookie names are matched in a case-insensitive manner. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |

Child Context Modifications

| Variable | Type | Value |
| --- | --- | --- |
| $0 | string | The text value of the matching cookie |

Example:

Set the variable $session to the value of the SESSIONID cookie, if it exists:

<match-cookie name="SESSIONID">
       <set-var name="session">$0</set-var>
       .... 
</match-cookie>

match-execute-privilege

Match on the user's execute privileges.

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @any-of | list of uris | no* | Matches if the user has at least one of the specified execute privileges. |
| @all-of | list of uris | no* | Matches if the user has all of the specified execute privileges. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |

* Exactly one of @any-of or @all-of is required.

The execute privilege must be the URI not the name. See the example.

Child Context modifications:

| Variable | Type | Value |
| --- | --- | --- |
| $0 | string | The matching privileges. For more than one match, it is converted to a space delimited string. |
| $* | list of strings | All of the matching privileges, as a list of strings |

Example:

Dispatches if the user has either the admin-module-read or admin-ui privilege.

<match-execute-privilege 
     any-of="http://marklogic.com/xdmp/privileges/admin-module-read 
             http://marklogic.com/xdmp/privileges/admin-ui">
     <dispatch/>
</match-execute-privilege>

In the XML format, you can use newlines in the attribute value.

match-header

Match on an HTTP Header

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @name | string | yes | Matches if a header exists equal to the name. HTTP header names are matched with a case insensitive string equals. |
| @value | string | no | Matches if a header exists with the name and value. The name is compared case insensitively; the value is case sensitive. |
| @matches | regex | no | Matches by regex. |
| @flags | string | no | Optional regex flags. "i" for case insensitive. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |
| @repeated | boolean | no (default false) | If false, then repeated matches are an error. |

Only one of @value or @matches is allowed but both may be omitted.

Child Context modifications:

If @value is specified, then $0 is set to the matching value.

If there is no @matches or @value attribute, then $0 is the entire text content of the header of that name. If more than one header matches, then @repeated indicates whether this is an error or allowed. If allowed (true), then $* is set to each individual value and $0 to the space delimited concatenation of all headers. If false (default), multiple matches generate an error.

If @matches is specified then, as with match-path and match-string, $0 .. $N are the results of the regex match.

| Variable | Type | Value |
| --- | --- | --- |
| $0 | string | The value of the matched header |
| $1 ... $N | string | Each matching group |
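
The repeated-header case described above might look like the following sketch. The header name X-Tenant and the query parameter name are hypothetical.

```xml
<!-- No @value or @matches: the rule matches on the header name alone.
     repeated="true" allows more than one X-Tenant header. -->
<match-header name="X-Tenant" repeated="true">
  <!-- $* holds each individual header value; add-query-param adds one
       parameter per entry when its expression is a list -->
  <add-query-param name="tenant">$*</add-query-param>
</match-header>
```

With repeated left at its default of false, a request carrying two such headers would be an error instead.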

Example:

Adds a query parameter if the User-Agent header contains Chrome/78.0:

<match-header name="User-Agent" matches="Chrome/78\.0">
     <add-query-param name="do-Chrome">yes</add-query-param>
     ...
</match-header>

match-method

Match on the HTTP Method

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @any-of | list of strings | yes | Matches if the HTTP method is one of the values in the list. Method names are matched case sensitively. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |

At least one method name must be specified.

Child Context modifications: none

The value of the HTTP method is a system global variable, $_method, as described in System Variables.

Example:

Dispatches if the method is GET, HEAD, or OPTIONS and the user has the execute privilege http://marklogic.com/xdmp/privileges/manage:

<match-method any-of="GET HEAD OPTIONS">
     <match-execute-privilege 
         any-of="http://marklogic.com/xdmp/privileges/manage">
         <set-path>/history/endpoints/resources.xqy</set-path>
         <dispatch/>
     </match-execute-privilege>
</match-method>

match-path

Match on the request path. The "path" refers to the path component of the request URI as per RFC 3986 [https://tools.ietf.org/html/rfc3986]. Simply put, this is the part of the URL after the scheme and authority sections, starting with a "/" (even if none was given), up to but not including the query parameter separator "?" and not including any fragment ("#").

The path is NOT URL decoded for the purposes of match-path, but query parameter values are decoded (as per HTTP specifications). This is intentional, so that path components can contain what would otherwise be considered path component separators. The HTTP specifications make clear that path components are to be encoded only when the purpose of the path would be ambiguous without encoding. Characters in a path are therefore supposed to be URL encoded only when they are intended NOT to be treated as path separators (or reserved URL characters).

For example, the URL:

http://localhost:8040/root%2Ftestme.xqy?name=%2Ftest

is received by the server as the HTTP request:

GET /root%2Ftestme.xqy?name=%2Ftest

This is parsed as:

PATH:  /root%2Ftestme.xqy

Query (name/value pairs decoded) : ( "name" , "/test" )

A match-path can be used to distinguish this from a URL such as:

http://localhost:8040/root/testme.xqy?name=%2Ftest

Which would be parsed as:

PATH:  /root/testme.xqy

For example, <match-path matches="/root([^/].*)"> would match the first URL but not the second, even though they would decode to the same path.

When match results are placed into $0..$n, the default behavior is to decode the results, so that in the above case $1 would be "/testme.xqy". This keeps the results consistent with other values, which are also in decoded form. In particular, when a value is set as a query parameter, it is URL encoded as part of the rewriting. If the value were already in encoded form, it would be encoded twice, resulting in the wrong value.

In the (rare) case where match-path should not decode the results after a match, set the @url-decode attribute to false.
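
A minimal sketch of disabling decoding, using the @url-decode spelling from the attributes table in this section. The variable name raw-tail is hypothetical.

```xml
<!-- For the request path /root%2Ftestme.xqy -->
<match-path matches="/root([^/].*)" url-decode="false">
  <!-- With decoding disabled, $1 would stay "%2Ftestme.xqy"
       rather than the decoded "/testme.xqy" -->
  <set-var name="raw-tail">$1</set-var>
</match-path>
```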

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @matches | regex | no* | Matches if the path matches the regular expression. |
| @prefix | string | no* | Matches if the path starts with the prefix (literal string). |
| @flags | string | no* | Optional regex flags. "i" for case insensitive. |
| @any-of | list of strings | no* | Matches if the path is one of the list of exact matches. |
| @url-decode | boolean | no (default true) | If true (default), then results are URL decoded after being extracted from the matching part of the path. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |

* Only one of @matches, @prefix, or @any-of is allowed.

If supplied, @matches, @prefix, or @any-of must be non-empty.

@flags applies only to @matches (not @prefix or @any-of).

To match all paths, omit @matches, @prefix, and @any-of.

To match an empty path, use matches="^$" (not matches="", which will match anything).
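
The three ways of selecting paths can be sketched side by side. The paths and module names below are hypothetical.

```xml
<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
  <!-- @any-of: exact match against a list of paths -->
  <match-path any-of="/ /index.html">
    <dispatch>/home.xqy</dispatch>
  </match-path>
  <!-- @prefix: literal prefix match, no regex involved -->
  <match-path prefix="/static/">
    <set-eval>direct</set-eval>
    <dispatch/>
  </match-path>
  <!-- No @matches, @prefix, or @any-of: matches all remaining paths -->
  <match-path>
    <dispatch>/not-found.xqy</dispatch>
  </match-path>
</rewriter>
```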

Child Context modifications:

| Variable | Type | Value |
| --- | --- | --- |
| $0 | string | The entire portion of the path that matched. For @matches this is the full matching text. For @prefix this is the prefix pattern. For @any-of this is whichever string in the list matched. |
| $1 ... $N | string | Only for @matches. The value of the numeric match group as defined by the XQuery function fn:replace(). |

Example:

<match-path
    matches="^/manage/(v2|LATEST)/meters/labels/([^/?&amp;]+)/?$">
       <add-query-param name="version">$1</add-query-param>
       <add-query-param name="label-id">$2</add-query-param>
       <set-path>/history/endpoints/labels-item.xqy</set-path>
       ...
</match-path>

match-query-param

Match on a query parameter.

Query parameters can be matched exactly (by name and value equality) or partially (by specifying only a name). For exact matches, only one name/value pair is matched. For partial matches, it is possible to match multiple name/value pairs when the query string has multiple parameters with the same name. The repeated attribute specifies whether this is an error. The default (false) indicates that repeated matching parameters are an error.

Attributes:

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @name | string | yes | Matches if a query parameter exists with the name. |
| @value | string | no | Matches if a query parameter exists with the name and value. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |
| @repeated | boolean | no (default false) | If false, then repeated matches are an immediate error. |

A @value attribute that is present but empty is valid, and matches a query parameter that is present with an empty value.

Child Context modifications:

| Variable | Type | Value |
| --- | --- | --- |
| $0 | string | The value(s) of the matched query parameter. If the query parameter has multiple values that matched (due to multiple parameters of the same name), then the matched values are converted to a space delimited string. |
| $* | list of strings | A list of all the matched values, as in $0 except as a list of strings |

Example:

If the query parameter user has the value "admin" and the user has the execute privilege http://marklogic.com/xdmp/privileges/manage, then dispatch to /admin.xqy.

<match-query-param name="user" value="admin">
      <match-execute-privilege 
           any-of="http://marklogic.com/xdmp/privileges/manage">
           <dispatch>/admin.xqy</dispatch>
      </match-execute-privilege>
</match-query-param>    

If the query parameter contains a transaction then set the transaction ID.

<match-query-param name="transaction">
      <set-transaction>$0</set-transaction> 
      ...
</match-query-param>

Test for the existence of an empty query parameter.

For the URI: /query.xqy?a=has-value&b=&c=cvalue

This rule will set the value of "b" to "default" if it is empty.

<match-query-param name="b" value="">
      <set-query-param name="b" value="default"/> 
</match-query-param>

See match-string for an example of multiple query parameters with the same name.
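
As a sketch of the partial (name only) match with repeated parameters, consider a hypothetical request such as /search.xqy?tag=a&tag=b. The variable and parameter names are illustrative.

```xml
<!-- repeated="true" allows several "tag" parameters without an error -->
<match-query-param name="tag" repeated="true">
  <!-- $0 is the space delimited form of the matched values: "a b" -->
  <set-var name="all-tags">$0</set-var>
  <!-- $* is the list form; one "label" parameter is added per value -->
  <add-query-param name="label">$*</add-query-param>
</match-query-param>
```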

match-role

Match on the user's assigned roles.

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @any-of | list of role names (strings) | no* | Matches if the user has at least one of the specified roles. |
| @all-of | list of role names (strings) | no* | Matches if the user has all of the specified roles. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |

* Exactly One of @any-of or @all-of is required

Child Context modifications:

| Variable | Type | Value |
| --- | --- | --- |
| $0 | string | For @any-of, the first role that matched. Otherwise unset (if @all-of matched, the matching roles are already known: they are the contents of @all-of). |

Example:

Matches if the user has both of the roles infostudio-user AND rest-user

<match-role all-of="infostudio-user rest-user">
       ...
</match-role>
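
For comparison, a sketch of the @any-of form, which sets $0 to the first role that matched. The role names and the query parameter name are hypothetical.

```xml
<!-- Matches if the user has at least one of the listed roles -->
<match-role any-of="editor reviewer">
  <!-- $0 is the first role that matched -->
  <add-query-param name="matched-role">$0</add-query-param>
</match-role>
```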

match-string

Matches a string expression against a regular expression. If the value matches then the rule succeeds and its children are descended.

This rule is intended to fill in gaps where the other rules are not sufficient, or where adding regular expression matching to every rule would be overly complex. Avoid using this rule unless it is absolutely necessary.

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @value | string | yes | The value to match against. May be a literal string or a single variable expression. |
| @matches | regex string | yes | Matches if the value matches the regular expression. |
| @flags | string | no | Optional regex flags. "i" for case insensitive. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |
| @repeated | boolean | no (default false) | If false, then repeated matches are an error. |

Child Context modifications:

| Variable | Type | Value |
| --- | --- | --- |
| $0 | string | The entire portion of the value that matched |
| $1 ... $N | string | The value of the numeric match group |

See Regular Expressions (Regex).

Repeated matches: Regular expressions can match multiple non-overlapping portions of a string, if the regex is not anchored to the begin and end of the string.
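
A minimal sketch of match-string applied to a single variable expression. The cookie name, the assumed value shape (letters, a hyphen, then digits), and the query parameter names are all hypothetical.

```xml
<!-- @value is a single variable expression; @matches is anchored so that
     at most one match is possible -->
<match-string value="$_cookie.SESSIONID" matches="^([a-z]+)-([0-9]+)$" flags="i">
  <!-- $1 and $2 are the regex match groups -->
  <add-query-param name="session-kind">$1</add-query-param>
  <add-query-param name="session-number">$2</add-query-param>
</match-string>
```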

match-user

Match on a user name or default user.

To match a user by name, use @name.

To match if the user is the default user, use @default-user.

You can match both by supplying both @name and @default-user.

@default-user defaults to false.

If @default-user is false, no check is made for whether the user is the default user.

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @name | string | no* | Matches the user name. |
| @default-user | boolean (true or false) | no* (default false) | If true, matches if the user is the default user. If false, only @name is checked against the user name; no default user check is made. |
| @scoped | boolean | no (default false) | Indicates this rule creates a new "scope" context for its children. |

Child Context modifications: None. The user name and default user status are available as system variables; see System Variables.

Examples:

Matches the default user (note that there is no need to specify the @name attribute in this case):

<match-user default-user="true">
  ...
</match-user>

Matches the non-default user grace:

<match-user default-user="false" name="grace" >
  ...
</match-user>

System Variables

This section describes the predefined system variables that compose the initial input context. These are available in the context of any variable substitution expression.

System variables are used to substitute for the mechanism used by the XQuery rewriter which can get this information (and much more) by calling any of our XQuery APIs. The Declarative rewriter does not expose any API calls so in cases where the values may be needed in outputs they are made available as global variables. There is some overlap in these variables and the Match rules to simplify the case where you simply need to set a value but don't need to match on it. For example the set-database rule may want to set the database to the current modules database (to allow GET and PUT operations on module files). By supplying a system variable for the modules database ($_modules-database) there is no need for a matching rule on modules-database for the sole purpose of extracting the value.

System variables use a reserved prefix "_" to avoid accidental use in current or future code if new variables are added. Overwriting a system variable affects only the current scope and does not produce changes to the system.

The period (".") is a convention that suggests the idea of property access, but it is really just part of the variable name. Where variables start with the same prefix but have ".<name>" as a suffix, the convention is that the name without the dot evaluates to the most useful value and the name with the dot specifies a particular property or type for that variable. For example, $_database is the database name and $_database.id is the database ID.

As noted in Variables and Types, the actual type of all variables is a String (or List of String). The Type column in the table below indicates what range of values is possible for that variable. For example, a database ID originates as an unsigned long, so it can be safely used in any expression that expects a number.
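
As a sketch of the dot naming convention, the fragment below passes both forms of two system variables from the table below to a hypothetical endpoint. The path, module, and query parameter names are illustrative.

```xml
<match-path prefix="/whoami">
  <!-- Name without the dot: the most useful value (the name) -->
  <add-query-param name="user">$_user</add-query-param>
  <add-query-param name="database">$_database</add-query-param>
  <!-- Name with the .id suffix: the numeric ID, still passed as a string -->
  <add-query-param name="user-id">$_user.id</add-query-param>
  <add-query-param name="database-id">$_database.id</add-query-param>
  <dispatch>/whoami.xqy</dispatch>
</match-path>
```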

Note:

  • [name] means that name is optional.
  • <name> means that name is not a predefined constant but is required.

| Variable | Type(s) | Desc / Notes |
| --- | --- | --- |
| $_cookie.<name> | string | The value of the cookie <name>. Only the text value of the cookie is returned, not the extra metadata (path, domain, expires, etc.). If the cookie does not exist, evaluates as "". Cookie names are matched and compared case insensitively. Future: may expose substructure of the cookie header. |
| $_database[.name] | string | The name of the content database. |
| $_database.id | integer | The ID of the content database. |
| $_defaultuser | boolean | True if the authenticated user is the default user. |
| $_method | string | HTTP method name. |
| $_modules-database[.name] | string | Modules database name. If no name is given, the file system is used for the modules. |
| $_modules-database.id | integer | Modules database ID. Set the ID to 0 to use the file system for modules. |
| $_modules-root | string | Modules root path. |
| $_path | string | The HTTP request path, not including the query string. |
| $_query-param.<name> | list of strings | The query parameters matching the name, as a list of strings. |
| $_request-url | string | The original request URI, including the path and query parameters. |
| $_user[.name] | string | The user name. |
| $_user.id | integer | The user ID. |

Set the database to the current modules database:

<set-database>$_modules-database</set-database>

Set the transaction to the cookie TRANSACTION_ID:

<set-transaction>$_cookie.TRANSACTION_ID</set-transaction>

Evaluation Rules

Eval rules have no effect on the execution control of the evaluator. They are evaluated when reached and only can affect the current context, not control the execution flow.

There are two types of eval rules: Set rules and assign rules.

Set Rules are rules that create a rewriter command (a request to change the output context in some way). Assign rules are rules that set locally scoped variables but do not produce any rewriter commands.

Variables and rewriter commands are placed into the current scope.

| Element | Description |
| --- | --- |
| add-query-param | Adds a query parameter (name/value) to the query parameters |
| set-database | Sets the database |
| set-error-format | Sets the error format for system generated errors |
| set-error-handler | Sets the error handler |
| set-eval | Sets the evaluation mode (eval or direct) |
| set-modules-database | Sets the modules database |
| set-modules-root | Sets the modules root path |
| set-path | Sets the URI path |
| set-query-param | Sets a query parameter |
| set-transaction | Sets the transaction |
| set-transaction-mode | Sets the transaction mode |
| set-var | Sets a variable in the local scope |
| trace | Logs a trace message |

add-query-param

Adds (appends) a query parameter (name/value) to the query parameters

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @name | string | yes | Name of the parameter |

Children:

Expression which evaluates to the value of the parameter

An empty element or list will still append a query parameter with an empty value (equivalent to a URL like http://company.com?a= )

If the expression is a List then the query parameter is duplicated once for each value in the list.

Example:

If the path matches, then append the following query parameters:

  • version = the version matched
  • label-id = the label id matched

<match-path
     matches="^/manage/(v2|LATEST)/meters/labels/([^/?&amp;]+)/?$">
    <add-query-param name="version">$1</add-query-param>
    <add-query-param name="label-id">$2</add-query-param>
</match-path>

set-database

Sets the Database.

This changes the context database for the remainder of the request.

Attributes

| Name | Type | Required | Purpose |
| --- | --- | --- | --- |
| @checked | boolean (true or 1, false or 0) | no | If true, then the eval-in privilege of the user is checked to verify the change is allowed. |

Children:

An expression which evaluates to either a database ID or database name.

It is an immediate error to set the value using an expression evaluating to a list of values.

See Database (Name or ID) for a description of how Database references are interpreted.

Notes on @checked flag.

The @checked flag is interpreted during the rewriter modification result phase. By implication, only the last set-database that successfully evaluated before a dispatch is used.

If the @checked flag is true AND if the database is different than the App Server defined database then the user must have the eval-in privilege.

Examples:

Set the database to "SpecialDocuments":

<set-database>SpecialDocuments</set-database> 

Set the database to the current modules database:

<set-database>$_modules-database</set-database> 
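
A sketch of the @checked flag in use. The path prefix is hypothetical and the database name is reused from the example above.

```xml
<match-path prefix="/special/">
  <!-- checked="true": if SpecialDocuments differs from the App Server
       database, the user must have the eval-in privilege -->
  <set-database checked="true">SpecialDocuments</set-database>
  <dispatch/>
</match-path>
```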

set-error-format

Sets the error format used for all system generated errors. This is the format (content-type) of the body of error messages for a non-successful HTTP response.

This overwrites the setting from the application server configuration and takes effect immediately after validation of the rewriter rules have succeeded.

Attributes: None

Children: An expression which evaluates to one of the following error formats.

  • html
  • json
  • xml
  • compatible

The "compatible" format tells the system to match as closely as possible the format used in prior releases for the type of request and error. For example, if dispatch indicates "xdbc", then "compatible" produces errors in the HTML format, which is compatible with the XCC client library.

It is an immediate error to set the value using an expression evaluating to a list of values.

This setting does not affect any user defined error handler, which is free to output any format and body.

Example:

Set the error format to JSON:

<set-error-format>json</set-error-format>

set-error-handler

Sets the error handler

Attributes: None

Children: An expression which evaluates to a Path (non blank String).

Example:

<set-error-handler>/myerror-handler.xqy</set-error-handler>

If an error occurs during the rewriting process, the error handler associated with the application server is used for error handling. After a successful rewrite, if set-error-handler specifies a new error handler, that handler is used for handling errors.

The modules database and modules root used to locate the error handler are the modules database and root in effect at the time of the error.

Setting the error handler to the empty string will disable the use of any user defined error handler for the remainder of the request.

It is an immediate error to set the value using an expression evaluating to a list of values.

For example, if the set-modules-database rule was also used, then the new error handler is searched for in the rewritten modules database (and the root set with set-modules-root); otherwise the error handler is searched for in the modules database configured in the App Server.
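The lookup behavior described above can be sketched as follows. This is an illustrative combination of rules from this chapter, not a required pattern; the database name, root, and handler path are reused from the surrounding examples.

```xml
<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
   <!-- After a successful rewrite, the handler below is looked up in
        SpecialModules under the /myapp root -->
   <set-modules-database>SpecialModules</set-modules-database>
   <set-modules-root>/myapp</set-modules-root>
   <set-error-handler>/myerror-handler.xqy</set-error-handler>
   <dispatch/>
</rewriter>
```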

set-eval

Sets the Evaluation mode (eval or direct).

The Evaluation mode is used in the request handler to determine if a path is to be evaluated (XQuery or JavaScript) or to be directly accessed (PUT/GET).

In order to be able to read and write to evaluable documents (in the modules database), the evaluation mode needs to be set to direct and the Database needs to be set to a Modules database.

Attributes: None

Children: An expression evaluating to either "eval" or "direct"

Example:

Forces a direct file access instead of an evaluation if the filename ends in .xqy

<match-path matches=".*\.xqy$">
     <set-eval>direct</set-eval>
</match-path>
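To read or write evaluable documents, the text above says the database must also point at a modules database. A minimal sketch of that combination, assuming the system variable $_modules-database (used elsewhere in this chapter) names the intended modules database:

```xml
<match-path matches=".*\.xqy$">
   <!-- Point the content database at the current modules database -->
   <set-database>$_modules-database</set-database>
   <!-- Access the document directly (PUT/GET) instead of evaluating it -->
   <set-eval>direct</set-eval>
   <dispatch/>
</match-path>
```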

set-modules-database

Sets the Modules database.

This sets the modules database for the request.

Attributes

Name Type Required Purpose
@checked boolean [ true,1 | false,0], default false no If true then the permissions of the user are checked for the eval-in privilege to verify the change is allowed.

Children:

An expression which evaluates to either a database ID or database name. An empty value, an empty expression, or an expression evaluating to "0" indicates "Filesystem"; otherwise the value is interpreted as a database name or ID.

See Database (Name or ID) for a description of how Database references are interpreted.

It is an immediate error to set the value using an expression evaluating to a list of values.

Notes on @checked flag.

The @checked flag is interpreted during the rewriter modification result phase. By implication, only the last set-modules-database that successfully evaluated before a dispatch is used.

If the @checked flag is true AND if the database is different than the App Server defined modules database then the user must have the eval-in privilege.

Example:

If the user is admin, sets the modules database to "SpecialModules":

<match-user name="admin">
      <set-modules-database>SpecialModules</set-modules-database>
      ...
</match-user>

set-modules-root

Sets the modules root path

Attributes: None

Children: An expression which evaluates to a Path (non blank String).

It is an immediate error to set the value using an expression evaluating to a list of values.

Example:

Sets the modules root path to /myapp

<set-modules-root>/myapp</set-modules-root>

set-path

Sets the URI path for the request.

Often this is the primary use case for the rewriter.

Attributes: None

Children:

An expression which evaluates to a Path (non blank String).

It is an immediate error to set the value using an expression evaluating to a list of values.

Example:

If the user name is "admin" then set the path to /admin.xqy

Then, if the method is GET, HEAD, or OPTIONS, dispatch; otherwise, if the method is POST, set a query parameter "verified" to true and dispatch.

<match-user name="admin">
    <set-path>/admin.xqy</set-path>
    <match-method any-of="GET HEAD OPTIONS">
        <dispatch/>
    </match-method>
    <match-method any-of="POST">
        <set-query-param name="verified">true</set-query-param>
        <dispatch/>
    </match-method>
</match-user>

See dispatch for a way to set the path and dispatch in the same rule.

set-query-param

Sets (overwrites) a query parameter. If the query parameter previously existed all of its values are replaced with the new value(s).

Attributes

Name Type Required Purpose
@name string yes Name of the parameter

Children

An expression which evaluates to the value of the query parameter to be set. If the expression is a List then the query parameter is duplicated once for each value in the list.

An empty element, empty string value or empty list value will still set a query parameter with an empty value (equivalent to a URL like http://company.com?a= )

Examples:

If the user is admin then set the query parameter user to be admin, overwriting any previous values it may have had.

<match-user name="admin">
     <set-query-param name="user">admin</set-query-param>
</match-user>

Copy all the values from the query param "ids" to a new query parameter "app-ids" replacing any values it may have had.

<match-query-param name="ids">
     <set-query-param name="app-ids">$*</set-query-param>
</match-query-param>

This can be used to "pass through" query parameters by name when @include-request-query-params is set to false in the <dispatch> rule.

The following rules copy all query parameters (0 or more) named "special" to the result without passing through other parameters.

<match-query-param name="special" repeated="true">
      <set-query-param name="special">$*</set-query-param>
</match-query-param>
<dispatch include-request-query-params="false"/>
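The empty-value behavior noted earlier in this section can be illustrated with a short sketch. The parameter name a and the path /run.xqy are placeholders; per the description above, the result should carry the parameter with an empty value (as in ?a=).

```xml
<!-- An empty element still sets the parameter, with an empty value -->
<set-query-param name="a"/>
<dispatch include-request-query-params="false">/run.xqy</dispatch>
```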

set-transaction

Sets the current transaction. If specified, set-transaction-mode must also be set.

Attributes: None

Children: An expression which evaluates to the transaction ID.

Example:

Set the transaction to the value of the cookie TRANSACTION_ID.

<set-transaction>$_cookie.TRANSACTION_ID</set-transaction>

If the expression for set-transaction is empty, such as when the cookie doesn't exist, then the transaction is unchanged.

It is an immediate error (during rewriter parsing) to set the value using an expression evaluating to a list of values or to 0.

set-transaction-mode

Sets the transaction mode for the current transaction. If specified, set-transaction must also be set.

Attributes: None

Children: An expression evaluating to a transaction mode specified by exactly one of the strings

("auto" | "query" | "update")

Example:

Set the transaction mode to the value of the query param "trmode" if it exists.

<match-query-param name="trmode">
      <set-transaction-mode>$0</set-transaction-mode>
</match-query-param>

It is an error if the value for transaction mode is not one of "auto", "query", or "update". It is also an error to set the value using an expression evaluating to a list of values.
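Because set-transaction and set-transaction-mode must be specified together, one way to pair them is to nest the two match rules so that a dispatch only happens when both values are present. This is a sketch under that assumption; the cookie and parameter names are reused from the examples above.

```xml
<match-cookie name="TRANSACTION_ID">
   <!-- $0 here is the value of the TRANSACTION_ID cookie -->
   <set-transaction>$0</set-transaction>
   <match-query-param name="trmode">
      <!-- $0 here is the value of the trmode query parameter -->
      <set-transaction-mode>$0</set-transaction-mode>
      <dispatch/>
   </match-query-param>
</match-cookie>
```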

set-var

Sets a variable in the local scope

This is an Assign Rule. It does not produce rewriter commands; instead it sets a variable.

The assignment only affects the current scope (which is the list of variables pushed by the parent). The variable is visible to following siblings as well as children of following siblings.

Allowed user defined variable names must start with a letter, followed by zero or more letters, numbers, underscores, or dashes.

Specifically the name must match the regex pattern "[a-zA-Z][a-zA-Z0-9_-]*"

This implies that set-var cannot set either system defined variables, property components or expression variables.

Attributes

Name Type Required Purpose
@name string yes Name of the variable to set (without the "$")

Children:

An expression which is evaluated to produce the value to assign to the variable.

Examples:

Sets the variable $dir1 to the first component of the matching path, and $dir2 to the second component.

<match-path matches="^/([a-z]+)/([a-z]+)/.*">
    <set-var name="dir1">$1</set-var>
    <set-var name="dir2">$2</set-var>
    ...
</match-path>

If the Modules Database name contains the string "User" then set the variable usedb to the full name of the Modules DB.

<match-string  value="$_modules-database" matches=".*User.*">
    <set-var name="usedb">$0</set-var>
</match-string>

Matches all of the values of a query parameter named "ids" if any of them is fully numeric.

<match-query-param name="ids">
    <match-string  value="$*" matches="[0-9]+">
          .... 
    </match-string>
</match-query-param>

trace

Log a trace message

The trace rule can be used anywhere an eval rule is allowed. It logs a trace message similar to fn:trace.

The event attribute specifies the Trace Event ID. The body of the trace element is the message to log.

Attributes

Name Type Required Purpose
@event string yes Specifies the trace event

Child Content: Trace message or expression.

Child Elements: None

Child Context modifications: None

Example:

<match-path prefix="/special">
    <trace event="AppEvent1">
        The following trace contains the matched path.
    </trace>
    <trace event="AppEvent2">
        $0
    </trace>
</match-path>

Termination Rules

Termination rules (dispatch, error) unconditionally stop the evaluator at the current rule. No further evaluation occurs. The dispatch rule returns from the evaluator with all accumulated rewriter commands in scope. The error rule discards all commands and returns with the error condition.

Element Description
dispatch Stop evaluation and dispatch with all rewrite commands
error Terminates evaluation with an error

dispatch

Stop evaluation and dispatch with all rewrite commands.

The dispatch element is required as the last child of any match rule which contains no match rules.

Attributes

Name Type Required Purpose
@include-request-query-params boolean, default true no If true then the original request query params are used as the initial set of query params before applying any rewrites
@xdbc boolean, default false no If true then the built-in XDBC handlers are used for the request.

The attribute include-request-query-params specifies whether the initial request query parameters are included in the rewriter result. If true (or absent), the rewriter modifications start with the initial query parameters, which are then augmented (added or reset) by any set-query-param and add-query-param rules that are in scope at the time of dispatch.

If set to false then the initial request parameters are not included and only the parameters set or added by any set-query-param and add-query-param rules are included in the result.

If xdbc is specified and true then the built-in xdbc handlers will be used for the request. If xdbc support is enabled then the final path (possibly rewritten) MUST BE one of the paths supported by the xdbc built-in handlers.

Child Content:

Empty or an expression

Child Elements:

If the child element is not empty or blank then it is evaluated and used for the rewrite path.

Child Context modifications:

Examples:

<set-path>/a/path.xqy</set-path>
<dispatch/>

Is equivalent to:

<dispatch>/a/path.xqy</dispatch>

If the original URL is /test?a=a&b=b, the rewriter:

<set-query-param name="a">a1</set-query-param>
<dispatch include-request-query-params="false">/run.xqy</dispatch>

rewrites to path /run.xqy and the query parameters are:

a=a1

The following rewriter:

<set-query-param name="a">a1</set-query-param>
<dispatch>/run.xqy</dispatch>

rewrites to path /run.xqy and the query parameters are:

a=a1 
b=b

An example of a minimal rewriter rule that dispatches to XDBC is as follows:

<match-path  any-of="/eval /invoke /spawn /insert">
      <dispatch xdbc="true">$0</dispatch>
</match-path>

error

Terminate evaluation with an error.

The error rule terminates the evaluation of the entire rewriter and returns an error to the request handler. This error is then handled by the request handler, which passes it to the error handler if there is one.

The code and optional message data are supplied as attributes.

Attributes:

Name Type Required Purpose
@code string yes Specifies the error code
@data1 string no Error message, first part
@data2 string no Error message, second part
@data3 string no Error message, third part
@data4 string no Error message, fourth part
@data5 string no Error message, fifth part

Child Content:

None

Child Elements:

None

Child Context modifications: none

Example:

<error code="XDMP-BAD" data1="this" data2="that"/>
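As a sketch of how error might sit inside a complete rewriter, the following rejects one method and dispatches everything else unchanged. The choice of DELETE and the data1/data2 message text are hypothetical; the error code is the one used in the example above.

```xml
<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
   <match-method any-of="DELETE">
      <!-- Terminates evaluation; accumulated rewriter commands are discarded -->
      <error code="XDMP-BAD" data1="DELETE" data2="not supported"/>
   </match-method>
   <dispatch/>
</rewriter>
```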

Simple Rewriter Examples

Some examples of simple rewriters:

Redirect a request by removing the prefix, /dir.

<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
   <match-path matches="^/dir(/.+)">
      <dispatch>$1</dispatch>
   </match-path>
</rewriter>

For GET and PUT requests only, if a query parameter named path is exactly /admin then redirect to /private/admin.xqy; otherwise use the value of the parameter for the redirect.

If there is no path query parameter then do not change the request.

<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
   <match-method any-of="GET PUT">
      <!-- match by name/value -->
      <match-query-param name="path" value="/admin">
         <dispatch>/private/admin.xqy</dispatch>
      </match-query-param>
      <!-- match by name use value -->
      <match-query-param name="path">
         <dispatch>$0</dispatch>
      </match-query-param>
   </match-method>
</rewriter>

If a parameter named data is present in the URI then set the database to UserData. If a query parameter module is present then set the modules database to UserModule. If the path starts with /users/ and ends with /version<versionID> then extract the next path component ($1), append it to /app and add a query parameter version with the versionID.

<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
   <match-query-param name="data"> 
      <set-database>UserData</set-database>
   </match-query-param>
   <match-query-param name="module"> 
      <set-modules-database>UserModule</set-modules-database>
   </match-query-param>
   <match-path matches="^/users/([^/]+)/version(.+)$">
      <set-path>/app/$1</set-path>
      <add-query-param name="version">$2</add-query-param>
   </match-path>
   <dispatch/>
</rewriter>

Match users by name and default user and set or overwrite a query parameter.

<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
   <set-query-param name="default">
        default-user no match
   </set-query-param>
   <match-user name="admin">
       <add-query-param name="user">admin matched</add-query-param>
   </match-user>
   <match-user name="infostudio-admin">
       <add-query-param name="user">
           infostudio-admin matched
       </add-query-param>
   </match-user>
   <match-user default-user="true">
      <set-query-param name="default">
         default-user matched
      </set-query-param>
   </match-user>
   <dispatch>/myapp.xqy</dispatch>
</rewriter>

Matching cookies. This properly parses the cookie HTTP header structure so matches can be performed reliably. In this example, the SESSIONID cookie is used to conditionally set the current transaction.

<rewriter xmlns="http://marklogic.com/xdmp/rewriter">
   <match-cookie name="SESSIONID">
      <set-transaction>$0</set-transaction>
   </match-cookie>
</rewriter>

User defined variables with local scoping. Set an initial value for the user variable test. If the path starts with /test/ and contains at least 2 more path components, then reset the test variable to the first matching path component and set a query param "var1" to the second matching path component. If the role of the user is also either admin-builtins or app-builder, then rewrite to the path /admin/secret.xqy; otherwise add a query param "var2" with the value of the test user variable and rewrite to /default.xqy.

If you change the scoped attribute from true to false (or remove it), then all the changes within that condition are discarded if the final dispatch to /admin/secret.xqy is not reached, leaving intact the initial value for the test variable, not adding the "var1" query parameter, and dispatching to /default.xqy.

<rewriter xmlns="http://marklogic.com/xdmp/rewriter" >
   <set-var name="test">initial</set-var>
   <match-path matches="^/test/(\w+)/(\w+).*" scoped="true">
      <set-var name="test">$1</set-var>
      <set-query-param name="var1">$2</set-query-param>
      <match-role any-of="admin-builtins app-builder">
         <dispatch>/admin/secret.xqy</dispatch>
      </match-role>
   </match-path>
   <add-query-param name="var2">$test</add-query-param>
   <dispatch>/default.xqy</dispatch>
</rewriter>
