MarkLogic Server 11.0 Product Documentation
Search Developer's Guide
— Chapter 16

Creating Alerting Applications

This chapter describes how to create alerting applications in MarkLogic Server and describes the components of alerting applications.

Overview of Alerting Applications in MarkLogic Server

An alerting application is used to notify users when new content is available that matches a predefined (and usually stored) query. MarkLogic Server includes several infrastructure components that you can use to create alerting applications that are flexible, perform well, and scale to very large numbers of stored queries.

A sample alerting application, which uses the Alerting API, is available as an open source project on GitHub (https://github.com/marklogic/alerting). The sample application has all of the low-level components needed in many enterprise-class alerting applications, but it is packaged with a user interface designed to demonstrate the functionality of an alerting application; your own applications would likely have a very different and more powerful user interface. The sample application is for demonstration purposes only and is not designed to be put into production; see the samples-license.txt file for more information. If you do not need to understand the low-level components of an alerting application, you can skip to the sections of this chapter about the Alerting API and the sample application: Alerting API and Alerting Sample Application.

The heart of the components for alerting applications is the ability to create reverse queries. A reverse query (cts:reverse-query) is a cts:query that returns true if the node supplied to the reverse query would match a query if that query were run in a search. For more details about cts:reverse-query, see cts:reverse-query Constructor.

Alerting applications use reverse queries and several other components, including serialized cts:query constructors, reverse query indexes, MarkLogic Server security components, the Alert API, Content Processing Framework domains and pipelines, and triggers.

cts:reverse-query Constructor

The cts:reverse-query constructor is used in a cts:query expression. It returns true for cts:query nodes that match an input. For example, consider the following:

let $node := <a>hello there</a>
let $query := <xml-element>{cts:word-query("hello")}</xml-element>
return
cts:contains($query, cts:reverse-query($node))
(: returns true :)

This query returns true because the cts:query in $query would match $node. In concept, the cts:reverse-query constructor is the opposite of the other cts:query constructors; while the other cts:query constructors match documents to queries, the cts:reverse-query constructor matches queries to documents. This functionality is the heart of an alerting application, as it allows you to efficiently run searches that return all queries that, if they were run, would match a given node.

The cts:reverse-query constructor is fully composable; you can combine the cts:reverse-query constructor with other constructors, just like you can any other cts:query constructor. The Alerting API abstracts the cts:reverse-query constructor from the developer, as it generates any needed reverse queries. For details about how cts:query constructors work, see Composing cts:query Expressions.
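As a rough illustration of composability, the following sketch combines cts:reverse-query with a collection query so that only stored queries in one collection are considered. The collection name saved-queries is hypothetical, and the sketch assumes that serialized cts:query documents have already been inserted into that collection.

```xquery
xquery version "1.0-ml";

(: Assumption: serialized cts:query documents were previously stored
   in a hypothetical collection named "saved-queries". :)
let $node := <a>hello there</a>
return
  cts:search(fn:collection(),
    cts:and-query((
      cts:collection-query("saved-queries"),
      cts:reverse-query($node))))
(: candidates are the stored query documents that would match $node :)
```

With the fast reverse searches index enabled, a search of this shape is what lets an alerting application find matching stored queries without running each query individually.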

XML Serialization of cts:query Constructors

A cts:query expression is used in a search to specify what to search for. A cts:query expression can be very simple or it can be arbitrarily complex. In order to store cts:query expressions, MarkLogic Server has an XML representation of a cts:query. Alerting applications store the serialized XML representation of cts:query expressions and index them with the reverse index. This provides fast and scalable answers to searches that ask which stored queries match a given document. Storing the XML representation of a cts:query in a database is one of the components of an alerting application. The Alerting API abstracts the XML serialization from the developer. For more details about serializing a cts:query to XML, see the Serializations of cts:query Constructors section of the chapter Composing cts:query Expressions.
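To make the serialization concrete, here is a minimal sketch of a round trip between a cts:query and its XML form. The wrapper element name saved is arbitrary and chosen only for this illustration; the Alerting API normally handles this step for you.

```xquery
xquery version "1.0-ml";

(: Placing a cts:query inside an element constructor serializes it to XML;
   the hypothetical wrapper element "saved" is discarded by the path step. :)
let $serialized := <saved>{cts:word-query("jelly beans")}</saved>/element()

(: cts:query() converts the serialized element back into a cts:query
   that can be passed to cts:search or cts:contains. :)
return cts:query($serialized)
```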

Security Considerations of Alerting Applications

Alerting applications typically allow individual users to create their own criteria for being alerted, and therefore there are some inherent security requirements in alerting applications. For example, you don't want everyone to be alerted when a particular user's alerting criteria are met; you only want that particular user alerted. This section describes some of the security considerations.

Alert Users, Alert Administrators, and Controlling Access

Because there is both a need to manage an alerting application and a need for users of the alerting application to have some ability to perform actions on the database, alerting applications need to manage security. Users of an alerting application need to run some queries that they might not be privileged to run. For example, they need to look at configuration information in a controlled way. To manage this, alerting applications can use amps to allow users to perform operations for which they do not have privileges by providing the needed privileges only in the context of the alerting application. For details about amps and the MarkLogic Server security model, see the Security Guide.

The Alerting API, along with the built-in roles alert-admin and alert-user, abstracts all of the complex security logic so you can create applications that properly deal with security without having to manage the security yourself.

Predefined Roles for Alerting Applications

There are two pre-defined roles designed for use in alerting applications that are built using the Alerting API, as well as some internal roles that the Alerting API uses:

Alert-Admin Role

The alert-admin role is designed to give administrators of an alerting application all of the privileges that are needed to create configurations (alert configs) with the Alerting API. It has a significant number of privileges, including the ability to run code as any user that has a rule, so only trusted users (users who are assumed to be non-hostile, appropriately trained, and follow proper administrative procedures) should be granted the alert-admin role. Assign the alert-admin role to administrators of your alerting application.

Alert-User Role

The alert-user role is a minimally privileged role. It is used in the Alerting API to allow regular alert users (as opposed to alert-admin users) to be able to execute code in the Alerting API. Some of that code needs to read and update documents used by the alerting application (configuration files, rules, and so on), and this role provides a mechanism for the Alerting API to give the access needed (and no more access) to users of an alerting application.

The alert-user role only has privileges that are used by the Alerting API; it does not provide execute privileges to any functions outside the scope of the Alerting API. The Alerting API uses the alert-user role as a mechanism to amp more privileged operations in a controlled way. It is therefore reasonably safe to assign this role to any user whom you trust to use your alerting application.

Roles For Internal Use Only

There are also two other roles used by the Alerting API which you should not explicitly grant to any user or role: alert-internal and alert-execution. These roles are used to amp special privileges within the context of certain functions of the Alerting API, and giving these roles to any users would give them privileges on the system that you might not want them to have; do not grant these roles to any users.

Indexes for Reverse Queries

You enable or disable the reverse query index in the database configuration by setting the fast reverse searches index setting to true or false.

The fast reverse searches index speeds up searches that use cts:reverse-query. For alerting applications to scale to large numbers of rules, you should enable fast reverse searches.
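If you prefer to script this setting rather than use the Admin Interface, the following is a minimal sketch using the Admin API. It assumes the content database is named Documents (a placeholder for your own database) and that you run it as a user with administrative privileges.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
    at "/MarkLogic/admin.xqy";

(: Assumption: "Documents" is the database that holds the alerting rules. :)
let $config := admin:get-configuration()
let $config := admin:database-set-fast-reverse-searches(
    $config, xdmp:database("Documents"), fn:true())
return admin:save-configuration($config)
```

Changing an index setting can cause existing content to be reindexed, so it is usually simplest to enable the index before loading large numbers of rules.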

Alerting API

The Alerting API is designed to help you build a robust alerting application. The API handles the details of security in the application and provides mechanisms to set up all of the components of an alerting application. It is designed to make it easy to use triggers and CPF to keep the state of documents being alerted. This section describes the Alerting API.

The Alerting API is implemented as an XQuery library module. For the individual function signatures and descriptions, see the MarkLogic XQuery and XSLT Function Reference.

Alerting API Concepts

There are three main concepts to understand when using the Alerting API: alert configs, actions, and rules.

Alert Config

The alert config is the XML representation of an alerting configuration for an alerting application. Typically, an alerting application needs only one alert config, although you can have many if you need them. The Alerting API defines an XML representation of an alert config, and that XML representation is returned from the alert:make-config function. You then persist the config in the database using the alert:config-insert function. The Alerting API also has setter and getter functions to manipulate an alert config. The alert config is designed to be created and updated by an administrator of the alerting application, and therefore users who manipulate the alert config must have the alert-admin role.

Actions to Execute When an Alert Fires

An action is some XQuery or JavaScript code to execute when an alert occurs. An action could be to update a document in the database, to send an email, or whatever makes sense for your application. The action is an XQuery main module. The Alerting API defines an XML representation of an action, and that XML representation is returned from the alert:make-action function. The action XML representation points to the XQuery main module that performs the action. You then persist this XML representation of an alert action in the database using the alert:action-insert function. The Alerting API also has setter and getter functions to manipulate an alert action. Alert actions are designed to be created and updated by an administrator of the alerting application, and therefore users who manipulate alert actions must have the alert-admin role.

Alert actions are invoked or spawned with alert:invoke-matching-actions or alert:spawn-matching-actions, and the actions can accept the following external variables:

declare namespace alert = "http://marklogic.com/xdmp/alert";

declare variable $alert:config-uri as xs:string external;
declare variable $alert:doc as node() external;
declare variable $alert:rule as element(alert:rule) external;
declare variable $alert:action as element(alert:action) external;

These external variables are available to the action if it needs to use them. To use the variables, the above variable declarations must be in the prolog of the action module that is invoked or spawned.
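As an illustration, the following sketch of an action main module declares the external variables and writes a summary of each one to the error log. The message format is invented for this example; only the variable declarations come from the Alerting API.

```xquery
xquery version "1.0-ml";
declare namespace alert = "http://marklogic.com/xdmp/alert";

(: These declarations must appear in the prolog of the action module. :)
declare variable $alert:config-uri as xs:string external;
declare variable $alert:doc as node() external;
declare variable $alert:rule as element(alert:rule) external;
declare variable $alert:action as element(alert:action) external;

(: Illustrative only: log which config, document, and rule fired. :)
xdmp:log(fn:concat(
  "Alert fired: config=", $alert:config-uri,
  ", document=", xdmp:describe($alert:doc),
  ", rule=", xdmp:describe($alert:rule)))
```

A real action would typically replace the xdmp:log call with whatever work the alert should trigger, such as sending a notification or updating a document.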

Rules For Firing Alerts

A rule is the set of criteria for which a user is alerted, combined with a reference to an action to perform if those criteria are met. For example, if you are interested in any new or changed content that matches a search for jelly beans, you can define a rule that fires an alert when a new or changed document comes in that has the term jelly beans in it. This might translate into the following cts:query:

cts:word-query("jelly beans")

The rule also has an action associated with it, which will be performed if the document matches the query. Alerting applications are designed to support very large numbers of rules with fast, scalable performance. The amount of work for each rule also depends on what the action is for each rule. For example, if you have an application that has an action to send an email for each matching rule, you must consider the impact of sending all of those emails if you have large numbers of matching rules.

The Alerting API defines an XML representation of a rule, and that XML representation is returned from the alert:make-rule function. You then persist the rule in the database using the alert:rule-insert function. Rules are designed to be created and updated by regular users of the alerting application. The Alerting API also has setter and getter functions to manipulate an alert rule. Because those regular users who create rules must have the needed privileges and permissions to perform certain tasks (such as reading and updating certain documents), a minimal set of privileges is required to insert a rule. Therefore users who create rules in an alerting application must have the alert-user role, which has a minimum set of privileges.

Using the Alerting API

Once you understand the concepts described in the previous section, using the Alerting API is straightforward. This section describes the details of using the Alerting API.

Set Up the Configuration (User With alert-admin Role)

The first step in using the Alerting API is to create an alert config. For details about an alert config, see Alert Config. You should create the alert config as an alerting application administrator (a user with the alert-admin role or the admin role). The following sample code demonstrates how to create an alert config:

xquery version "1.0-ml";
(: run this as a user with the alert-admin role :)
import module namespace alert = "http://marklogic.com/xdmp/alert"
    at "/MarkLogic/alert.xqy";

let $config := alert:make-config(
    "my-alert-config-uri",
    "My Alerting App",
    "Alerting config for my app",
    <alert:options/> )
return
alert:config-insert($config)

Set Up Actions (User With alert-admin Role)

An alerting application administrator must also set up actions to be performed when an alert occurs. An action is an XQuery main module and can be arbitrarily simple or arbitrarily complex. Alert actions can perform any action you can write in XQuery. For details about alert actions, see Actions to Execute When an Alert Fires.

In practice, setting up an alerting action requires a good understanding of what you are trying to accomplish in an alerting application. The following is an extremely simple action that sends a log message to the error log.

xdmp:log(fn:concat(xdmp:get-current-user(), " was alerted"))

You must install your action implementation in the modules database associated with your App Server. Once the implementation is installed, you can register it using the XQuery functions alert:make-action and alert:action-insert or the Server-Side JavaScript functions alert.makeAction and alert.actionInsert.

The following procedure outlines the steps for creating, installing, and registering an alerting action:

  1. Implement your action in an XQuery main module. For example, the following is a simple action that logs a message to the error log:
    xquery version "1.0-ml";
    xdmp:log(fn:concat(xdmp:get-current-user(), " was alerted"))
  2. Install your action module in the modules database associated with your App Server. For example, if your logging action is stored in a filesystem file with the path /my/action/log.xqy, then the following code installs it in the modules database with the URI /alerts/log.xqy when you run it against your App Server.
    xquery version "1.0-ml";
    xdmp:eval(
      'xdmp:document-load("/my/action/log.xqy",
          map:map() => map:with("uri", "/alerts/log.xqy")
                    => map:with("format", "text"))',
      (), map:map() => map:with("database", xdmp:modules-database())
    )
  3. Associate your module with an alerting action using the XQuery function alert:action-insert or the Server-Side JavaScript function alert.actionInsert. This step must be performed as a user with the alert-admin role. For example:
    xquery version "1.0-ml";
    import module namespace alert = "http://marklogic.com/xdmp/alert" 
        at "/MarkLogic/alert.xqy";
    
    let $action := alert:make-action(
        "logalert", 
        "log to ErrorLog.txt",
        xdmp:modules-database(),
        xdmp:modules-root(), 
        "/alerts/log.xqy",
        <alert:options>put anything here</alert:options> )
    return
    alert:action-insert("my-alert-config-uri", $action)

You can also create and insert an action using the REST Management API. For details, see POST /manage/v2/databases/{id|name}/alert/actions.

For a more complex example of an alert logging action, see MARKLOGIC_INSTALL_DIR/Modules/MarkLogic/alert/log.xqy in your MarkLogic installation.

Create Rules (Users With alert-user Role)

To create a rule, use the XQuery functions alert:make-rule and alert:rule-insert, or the Server-Side JavaScript functions alert.makeRule and alert.ruleInsert.

You should set up the alerting application so that regular users of the application can create rules. You might have a form, for example, to assist users in creating the rules.

The following example inserts a rule named simple that will fire the action named logalert whenever the specified word query matches. (See Set Up Actions (User With alert-admin Role) for the implementation of the logalert action.) You must run this code as a user with the alert-user role or equivalent privileges. Note that equivalent production code will usually be much more complex, as this example has no user interface.

xquery version "1.0-ml";
import module namespace alert = "http://marklogic.com/xdmp/alert"
    at "/MarkLogic/alert.xqy";

let $rule := alert:make-rule(
    "simple", 
    "hello world rule",
    0, (: equivalent to xdmp:user(xdmp:get-current-user()) :)
    cts:word-query("hello world"),
    "logalert",
    <alert:options/> )
return
alert:rule-insert("my-alert-config-uri", $rule)

If your action performs any privileged activities, including reading or creating documents in the database, you will need to add the appropriate execute privileges and URI privileges to users running the application.

Run the Rules Against Content

To make the application fire alerts (that is, execute the actions for rules), you must run the rules against some content. You can do this in several ways, including setting up triggers with the Alerting API (alert:create-triggers), using CPF and the Alerting pipeline, or creating your own mechanism to run the rules against content.
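For the trigger-based approach, the following is a minimal sketch of creating triggers with alert:create-triggers and recording them in the alert config. It assumes the config at my-alert-config-uri from the earlier example already exists; the directory scope "/" with infinite depth is chosen only for illustration and you would normally narrow it to your content.

```xquery
xquery version "1.0-ml";
import module namespace alert = "http://marklogic.com/xdmp/alert"
    at "/MarkLogic/alert.xqy";
import module namespace trgr = "http://marklogic.com/xdmp/triggers"
    at "/MarkLogic/triggers.xqy";

(: Assumption: the alert config was inserted earlier under this URI. :)
let $uri := "my-alert-config-uri"
(: Fire on document create and modify anywhere under "/" (illustrative scope). :)
let $trigger-ids :=
  alert:create-triggers(
    $uri,
    trgr:trigger-data-event(
      trgr:directory-scope("/", "infinity"),
      trgr:document-content(("create", "modify")),
      trgr:pre-commit()))
(: Record the trigger IDs in the config so they can be managed later. :)
let $config := alert:config-set-trigger-ids(alert:config-get($uri), $trigger-ids)
return alert:config-insert($config)
```

Run this as a user with the alert-admin role, since it updates the alert config.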

To run the rules manually, you can use the alert:spawn-matching-actions or alert:invoke-matching-actions APIs. These are useful to run alerts in any context, either within an application or as an easy way to test your rules. The alert:spawn-matching-actions API is a good choice when you have many alerts that might fire at once, because it spawns the actions to the task server to execute asynchronously. The alert:invoke-matching-actions API runs the action immediately, so be careful using it if there can be large numbers of matching actions, as they will all be run in the same context. You can run these APIs as any user, and whether or not they produce an action will depend upon what each rule's owner has permissions to see. The following is a very simple example that fires the previously created alert:

xquery version "1.0-ml";
import module namespace alert = "http://marklogic.com/xdmp/alert" 
  at "/MarkLogic/alert.xqy";

alert:invoke-matching-actions("my-alert-config-uri", 
      <doc>hello world</doc>, <options/>)

If you created the config, action, and rule as described in the previous sections, this logs the following to your ErrorLog.txt file when running the code as a user named some-user who has the alert-user role (assuming this user created the rule):

some-user was alerted

If you have very large numbers of alerts, and if the actions for your rules are resource-intensive, invoking or spawning matching actions can produce a significant amount of query activity for your system. This is OK, as that is the purpose of an alerting application, but you should plan your resources accordingly.
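When many rules might match, the spawning variant can be used in the same way. The sketch below mirrors the earlier invoke example and assumes the same config, action, and rule; the actions run asynchronously on the task server, so the log message may appear slightly later than the call returns.

```xquery
xquery version "1.0-ml";
import module namespace alert = "http://marklogic.com/xdmp/alert"
  at "/MarkLogic/alert.xqy";

(: Same arguments as the invoke example, but each matching action
   is spawned to the task server instead of run in this request. :)
alert:spawn-matching-actions("my-alert-config-uri",
      <doc>hello world</doc>, <options/>)
```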

Using CPF With an Alerting Application

It is a natural fit to use alerting applications built using the Alerting API with the Content Processing Framework (CPF). CPF is designed to keep state for documents, so it is easy to use CPF to keep track of when a document in a particular scope is created or updated, and then perform some action on that document. For alerting applications, that action involves running a reverse query on the changed documents and then firing alerts for any matching rules (the Alerting API abstracts the reverse query from the developer).

To simplify using CPF with alerting applications, there are pre-built pipelines for alerting. The pipelines are designed to be used with an alerting application built with the Alerting API. This Alerting CPF application will run alerts on all new and changed content within the scope of the CPF domain to which the Alerting pipeline is attached. The Alerting pipeline is suitable for most alerting applications. The Alerting (spawn) pipeline spawns the actions in separate tasks, and therefore results in increased parallelism, which is beneficial if you have many actions that result from a single document change.

For example, suppose you have an application that allows users to specify a query to alert on, and many people specify the same query (for example, the name of a popular singer). With the Alerting pipeline, each of those actions is run serially, and the actions will recover even if there is a failure in the middle of them. With the Alerting (spawn) pipeline, each action is spawned as a separate request, allowing more parallelism, but if there is a failure during the actions, the actions will not restart. Furthermore, if your alerting action updates the document being alerted on, then you must use the Alerting (spawn) pipeline, as the Alerting pipeline would result in a deadlock.

If you use the Alerting pipelines with any of the other pipelines included with MarkLogic Server (for example, the conversion pipelines and/or the modular documents pipelines), the Alerting pipeline is defined to have a priority such that it runs after all of the other pipelines have completed their processing. This way, alerts happen on the final view of content that runs through a pipeline process. If you have any custom pipelines that you use with the Alerting pipeline, consider adding priorities to those pipelines so the alerting occurs in the order in which you are expecting.

When you use mlcp to import documents without extensions, CPF alerts do not work.

To set up a CPF application that uses alerting, perform the following steps:

  1. Enable the reverse index for your database, as described in Indexes for Reverse Queries.
  2. Set up the alert config and alert actions as a user with the alert-admin role, as described in Set Up the Configuration (User With alert-admin Role) and Set Up Actions (User With alert-admin Role).
  3. Set up an application to have users (with the alert-user role) define rules, as described in Create Rules (Users With alert-user Role).
  4. Install Content Processing in your database, if it is not already installed (Databases > database_name > Content Processing > Install tab).
  5. Set up the domain scope for a domain.
  6. Attach the Alerting pipeline and the Status Change Handling pipeline to the domain. You can also attach any other pipelines you need to the domain (for example, the various conversion pipelines).
  7. Use the alert:config-set-cpf-domain-names function to notify the alerting configuration of the domain so the alerting action can determine which alerting configuration to use.

    For example, if your CPF domain name is Default Documents, you could do the following.

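    xquery version "1.0-ml";
    import module namespace alert = "http://marklogic.com/xdmp/alert"
        at "/MarkLogic/alert.xqy";

    alert:config-insert(
       alert:config-set-cpf-domain-names(
          alert:config-get("my-alert-config-uri"),
          ("Default Documents")))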

    An alerting configuration can be used with multiple CPF domains, in which case you set a sequence of multiple domain names or IDs.

Any new or updated content within the domain scope will cause all matching rules to fire their corresponding action. If you will have many alerts that are spawned to the task server, make sure the task server is configured appropriately for your machine. For example, if you are running on a machine that has 16 cores, you might want to raise the threads setting for the task server to a higher number than the default of 4. The right value for the threads setting depends on what other work is running on your machine.
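The task server threads setting can also be changed with the Admin API. The following is a minimal sketch that assumes your hosts are in the group named Default; the value 8 is purely illustrative and should be tuned for your own hardware and workload.

```xquery
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
    at "/MarkLogic/admin.xqy";

(: Assumptions: the group is named "Default"; 8 threads is an example value. :)
let $config := admin:get-configuration()
let $group-id := admin:group-get-id($config, "Default")
return admin:save-configuration(
  admin:taskserver-set-threads($config, $group-id, 8))
```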

For details about CPF, see the Content Processing Framework Guide.

Alerting Sample Application

A sample alerting application is available on http://developer.marklogic.com/code/alerting. The sample application uses the Alerting API, and has all of the low-level components needed in many enterprise-class alerting applications, but it is packaged in a sample application with a user interface designed to demonstrate the functionality of an alert application; your own applications would likely have a very different and more powerful user interface. This sample code is provided on an as-is basis; the sample code is not intended for production applications and is not supported.
