
Securing MarkLogic Server

How Query Rolesets Work

When you add a document to MarkLogic Server, the server parses the document and puts “terms” (or keys) into the universal index. Later, when you run a query, the query side needs to know which terms to look for in the universal index. With element level security, the terms are combined with permissions in the index. The query automatically uses the existing query rolesets to work out which terms to use, based on the role(s) of the user running the query. Each query can include multiple query rolesets. If no query rolesets are configured, a query matches documents only through the terms that are visible to everyone.

Let’s use an example. Say you have a protected path defined as the following:

sec:protect-path("/root/bar[@baz=1]", (), (xdmp:permission("els-role-2", "read")))

And then you ingest a document like this:

<root>
  <bar baz="1">Hello</bar>
</root>

When MarkLogic Server parses the document, it sees that the word “Hello” is inside the element <bar> that matches the protected path definition (since bar is under root and has an attribute baz=1). So instead of simply putting the term “Hello” into the universal index, it combines the term “Hello” and the permission information in the protected path (in this case, basically the role name “els-role-2”) into one term and puts this new term into the universal index.
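The index-side behavior described above can be sketched with a small toy model. This is only an illustration under stated assumptions: real universal index terms are opaque internal keys, and the names `index_term` and `index_document` are hypothetical, not part of any MarkLogic API. The sketch shows only that a protected word and an unprotected word end up as different keys.

```python
# Toy model only: real MarkLogic index terms are opaque internal keys.
# index_term and index_document are hypothetical names for illustration.

def index_term(word, roles=()):
    """Combine a word with the roles protecting it into one index key."""
    if not roles:
        return ("word", word)  # unprotected: visible to everyone
    return ("word", word, frozenset(roles))

def index_document(bar_baz, text):
    """Index the text of /root/bar, applying the example protected path."""
    protected = bar_baz == "1"  # models the predicate [@baz=1]
    roles = ("els-role-2",) if protected else ()
    return {index_term(word, roles) for word in text.split()}

protected_terms = index_document("1", "Hello")  # matches the protected path
plain_terms = index_document("2", "Hello")      # does not match it
```

In this model the two documents share no terms, which is why a query that only knows the plain word “Hello” cannot find the protected one.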

Suppose you then run a search with the query cts:word-query("Hello") as a user who has the els-role-2 role. The query must know this new term to find the document. The query already knows the word “Hello”, but how would it know the permission information in the protected path?

This is where the query rolesets are used. You configure query rolesets (with just els-role-2 in this example) and then the query compares that query roleset with the caller’s role. If the caller’s role “matches” the query rolesets, the query will combine that information with the word “Hello” to generate the term, which matches the term put into the universal index by MarkLogic Server.
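The query side can be sketched in the same toy style. The function `query_terms` is hypothetical, and the matching rule is an assumption made for illustration: a roleset is treated as matching when the caller holds at least one of its roles. The text above does not define “matches” precisely, so the real rule may differ.

```python
# Toy model only; query_terms is a hypothetical name, not a MarkLogic API.
# Assumption: a roleset "matches" when the caller holds at least one of its roles.

def query_terms(word, caller_roles, query_rolesets):
    """Return every index key the query will look up for a word."""
    caller = frozenset(caller_roles)
    terms = {("word", word)}  # the term visible to everyone
    for roleset in query_rolesets:
        roles = frozenset(roleset)
        if roles & caller:
            terms.add(("word", word, roles))
    return terms

# The index holds only the protected term from the example document.
index = {("word", "Hello", frozenset({"els-role-2"}))}

with_roleset = query_terms("Hello", {"els-role-2"}, [{"els-role-2"}])
without_roleset = query_terms("Hello", {"els-role-2"}, [])
other_user = query_terms("Hello", {"some-other-role"}, [{"els-role-2"}])

found_with = bool(index & with_roleset)        # roleset configured, role held
found_without = bool(index & without_roleset)  # no roleset configured
found_other = bool(index & other_user)         # roleset configured, role not held
```

Only the first lookup finds the document in this model. The second case corresponds to the statement that, with no query rolesets configured, a query matches only through terms visible to everyone.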

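There are three ways to configure query rolesets: use the helper function xdmp:database-node-query-rolesets(), use the helper function xdmp:node-query-rolesets(), or create the query rolesets manually.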

This last method of manually creating query rolesets works for simple examples and cases where there are not many protected paths. If you have a single protected path that matches an element like one in the examples above (with no overlaps), use a simple rule to create the query roleset in the Admin Interface. See Add Protected Paths and Query Rolesets for details.

The two helper functions, xdmp:database-node-query-rolesets() and xdmp:node-query-rolesets(), can help with configuring more complex query rolesets, either for documents already stored in MarkLogic Server or while documents are being added. In either case, MarkLogic Server leaves the query rolesets configuration (creating the query rolesets and inserting them into the Security database) to the administrator.

Query rolesets are made up of roles. There can be any number of roles in a roleset, as long as there are no duplicates. There can be multiple query rolesets in a database:

Diagram showing query rolesets made of other query rolesets and roles

Query rolesets are required for element level security to work. You might ask why MarkLogic Server does not simply derive the query rolesets from the protected paths when you call sec:protect-path(), which would avoid the manual configuration. For this simple example that seems practical, but in real deployments it is not uncommon for multiple protected paths to match the same node or element. Some use cases have thousands of protected paths but only hundreds of query rolesets. The indexer side of MarkLogic Server often needs to combine multiple query rolesets to create the term.

The query side cannot derive that information from the protected path configuration, because whether a node or element matches a protected path depends on the “value” of the node, and the query side does not know the value of a node. It therefore cannot know which subsets of the configured protected paths need to be taken into consideration when creating the query term. Since enumerating all possible combinations of the roles used in all protected paths is not practical, MarkLogic Server leaves query rolesets configuration (creating and inserting the query rolesets into the Security database) to the administrator.
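A rough count shows why enumeration does not scale. The sketch below assumes, purely for illustration, that a candidate roleset is any non-empty subset of the roles used in protected paths. The helper names are hypothetical, and the real space of combinations may be structured differently.

```python
from itertools import combinations

# Assumption for illustration: every non-empty subset of the roles used in
# protected paths is a candidate roleset. Helper names are hypothetical.

def candidate_rolesets(roles):
    """Enumerate every non-empty subset of the given roles."""
    unique = sorted(set(roles))
    return [
        frozenset(combo)
        for size in range(1, len(unique) + 1)
        for combo in combinations(unique, size)
    ]

def candidate_count(role_count):
    """Number of non-empty subsets of role_count distinct roles."""
    return 2 ** role_count - 1

small = candidate_rolesets(["role-a", "role-b", "role-c"])
count_for_20_roles = candidate_count(20)
```

Three roles already give 7 candidates and 20 roles give 1,048,575, so under this assumption the count roughly doubles with each additional role.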