MarkLogic includes rich full-text search features. All of the search features are implemented as extension functions available in XQuery, and most of them are also available through the REST and Java interfaces. This section provides a brief overview some of the main search features in MarkLogic and includes the following parts:
MarkLogic is designed to scale to extremely large databases (100s of terabytes or more). All search functionality operates directly against the database, no matter what the database size. As part of loading a document, full-text indexes are created making arbitrary searches fast. Searches automatically use the indexes. Features such as the xdmp:estimate XQuery function and the
unfiltered search option allow you to return results directly out of the MarkLogic indexes.
MarkLogic provides search features through a set of layered APIs. The core text search foundations in MarkLogic are the XQuery
cts.* APIs, which are built-in functions that perform full-text search. The XQuery
jsearch.*, and REST APIs above this foundation provide a higher level of abstraction that enable rapid development of search applications. For example, the XQuery
search:* API is built using
cts:* features such as cts:search, cts:word-query, and cts:element-value-query. On top of the REST API are the Java and Node.js Client APIs that enable users familiar with those interfaces access to the MarkLogic search features.
jsearch.*, REST, Java or Node.js APIs are sufficient for most applications. Use the
cts APIs for advanced application features, such as using reverse queries to create alerting applications and creating content classifiers. The higher-level APIs offer benefits such as the following:
You can use more than one of these APIs in an application. For example, a Java application can include an XQuery extension to perform custom search result transformations on the server. Similarly, an XQuery application can call both
Each of the APIs described in Search APIs supports one or more input query styles for searching content and metadata, from simple string queries (
cat OR dog) to XML or JSON representations of complex queries. Search results are returned in either raw or report form. The supported query styles and result format vary by API.
For example, the primary search function for the
cts:* API, cts:search, accepts input in the form of a
cts:query, which is a composable query style that allows you to perform fine-grained searches. The cts:search function returns raw results as a sequence of matching nodes. The
jsearch.*, REST, Java, and Node.js APIs accept more abstract query styles such as string and structured queries, and return results in report form, such as a
search:response XML element. This customizable report can include details such as snippets with highlighting of matching terms and query metrics. The REST, Java, and Node.js APIs can also return the results report as a JSON map with keys that closely correspond to a
|Query Style||Supporting APIs||Description|
|String Query||Construct queries as text strings using a simple grammar of terms, phrases, and operators such as as AND, OR, and NEAR. String queries are easily composable by end users typing into a search text box. For details, see Searching Using String Queries in the Search Developer's Guide.|
|Query By Example||Construct queries in XML or JSON using syntax that resembles your document structure. Conceptually, Query By Example enables developers to quickly search for 'documents that look like this'. For details, see Searching Using Query By Example in the Search Developer's Guide.|
|Structured Query||Construct queries in JSON or XML using an Abstract Syntax Tree (AST) representation, while still taking advantage of Search API based abstractions and options. Useful for tweaking or adding to a query originally expressed as a string query. For details, see Searching Using Structured Queries in the Search Developer's Guide.|
|Combined Query||Search using XML or JSON structures that bundle a string and/or structured query with query options. This enables searching without pre-defining query options as is otherwise required by the REST and Java APIs. For details, see Specifying Dynamic Query Options with Combined Query in REST Application Developer's Guide or Apply Dynamic Query Options to Document Searches in Java Application Developer's Guide|
|Construct queries in XML from low level |
MarkLogic Server implements the XQuery language, which includes XPath 2.0. XPath expressions are searches which can search across the entire database. For example, consider the following XPath expression:
This expression searches across the entire database returning
my-child nodes that match the expression. XPath expressions take full advantage of the indexes in the database and are designed to be fast. XPath can search both XML and JSON documents.
MarkLogic Server has range indexes which index XML and JSON structures such as elements, element attributes, XPath expressions, and JSON keys. There are also range indexes over geospatial values. Each of these range indexes has lexicon APIs associated with them. The lexicon APIs allow you to return values directly from the indexes. Lexicons are very useful in constructing facets and in finding fast counts of element or attribute values. The Search, Java, and REST APIs makes extensive use of the lexicon features. For details about lexicons, see Browsing With Lexicons in the Search Developer's Guide.
You can create applications that notify users when new content is available that matches a predefined query. There is an API to help build these applications as well as a built-in
cts:query constructor (cts:reverse-query) and indexing support to build large and scalable alerting applications. For details on alerting applications, see Creating Alerting Applications in the Search Developer's Guide.
MarkLogic allows you use SPARQL (SPARQL Protocol and RDF Query Language) to do semantic searches on the Triple Index, described in Triple Index. SPARQL is a query language specification for querying over RDF (Resource Description Framework) triples.
MarkLogic supports SPARQL 1.1. SPARQL queries are executed natively in MarkLogic to query either in-memory triples or triples stored in a database. When querying triples stored in a database, SPARQL queries execute entirely against the triple index.
For other information about developing applications in MarkLogic Server, see the Application Developer's Guide.