This chapter describes how to submit searches using the Java API, and includes the following sections:
The MarkLogic Java API provides the following fundamental ways of querying the database:
In addition to typical document searches, you can search Java POJOs that have been stored in the database. For details, see POJO Data Binding Interface.
When you search documents you can express search criteria using one of the following kinds of query:
When you query aggregate range indexes, you express your search criteria using a values query.
All search methods can also use persistent query options. Persistent query options are stored on the REST Server and referenced by name in future queries. Once created and persisted, query options can be applied to multiple searches, or even set as the default options for all searches. Note that in XQuery, query option configurations are called options nodes.
Some search methods support dynamic query options that you specify at search time. A combined query allows you to bundle a string and/or structured query with dynamic query options to further customize a search on a per search basis. You can also specify persistent query options with a combined query search. The search automatically merges the persistent (or default) query options and the dynamic query options together. For details, see Apply Dynamic Query Options to Document Searches.
Query options can be very simple or very complex. If you accept the defaults, for example, there is no need to specify explicit query options. You can also make them as complex as is needed.
For details on how to create and work with query option configurations, see Query Options. For details on individual query options and their values, see Appendix: Query Options Reference in the Search Developer's Guide. For more information on search concepts, see the Search Developer's Guide.
In the examples in this chapter, assume a DatabaseClient called client has already been defined.
Usually, you will use a SearchHandle object to contain your query results. The exact nature of the results varies, depending on both the handle's configuration and the query options and values used for the search operation.
You can control how snippets are returned. By default, they are returned as Java objects. Custom or raw snippets are returned as DOM documents, and you can set the forceDOM flag to have all snippets returned as DOM documents.
A SearchHandle provides several ways to access parts of the search results or to control how they are returned:
getMatchResults() returns an array of MatchDocumentSummary objects, one for each matched document. From each summary you can extract the match locations, path, metadata, an array of snippets, fitness, confidence measure, and URI. For details, see the MatchDocumentSummary entry in the Java API JavaDoc.
getMetrics() returns a SearchMetrics object containing various timing metrics about the search.
getFacetNames(), getFacetResult(name), and getFacetResults() return, respectively, a list of returned facet names, the specified named facet result, and an array of facet results for this search.
getTotalResults() returns an estimate of the number of results from the search.
setForceDOM(boolean) sets the force DOM flag, which, if true, causes snippets to always be returned as DOM documents.
See the Java API JavaDoc for SearchHandle for the full interface.
The following is a typical programming technique for accessing search results using a search handle:
// Iterate over the MatchDocumentSummary array and, for each match
// location, get the snippet text.
MatchDocumentSummary[] summaries = results.getMatchResults();
for (MatchDocumentSummary summary : summaries) {
    MatchLocation[] locations = summary.getMatchLocations();
    for (MatchLocation location : locations) {
        String snippetText = location.getAllSnippetText();
        // do something with the snippet text
    }
}
The MarkLogic Server Search API lets you do searches on string arguments, including the usual search operators such as AND and OR. For example, you could search on Batman, Batman AND Robin, Batman OR Robin, etc. For details, see Search Grammar in the Search Developer's Guide.
1. If you have not already done so, create a QueryManager. The manager deals with interaction between the client and the database.
    QueryManager queryMgr = client.newQueryManager();
2. Create a StringQueryDefinition object. Use StringQueryDefinition.setCriteria() to specify your search string.
    StringQueryDefinition qd = queryMgr.newStringDefinition();
    qd.setCriteria("Batman AND Robin");
3. Call QueryManager.search() with the StringQueryDefinition object as an argument. Pass a SearchHandle object, or pass an XML or JSON handle to get the search results in either of those formats. Use one of the following:
    SearchHandle results = queryMgr.search(qd, new SearchHandle());
    DOMHandle results = queryMgr.search(qd, new DOMHandle());
    JacksonHandle results = queryMgr.search(qd, new JacksonHandle());
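When the search terms come from user input or a list, it can help to assemble the criteria string before passing it to setCriteria(). The following is a minimal sketch that uses only the JDK. The class and method names (StringCriteriaSketch, quoteIfPhrase, join) are hypothetical and are not part of the Java API, and the sketch assumes the default search grammar, in which a double-quoted string is treated as a phrase.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the MarkLogic Java API.
final class StringCriteriaSketch {

    // Wraps a term in double quotes when it contains whitespace, so that
    // the (assumed default) search grammar treats it as a single phrase.
    static String quoteIfPhrase(String term) {
        return term.matches(".*\\s.*") ? "\"" + term + "\"" : term;
    }

    // Joins the terms with an operator such as AND or OR.
    static String join(String operator, List<String> terms) {
        return terms.stream()
                .map(StringCriteriaSketch::quoteIfPhrase)
                .collect(Collectors.joining(" " + operator + " "));
    }

    public static void main(String[] args) {
        // prints: Batman AND Robin
        System.out.println(join("AND", List.of("Batman", "Robin")));
        // prints: Batman OR "Boy Wonder"
        System.out.println(join("OR", List.of("Batman", "Boy Wonder")));
    }
}
```

The resulting string could then be passed to StringQueryDefinition.setCriteria() as in step 2 of the procedure.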
Structured queries let you construct and modify complex queries in Java, XML, or JSON. For details, see Searching Using Structured Queries in the Search Developer's Guide. This section includes the following parts:
You can create a structured query in XML, in JSON, or using the StructuredQueryBuilder or PojoQueryBuilder interfaces in the Java API.
To specify a structured query directly in XML or JSON, use RawStructuredQueryDefinition; for details, see Creating a Structured Query From Raw XML or JSON. If you construct a structured query directly, it is up to you to make sure the query is constructed correctly. Incorrectly constructed queries can result in syntax errors, a query that does not do what you expect, or other exceptions. For syntax details, see Searching Using Structured Queries in the Search Developer's Guide.
The StructuredQueryBuilder interface in the Java API enables you to build a structured query one piece at a time in Java. The PojoQueryBuilder interface is similar, but you use it specifically for searching persistent POJOs; for details, see Searching POJOs in the Database.
The following are the basic steps needed to define a structured query definition in the Java API. This procedure creates a structured query definition using StructuredQueryBuilder. You can also create one directly in XML/JSON; for details, see Creating a Structured Query From Raw XML or JSON.
1. If you have not already done so, create a QueryManager. The manager deals with interaction between the client and the database.
    QueryManager queryMgr = client.newQueryManager();
2. Create a StructuredQueryBuilder, optionally passing in the name of persistent query options to use with your search.
    StructuredQueryBuilder qb = new StructuredQueryBuilder(OPTIONS_NAME);
3. Use the builder to create a StructuredQueryDefinition object with the desired search criteria.
    StructuredQueryDefinition querydef =
        qb.and(qb.term("neighborhood"),
               qb.valueConstraint("industry", "Real Estate"));
4. Call QueryManager.search() with the StructuredQueryDefinition object as an argument, returning a result handle:
    SearchHandle results = queryMgr.search(querydef, new SearchHandle());
To create a structured query from a raw XML or JSON representation, use any handle class that implements com.marklogic.client.io.marker.StructureWriteHandle.
The Java API includes StructureWriteHandle implementations that support creating a structure in XML or JSON from a string (StringHandle), a file (FileHandle), a stream (InputStreamHandle), and popular abstractions (DOMHandle, DOM4JHandle, JDOMHandle). For a complete list of implementations, see the Java API JavaDoc.
Follow this procedure to create a structured query using a handle:
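1. If you have not already done so, create a QueryManager. The manager deals with interaction between the client and the database.
    QueryManager queryMgr = client.newQueryManager();
2. Create the raw XML or JSON representation of the query. For example, use a String for the raw representation:
String rawXMLQuery =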
"<search:query "+
"xmlns:search='http://marklogic.com/appservices/search'>"+
"<search:term-query>"+
"<search:text>neighborhoods</search:text>"+
"</search:term-query>"+
"<search:value-constraint-query>"+
"<search:constraint-name>industry</search:constraint-name>"+
"<search:text>Real Estate</search:text>"+
"</search:value-constraint-query>"+
"</search:query>";
String rawJSONQuery =
"{\"query\": {" +
" \"term-query\": {" +
" \"text\": \"neighborhoods\"" +
" }," +
" \"value-constraint-query\": {" +
" \"constraint-name\": \"industry\"," +
" \"text\": \"Real Estate\"" +
" }" +
"}" +
"}";
3. Create a handle on the raw query using a class that implements StructureWriteHandle. Set the handle content format appropriately. For example:
    // For an XML query
    StringHandle rawHandle =
        new StringHandle(rawXMLQuery).withFormat(Format.XML);
    // For a JSON query
    StringHandle rawHandle =
        new StringHandle(rawJSONQuery).withFormat(Format.JSON);
4. Create a RawStructuredQueryDefinition from the handle. Optionally, include the name of persistent query options. For example:
    // Use the default persistent query options
    RawStructuredQueryDefinition querydef =
        queryMgr.newRawStructuredQueryDefinition(rawHandle);
    // Use the persistent options previously saved as "myoptions"
    RawStructuredQueryDefinition querydef =
        queryMgr.newRawStructuredQueryDefinition(rawHandle, "myoptions");
5. Perform the search by calling QueryManager.search() with the RawStructuredQueryDefinition and a results handle.
    SearchHandle resultsHandle = queryMgr.search(querydef, new SearchHandle());
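Because a raw query is not checked when you build it, a quick local well-formedness check can catch mistakes before the query is sent. The following sketch uses only the JDK XML parser. The class RawQueryCheck and its methods are hypothetical names introduced for illustration, and the check covers XML well-formedness only; it does not verify that the query is a valid structured query.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the MarkLogic Java API.
final class RawQueryCheck {
    static final String SEARCH_NS = "http://marklogic.com/appservices/search";

    // Escapes characters that are unsafe in XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Builds the raw structured query from the procedure above:
    // a term query plus a value constraint query.
    static String buildQuery(String term, String constraintName, String value) {
        return "<search:query xmlns:search='" + SEARCH_NS + "'>"
             + "<search:term-query>"
             +   "<search:text>" + escape(term) + "</search:text>"
             + "</search:term-query>"
             + "<search:value-constraint-query>"
             +   "<search:constraint-name>" + escape(constraintName) + "</search:constraint-name>"
             +   "<search:text>" + escape(value) + "</search:text>"
             + "</search:value-constraint-query>"
             + "</search:query>";
    }

    // Parses the XML; throws IllegalArgumentException if it is not well formed.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Raw query is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(buildQuery("neighborhoods", "industry", "Real Estate"));
        // prints: query
        System.out.println(doc.getDocumentElement().getLocalName());
    }
}
```

A string that passes this check could then be wrapped in a StringHandle with Format.XML as in step 3.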
This section shows some structured query examples, pairing the Java code that uses StructuredQueryBuilder with the equivalent XML and JSON structured query. You can put each of these examples in context by inserting the StructuredQueryDefinition line in the following code:
QueryManager queryMgr = dbClient.newQueryManager();
StructuredQueryBuilder sb = queryMgr.newStructuredQueryBuilder("myopt");

// put code from examples here
StructuredQueryDefinition criteria = ... example of building query definition ...
// end code from examples

// get() returns the handle content, so the result is a String
String searchResults = queryMgr.search(criteria, new StringHandle()).get();
Additionally, these examples use query options from the following code:
String xmlOptions =
"<search:options " +
"xmlns:search='http://marklogic.com/appservices/search'>" +
"<search:constraint name='date'>" +
"<search:range type='xs:date'>" +
"<search:element name='date' ns='http://purl.org/dc/elements/1.1/'/>" +
"</search:range>" +
"</search:constraint>" +
"<search:constraint name='popularity'>" +
"<search:range type='xs:int'>" +
"<search:element name='popularity' ns=''/>" +
"</search:range>" +
"</search:constraint>" +
"<search:constraint name='title'>" +
"<search:word>" +
"<search:element name='title' ns=''/>" +
"</search:word>" +
"</search:constraint>" +
"<search:return-results>true</search:return-results>" +
"<search:transform-results apply='raw' />" +
"</search:options>";
// JSON equivalent
String jsonOptions =
"{\"options\":{" +
" \"constraint\": [" +
" {" +
" \"name\": \"date\"," +
" \"range\": {" +
" \"type\":\"xs:date\", " +
" \"element\": {" +
" \"name\": \"date\"," +
" \"ns\": \"http://purl.org/dc/elements/1.1/\"" +
" }" +
" }" +
" }," +
" {" +
" \"name\": \"popularity\"," +
" \"range\": {" +
" \"type\":\"xs:int\", " +
" \"element\": {" +
" \"name\": \"popularity\"," +
" \"ns\": \"\"" +
" }" +
" }" +
" }," +
" {" +
" \"name\": \"title\"," +
" \"word\": {" +
" \"element\": {" +
" \"name\": \"title\"," +
" \"ns\": \"\"" +
" }" +
" }" +
" }" +
" ]," +
" \"return-results\": \"true\"," +
" \"transform-results\": {" +
" \"apply\": \"raw\"" +
" }" +
"}}";
QueryOptionsManager optionsMgr =
dbClient.newServerConfigManager().newQueryOptionsManager();
optionsMgr.writeOptions("myopt",
new StringHandle(xmlOptions).withFormat(Format.XML));
// Or, with jsonOptions:
// optionsMgr.writeOptions("myopt",
//     new StringHandle(jsonOptions).withFormat(Format.JSON));
This section contains the following examples:
For the boilerplate code environment in which this example runs, see the code snippet in Structured Query Examples.
The following example defines a query that searches for the "2005-01-01" value in the date range index.
StructuredQueryDefinition criteria =
    sb.rangeConstraint("date", Operator.EQ, "2005-01-01");

/* XML equivalent
<search:query xmlns:search="http://marklogic.com/appservices/search">
  <search:range-constraint-query>
    <search:constraint-name>date</search:constraint-name>
    <search:value>2005-01-01</search:value>
  </search:range-constraint-query>
</search:query>
*/

/* JSON equivalent
{"query": {
  "range-constraint-query": {
    "constraint-name": "date",
    "value": "2005-01-01"
  }
}}
*/
For the boilerplate code environment in which this example runs, see the code snippet in Structured Query Examples.
The following example defines a query that searches for the word "Bush" using the title word constraint.
StructuredQueryDefinition criteria = sb.wordConstraint("title", "Bush");

/* XML equivalent
<search:query xmlns:search="http://marklogic.com/appservices/search">
  <search:word-constraint-query>
    <search:constraint-name>title</search:constraint-name>
    <search:text>Bush</search:text>
  </search:word-constraint-query>
</search:query>
*/

/* JSON equivalent
{"query": {
  "word-constraint-query": {
    "constraint-name": "title",
    "text": "Bush"
  }
}}
*/
For the boilerplate code environment in which this example runs, see the code snippet in Structured Query Examples.
The following example defines a query that searches for the "hello" term in the value of any property.
StructuredQueryDefinition criteria = sb.properties(sb.term("hello"));

/* XML equivalent
<search:query xmlns:search="http://marklogic.com/appservices/search">
  <search:properties-fragment-query>
    <search:term-query>
      <search:text>hello</search:text>
    </search:term-query>
  </search:properties-fragment-query>
</search:query>
*/

/* JSON equivalent
{"query": {
  "properties-fragment-query": {
    "term-query": {
      "text": "hello"
    }
  }
}}
*/
For the boilerplate code environment in which this example runs, see the code snippet in Structured Query Examples.
The following example defines a query that searches for documents in the "http://testdoc/doc6/" directory.
StructuredQueryDefinition criteria =
    sb.directory(true, "http://testdoc/doc6/");

/* XML equivalent
<search:query xmlns:search="http://marklogic.com/appservices/search">
  <search:directory-query>
    <search:uri>
      <search:text>http://testdoc/doc6/</search:text>
    </search:uri>
  </search:directory-query>
</search:query>
*/

/* JSON equivalent
{"query": {
  "directory-query": {
    "uri": {
      "text": "http://testdoc/doc6/"
    }
  }
}}
*/
For the boilerplate code environment in which this example runs, see the code snippet in Structured Query Examples.
The following example defines a query that searches for the "http://testdoc/doc2" document.
StructuredQueryDefinition criteria = sb.document("http://testdoc/doc2");

/* XML equivalent
<search:query xmlns:search="http://marklogic.com/appservices/search">
  <search:document-query>
    <search:uri>
      <search:text>http://testdoc/doc2</search:text>
    </search:uri>
  </search:document-query>
</search:query>
*/

/* JSON equivalent
{"query": {
  "document-query": {
    "uri": {
      "text": "http://testdoc/doc2"
    }
  }
}}
*/
For the boilerplate code environment in which this example runs, see the code snippet in Structured Query Examples.
The following example defines a query that searches for documents containing a JSON property named myProp whose value contains the term theValue.
StructuredQueryDefinition criteria =
    sb.containerQuery(sb.jsonProperty("myProp"), sb.term("theValue"));

/* XML equivalent
<search:query xmlns:search="http://marklogic.com/appservices/search">
  <search:container-query>
    <search:json-property>myProp</search:json-property>
    <search:term-query>
      <search:text>theValue</search:text>
    </search:term-query>
  </search:container-query>
</search:query>
*/

/* JSON equivalent
{"query": {
  "container-query": {
    "json-property": "myProp",
    "term-query": {
      "text": "theValue"
    }
  }
}}
*/
For the boilerplate code environment in which this example runs, see the code snippet in Structured Query Examples.
The following example defines a query that searches for documents belonging to the "http://test.com/set3/set3-1" collection.
StructuredQueryDefinition criteria =
    sb.collection("http://test.com/set3/set3-1");

/* XML equivalent
<search:query xmlns:search="http://marklogic.com/appservices/search">
  <search:collection-query>
    <search:uri>
      <search:text>http://test.com/set3/set3-1</search:text>
    </search:uri>
  </search:collection-query>
</search:query>
*/

/* JSON equivalent
{"query": {
  "collection-query": {
    "uri": {
      "text": "http://test.com/set3/set3-1"
    }
  }
}}
*/
This section describes how to use the Java API to perform a search using a Query By Example (QBE). A QBE enables rapid prototyping of queries for "documents that look like this", using search criteria that resemble the structure of documents in your database. If you are not familiar with QBE, see Searching Using Query By Example in the Search Developer's Guide.
This section covers the following topics:
A Query By Example (QBE) enables rapid prototyping of queries for "documents that look like this", using search criteria that resemble the structure of documents in your database. If you are not familiar with QBE, see Searching Using Query By Example in the Search Developer's Guide.
If your documents include an author XML element or JSON property, you can use the following example QBE to find documents with an author value of Mark Twain.
Format | Example |
---|---|
XML | <q:qbe xmlns:q="http://marklogic.com/appservices/querybyexample"> <q:query> <author>Mark Twain</author> </q:query> </q:qbe> |
JSON | { "$query": { "author": "Mark Twain" } } |
You can only use QBE to search XML and JSON documents. Metadata search is not supported. You can search by element, element attribute, and JSON property; fields are not supported. For details, see Searching Using Query By Example in the Search Developer's Guide.
A QBE is represented by com.marklogic.client.query.RawQueryByExampleDefinition in the Java API. Operations on a QBE are performed through a QueryManager.
To create a QBE from a raw XML or JSON representation, use any handle class that implements com.marklogic.client.io.marker.StructureWriteHandle to create a RawQueryByExampleDefinition.
The Java API includes StructureWriteHandle implementations that support creating a structure in XML or JSON from a string (StringHandle), a file (FileHandle), a stream (InputStreamHandle), and popular abstractions (DOMHandle, DOM4JHandle, JDOMHandle). For a complete list of implementations, see the Java API JavaDoc.
Follow this procedure to create a QBE and use it in a search:
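1. If you have not already done so, create a QueryManager. The manager deals with interaction between the client and the database.
    QueryManager queryMgr = client.newQueryManager();
2. Create the raw XML or JSON representation of the QBE. For example, use a String for the raw representation:
String rawXMLQuery =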
"<q:qbe xmlns:q='http://marklogic.com/appservices/querybyexample'>"+
"<q:query>" +
"<author>Mark Twain</author>" +
"</q:query>" +
"</q:qbe>";
//Or
String rawJSONQuery =
"{" +
"\"$query\": { \"author\": \"Mark Twain\" }" +
"}";
3. Create a handle using a class that implements StructureWriteHandle, set the handle content format, and associate your query with the handle. For example:
    // For a query expressed as XML
    StringHandle rawHandle =
        new StringHandle(rawXMLQuery).withFormat(Format.XML);
    // For a query expressed as JSON
    StringHandle rawHandle =
        new StringHandle(rawJSONQuery).withFormat(Format.JSON);
4. Create a RawQueryByExampleDefinition from the handle. Optionally, include the name of persistent query options. For example:
    // Use the default persistent query options
    RawQueryByExampleDefinition querydef =
        queryMgr.newRawQueryByExampleDefinition(rawHandle);
    // Use the persistent options previously saved as "myoptions"
    RawQueryByExampleDefinition querydef =
        queryMgr.newRawQueryByExampleDefinition(rawHandle, "myoptions");
5. Perform the search by calling QueryManager.search() with the RawQueryByExampleDefinition and a results handle.
    SearchHandle resultsHandle = queryMgr.search(querydef, new SearchHandle());
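Hand-written JSON strings with nested escapes are easy to get wrong. The following sketch builds the simple JSON QBE from the procedure above for an arbitrary property name and value, using only the JDK. The class QbeSketch and its methods are hypothetical names, and the escaping handles only backslashes and double quotes; control characters and other cases would need a real JSON library.

```java
// Hypothetical helper, not part of the MarkLogic Java API.
final class QbeSketch {

    // Escapes backslashes and double quotes for use inside a JSON string.
    // Backslashes are handled first so that added escapes are not doubled.
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds a JSON QBE that matches documents in which the named
    // property has the given value.
    static String jsonQbe(String property, String value) {
        return "{\"$query\": {\"" + jsonEscape(property) + "\": \""
             + jsonEscape(value) + "\"}}";
    }

    public static void main(String[] args) {
        // prints: {"$query": {"author": "Mark Twain"}}
        System.out.println(jsonQbe("author", "Mark Twain"));
    }
}
```

The returned string could be wrapped in a StringHandle with Format.JSON as in step 3.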
When you perform a search, MarkLogic Server does not verify the correctness of your QBE. If your QBE is syntactically or semantically incorrect, you might get errors or surprising results. To avoid such issues, you can validate your QBE.
To validate a QBE, construct a query as described in Search Documents Using a QBE, and then pass it to QueryManager.validate() instead of QueryManager.search(). The validation report is returned in a StructureReadHandle. For example:
StringHandle validationReport = queryMgr.validate(qbeDefn, new StringHandle());
The report can be in XML or JSON format, depending on the format of the input query and the format you set on the handle. By default, validation returns a JSON report for a JSON input query and an XML report for an XML input query. You can override this behavior using the withFormat() method of your response handle.
Generating a combined query from a QBE has the following potential benefits:
A combined query combines a structured query and query options into a single XML or JSON query. For details, see Apply Dynamic Query Options to Document Searches.
To generate a combined query from a QBE, construct a query as described in Search Documents Using a QBE, and then pass it to QueryManager.convert() instead of QueryManager.search(). The results are returned in a StructureReadHandle. For example:
StringHandle combinedQueryHandle = queryMgr.convert(qbeDefn, new StringHandle());
The resulting handle can be used to construct a RawCombinedQueryDefinition; for details, see Searching Using Combined Query.
For more details on the query component of a combined query, see Searching Using Structured Queries in Search Developer's Guide.
You can use a combined query to specify query options at query time, without first persisting them as named options. A combined query is an XML or JSON wrapper around a string query and/or a structured, cts, or QBE query, plus query options.
The Java Client API does not support using a QBE in a combined query at this time. Use a standalone QBE and persistent query options instead.
This section covers the following topics:
Combined queries are useful for rapid prototyping during development and for applications that need to modify query options on a per query basis. The RawCombinedQueryDefinition class represents a combined query in the Java API.
You can only create a combined query from raw XML or JSON; there is no builder class. A combined query can contain the following components, all optional: a string query, a structured query or cts query, and query options.
If you include both a string query and a structured query or cts query, the two queries are AND'd together.
For example, the following raw combined query uses a string query and a structured query to match all documents where the TITLE element contains the word henry and the term fourth. The options embedded in the query suppress the generation of snippets and extract just the /PLAY/TITLE element from the matched documents.
For syntax details, see Syntax and Semantics in the REST Application Developer's Guide.
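As a rough illustration of the shape such a query takes, the following sketch assembles an XML combined query from the three parts described above and checks that it is well formed. It uses only the JDK and is not the original example from this guide. The class CombinedQuerySketch is a hypothetical name, and the option elements used here (transform-results with apply="empty-snippet", and extract-document-data with an extract-path) are assumptions about how to express "no snippets, extract /PLAY/TITLE"; check them against the query options reference before relying on them.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch, not part of the MarkLogic Java API.
final class CombinedQuerySketch {
    static final String SEARCH_NS = "http://marklogic.com/appservices/search";

    // String query portion: the term "fourth".
    static final String QTEXT = "<qtext>fourth</qtext>";

    // Structured query portion: the word "henry" in a TITLE element.
    static final String STRUCTURED_QUERY =
        "<query>"
      +   "<word-query>"
      +     "<element name=\"TITLE\" ns=\"\"/>"
      +     "<text>henry</text>"
      +   "</word-query>"
      + "</query>";

    // Options portion (assumed option names): suppress snippets and
    // extract only the /PLAY/TITLE element from each match.
    static final String OPTIONS =
        "<options>"
      +   "<transform-results apply=\"empty-snippet\"/>"
      +   "<extract-document-data selected=\"include\">"
      +     "<extract-path>/PLAY/TITLE</extract-path>"
      +   "</extract-document-data>"
      + "</options>";

    // Wraps the three parts in the search element.
    static String combinedQuery() {
        return "<search xmlns=\"" + SEARCH_NS + "\">"
             + QTEXT + STRUCTURED_QUERY + OPTIONS
             + "</search>";
    }

    // Parses the XML; throws IllegalArgumentException if it is not well formed.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Combined query is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(combinedQuery());
        // prints: search has 3 parts
        System.out.println(doc.getDocumentElement().getLocalName() + " has "
            + doc.getDocumentElement().getChildNodes().getLength() + " parts");
    }
}
```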
Since there is no builder for RawCombinedQueryDefinition, you must construct the contents by hand, associate a handle with the contents, and then attach the handle to a RawCombinedQueryDefinition object. For example:
RawCombinedQueryDefinition xmlCombo =
    qm.newRawCombinedQueryDefinition(new StringHandle().with(
        // your raw XML combined query here
    ).withFormat(Format.XML));

RawCombinedQueryDefinition jsonCombo =
    qm.newRawCombinedQueryDefinition(new StringHandle().with(
        // your raw JSON combined query here
    ).withFormat(Format.JSON));
For more complete examples, see Combined Query Examples.
Use any handle class that implements com.marklogic.client.io.marker.StructureWriteHandle. The Java API includes StructureWriteHandle implementations that support creating a structure in XML or JSON from input sources such as a string (StringHandle), a file (FileHandle), a stream (InputStreamHandle), and popular abstractions (DOMHandle, DOM4JHandle, JDOMHandle). For a complete list of implementations, see the Java Client API Documentation.
Though there is no builder for combined queries, you can use StructuredQueryBuilder to create the structured query portion of a combined query; for details, see Creating a Combined Query Using StructuredQueryBuilder.
The following procedure provides more detailed instructions for binding a handle on the raw representation of a combined query to a RawCombinedQueryDefinition object usable for searching.
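1. If you have not already done so, create a QueryManager. The manager deals with interaction between the client and the database. For example:
    QueryManager queryMgr = client.newQueryManager();
2. Create the raw XML or JSON representation of the combined query. For example, the following code uses a String for the raw representation of a combined query that contains a structured query:
String rawXMLQuery =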
"<search:search "+
"xmlns:search='http://marklogic.com/appservices/search'>"+
"<search:query>"+
"<search:term-query>"+
"<search:text>neighborhoods</search:text>"+
"</search:term-query>"+
"<search:value-constraint-query>"+
"<search:constraint-name>industry</search:constraint-name>"+
"<search:text>Real Estate</search:text>"+
"</search:value-constraint-query>"+
"</search:query>"+
"<search:options>"+
"<search:constraint name='industry'>"+
"<search:value>"+
"<search:element name='industry' ns=''/>"+
"</search:value>"+
"</search:constraint>"+
"</search:options>"+
"</search:search>";
//Or
String rawJSONQuery =
"{\"search\":{" +
" \"query\": {" +
" \"term-query\": {" +
" \"text\": \"neighborhoods\"" +
" }," +
" \"value-constraint-query\": {" +
" \"constraint-name\": \"industry\"," +
" \"text\": \"Real Estate\"" +
" }" +
" }," +
" \"options\": {" +
" \"constraint\": {" +
" \"name\": \"industry\"," +
" \"value\": {" +
" \"element\": {" +
" \"name\": \"industry\"," +
" \"ns\": \"\"" +
" }" +
" }" +
" }" +
" }" +
"}" +
"}";
3. Create a handle on the raw query using a class that implements StructureWriteHandle, and set the handle content format. For example:
    // Query as XML
    StringHandle rawHandle =
        new StringHandle().withFormat(Format.XML).with(rawXMLQuery);
    // Query as JSON
    StringHandle rawHandle =
        new StringHandle().withFormat(Format.JSON).with(rawJSONQuery);
4. Create a RawCombinedQueryDefinition from the handle. Optionally, include the name of persistent query options. For example:
    // Use the default persistent query options
    RawCombinedQueryDefinition querydef =
        queryMgr.newRawCombinedQueryDefinition(rawHandle);
    // Use persistent options previously saved as "myoptions"
    RawCombinedQueryDefinition querydef =
        queryMgr.newRawCombinedQueryDefinition(rawHandle, "myoptions");
5. Perform the search by calling QueryManager.search() with the RawCombinedQueryDefinition and a results handle.
    SearchHandle resultsHandle = queryMgr.search(querydef, new SearchHandle());
For a complete example of searching with a combined query, see com.marklogic.client.example.cookbook.RawCombinedSearch in the example/ directory of your Java API installation.
When building a RawCombinedQueryDefinition that contains a structured query, you can use StructuredQueryBuilder to create the structured query portion of the combined query. This technique always produces an XML combined query.
Create a StructuredQueryDefinition using StructuredQueryBuilder, just as you would when searching with a standalone structured query. Then, extract the serialized structured query using StructuredQueryDefinition.serialize(), and embed it in your combined query. For example:
QueryManager qm = client.newQueryManager();
StructuredQueryBuilder qb = qm.newStructuredQueryBuilder();
StructuredQueryDefinition structuredQuery =
    qb.word(qb.element("TITLE"), "henry");

// Embed the serialized structured query in the combined query wrapper
String comboq =
    "<search xmlns=\"http://marklogic.com/appservices/search\">" +
    structuredQuery.serialize() +
    "</search>";
RawCombinedQueryDefinition query =
    qm.newRawCombinedQueryDefinition(
        new StringHandle(comboq).withFormat(Format.XML));
You can also include a string query and/or query options in your combined query. For a more complete example, see Combined Query Examples.
Dynamic query options supplied in a combined query are merged with persistent and default options that are in effect for the search. If the same non-constraint option is specified in both the combined query and persistent options, the setting in the combined query takes precedence.
Constraints are overridden by name. That is, if the dynamic and persistent options contain a <constraint/> element with the same name attribute, the definition in the dynamic query options is the one that applies to the query. Two constraints with different names are both merged into the final options.
For example, suppose the following query options are installed under the name my-options:
<options xmlns="http://marklogic.com/appservices/search">
  <fragment-scope>properties</fragment-scope>
  <return-metrics>false</return-metrics>
  <constraint name="same">
    <collection prefix="http://server.com/persistent/"/>
  </constraint>
  <constraint name="not-same">
    <element-query name="title" ns="http://my/namespace" />
  </constraint>
</options>
Further, suppose you use the following raw XML combined query to define dynamic query options:
<search xmlns="http://marklogic.com/appservices/search">
  <options>
    <return-metrics>true</return-metrics>
    <debug>true</debug>
    <constraint name="same">
      <collection prefix="http://server.com/dynamic/"/>
    </constraint>
    <constraint name="different">
      <element-query name="scene" ns="http://my/namespace" />
    </constraint>
  </options>
</search>
You can create a RawCombinedQueryDefinition that encapsulates the combined query and the persistent options:
StringHandle rawQueryHandle =
    new StringHandle(...).withFormat(Format.XML);
RawCombinedQueryDefinition querydef =
    queryMgr.newRawCombinedQueryDefinition(rawQueryHandle, "my-options");
The query is evaluated with the following merged options. The persistent options contribute the fragment-scope option and the constraint named not-same. The dynamic options in the combined query contribute the return-metrics and debug options and the constraints named same and different. The return-metrics setting and the constraint named same from my-options are discarded.
<options xmlns="http://marklogic.com/appservices/search">
  <fragment-scope>properties</fragment-scope>
  <return-metrics>true</return-metrics>
  <debug>true</debug>
  <constraint name="same">
    <collection prefix="http://server.com/dynamic/"/>
  </constraint>
  <constraint name="different">
    <element-query name="scene" ns="http://my/namespace" />
  </constraint>
  <constraint name="not-same">
    <element-query name="title" ns="http://my/namespace" />
  </constraint>
</options>
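The precedence rule can be modeled locally as a map overlay. The following sketch is only an emulation of the rule described above, written with the JDK; the actual merge is performed by the server, and the class OptionsMergeSketch and the string values standing in for constraint definitions are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of the merge rule; the real merge happens on the server.
final class OptionsMergeSketch {

    // Starts from the persistent entries, then overlays the dynamic ones,
    // so a dynamic entry replaces a persistent entry with the same key.
    static Map<String, String> merge(Map<String, String> persistent,
                                     Map<String, String> dynamic) {
        Map<String, String> merged = new LinkedHashMap<>(persistent);
        merged.putAll(dynamic);
        return merged;
    }

    public static void main(String[] args) {
        // Non-constraint options, keyed by option name.
        Map<String, String> persistentOptions = new LinkedHashMap<>();
        persistentOptions.put("fragment-scope", "properties");
        persistentOptions.put("return-metrics", "false");
        Map<String, String> dynamicOptions = new LinkedHashMap<>();
        dynamicOptions.put("return-metrics", "true");
        dynamicOptions.put("debug", "true");

        // Constraints, keyed by the value of their name attribute.
        Map<String, String> persistentConstraints = new LinkedHashMap<>();
        persistentConstraints.put("same", "collection http://server.com/persistent/");
        persistentConstraints.put("not-same", "element-query title");
        Map<String, String> dynamicConstraints = new LinkedHashMap<>();
        dynamicConstraints.put("same", "collection http://server.com/dynamic/");
        dynamicConstraints.put("different", "element-query scene");

        // prints: {fragment-scope=properties, return-metrics=true, debug=true}
        System.out.println(merge(persistentOptions, dynamicOptions));
        // The "same" constraint now carries the dynamic definition, while
        // "not-same" and "different" are both kept.
        System.out.println(merge(persistentConstraints, dynamicConstraints));
    }
}
```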
The examples in this section demonstrate constructing different types of combined queries using the Java Client API. The example queries are constructed as in-memory strings to keep the example self-contained, but you could just as easily read them from a file or other external source.
Unless otherwise noted, the examples all use equivalent queries and query options. The query is a word query on the term henry where it appears in a TITLE element, AND'd with a string query for the term fourth.
The examples also share the scaffolding in Shared Scaffolding for Combined Query Examples, which defines the query options and drives the search. However, the primary point of the examples is the query construction.
See the following topics for example code:
The following two functions perform a search using a combined query that contains a string query, a structured query, and query options.
The first function expresses the query in XML, using StructuredQueryBuilder to create the structured query portion of the combined query. The second function expresses the query in JSON. Both functions use the options and search driver from Shared Scaffolding for Combined Query Examples.
// Use a combined query containing a structured query, string query,
// and query options. A StructuredQueryBuilder is used to create the
// structured query portion. The combined query is expressed as XML.
public static void withXmlStructuredQuery() {
    StructuredQueryBuilder qb = new StructuredQueryBuilder();
    StructuredQueryDefinition builtSQ =
        qb.word(qb.element("TITLE"), "henry");
    System.out.println("** Searching with an XML structured query...");
    doSearch(new StringHandle().with(
        "<search xmlns=\"http://marklogic.com/appservices/search\">" +
        "<qtext>fourth</qtext>" +
        builtSQ.serialize() +
        XML_OPTIONS +
        "</search>").withFormat(Format.XML));
}

// Use a combined query containing a structured query, string query,
// and query options. The combined query is expressed as JSON.
public static void withJsonStructuredQuery() {
    System.out.println("** Searching with a JSON structured query...");
    doSearch(new StringHandle().with(
        "{\"search\" : {" +
          "\"query\": {" +
            "\"word-query\": {" +
              "\"element\": { \"name\": \"TITLE\"}," +
              "\"text\": [ \"henry\" ]" +
            "}" +
          "}, " +
          "\"qtext\": \"fourth\"," +
          JSON_OPTIONS +
        "} }").withFormat(Format.JSON));
}
The following two functions perform a search using a combined query that contains a string query, a cts query, and query options.
The first function expresses the query in XML. The second function expresses the query in JSON. Both functions use the options and search driver from Shared Scaffolding for Combined Query Examples.
// Use a combined query containing a cts query, string query,
// and query options. The combined query is expressed as XML.
public static void withXmlCtsQuery() {
    System.out.println("** Searching with an XML cts query...");
    doSearch(new StringHandle().with(
        "<search xmlns=\"http://marklogic.com/appservices/search\">" +
        "<cts:element-word-query xmlns:cts=\"http://marklogic.com/cts\">" +
        "<cts:element>TITLE</cts:element>" +
        "<cts:text xml:lang=\"en\">henry</cts:text>" +
        "</cts:element-word-query>" +
        "<qtext>fourth</qtext>" +
        XML_OPTIONS +
        "</search>").withFormat(Format.XML));
}

// Use a combined query containing a cts query, string query,
// and query options. The combined query is expressed as JSON.
public static void withJsonCtsQuery() {
    System.out.println("** Searching with a JSON cts query...");
    doSearch(new StringHandle().with(
        "{\"search\" : {" +
        "\"ctsquery\": {" +
        "\"elementWordQuery\": {" +
        "\"element\" : [\"TITLE\"]," +
        "\"text\" : [\"henry\"]," +
        "\"options\" : [\"lang=en\"]" +
        "}" +
        "}, " +
        "\"qtext\": \"fourth\"," +
        JSON_OPTIONS +
        "} }").withFormat(Format.JSON));
}
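The XML combined queries in these examples all wrap the same three parts in a search element: a string query (qtext), a serialized structured or cts query, and an options element. As a minimal sketch of that envelope, the hypothetical helper below (CombinedQueryText is not part of the Java API; the class and method names are assumptions made for illustration) assembles the text and escapes the string query so that characters such as & and < do not break the XML.

```java
// Hypothetical helper, not part of the MarkLogic Java API: assembles the
// XML text of a combined query from its three parts.
final class CombinedQueryText {
    static final String SEARCH_NS = "http://marklogic.com/appservices/search";

    // Escape the characters that are significant in XML character data.
    static String escapeXml(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // qtext:      the string query text
    // queryXml:   an already serialized structured or cts query ("" to omit)
    // optionsXml: an already serialized options element ("" to omit)
    static String xml(String qtext, String queryXml, String optionsXml) {
        return "<search xmlns=\"" + SEARCH_NS + "\">"
             + "<qtext>" + escapeXml(qtext) + "</qtext>"
             + queryXml
             + optionsXml
             + "</search>";
    }

    public static void main(String[] args) {
        // Print a combined query that contains only a string query.
        System.out.println(xml("fourth AND \"henry & co\"", "", ""));
    }
}
```

The resulting string can be wrapped in a StringHandle with Format.XML and passed to doSearch in the same way as the literal strings used by the examples.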
The examples in Combined Query Examples share the scaffolding in this section for connecting to MarkLogic, defining query options, performing a search, and displaying the search results.
The query options are designed to strip down the search results into something easy for the example code to process while still emitting simple but meaningful output. This is done by suppressing snippeting and using the extract-document-data option to return just the TITLE element from the matches.
The doSearch method performs the search, independent of the structure of the combined query, and prints out the matched titles. The result processing shown depends heavily on the query options and the structure of the example documents.
package examples;

import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.io.Format;
import com.marklogic.client.io.SearchHandle;
import com.marklogic.client.io.StringHandle;
import com.marklogic.client.io.marker.StructureWriteHandle;
import com.marklogic.client.query.ExtractedItem;
import com.marklogic.client.query.ExtractedResult;
import com.marklogic.client.query.MatchDocumentSummary;
import com.marklogic.client.query.QueryManager;
import com.marklogic.client.query.RawCombinedQueryDefinition;
import com.marklogic.client.query.StructuredQueryBuilder;
import com.marklogic.client.query.StructuredQueryDefinition;

public class CombinedQuery {
    // replace with your MarkLogic Server connection information
    static String HOST = "localhost";
    static int PORT = 8000;
    static String DATABASE = "bill";
    static String USER = "username";
    static String PASSWORD = "password";

    private static DatabaseClient client = DatabaseClientFactory.newClient(
        HOST, PORT, DATABASE,
        new DatabaseClientFactory.DigestAuthContext(USER, PASSWORD));

    // Define query options to be included in our raw combined query.
    static String XML_OPTIONS =
        "<options xmlns=\"http://marklogic.com/appservices/search\">" +
        "<extract-document-data>" +
        "<extract-path>/PLAY/TITLE</extract-path>" +
        "</extract-document-data>" +
        "<transform-results apply=\"empty-snippet\"/>" +
        "<search-option>filtered</search-option>" +
        "</options>";
    static String JSON_OPTIONS =
        "\"options\": {" +
        "\"extract-document-data\": {" +
        "\"extract-path\": \"/PLAY/TITLE\"" +
        "}," +
        "\"transform-results\": {" +
        "\"apply\": \"empty-snippet\"" +
        "}" +
        "}";

    // Perform a search using a combined query. The input handle is
    // assumed to contain an XML or JSON combined query.
    //
    // The combined query must contain either the XML_OPTIONS or
    // JSON_OPTIONS defined above. The options produce a
    // search:response in which each search:match has this form:
    //
    // <search:result index="n" uri="..." path="..." score="..."
    //     confidence="....4450079" fitness="0.5848901" href="..."
    //     mimetype="..." format="xml">
    //   <search:snippet/>
    //   <search:extracted kind="element">
    //     <TITLE>a title</TITLE>
    //   </search:extracted>
    // </search:result>
    //
    // XML DOM is used to extract the title text from the extracted elements.
    public static void doSearch(StructureWriteHandle queryHandle) {
        // Create a raw combined query
        QueryManager qm = client.newQueryManager();
        RawCombinedQueryDefinition query =
            qm.newRawCombinedQueryDefinition(queryHandle);

        // Perform the search
        SearchHandle results = qm.search(query, new SearchHandle());

        // Process the results, printing out the title of each match
        try {
            XPathExpression xpath = XPathFactory.newInstance()
                .newXPath().compile("//TITLE");
            for (MatchDocumentSummary match : results.getMatchResults()) {
                ExtractedResult extracted = match.getExtracted();
                if (!extracted.isEmpty()) {
                    for (ExtractedItem item : extracted) {
                        System.out.println(
                            xpath.evaluate(item.getAs(Document.class)));
                    }
                }
            }
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }
    }

    // with*Query methods go here

    public static void main(String[] args) {
        // call with*Query methods of interest to you
    }
}
Using persistent query options usually performs better than using dynamic query options. In most cases, the performance difference between the two methods is slight.
When MarkLogic Server processes a combined query, the per request query options must be parsed and merged with named and default options on every search. When you only use persistent named or default query options, you reduce this overhead.
If your application does not require dynamic per-request query options, you should use a QueryOptionsManager to persist your options under a name and associate the options with a simple StringQueryDefinition or StructuredQueryDefinition.
You can return values and tuples (co-occurrences) through the Java API. Value and tuple searches require that the appropriate range indexes be configured on your MarkLogic Server database. For background on values and co-occurrences, see Browsing With Lexicons in the Search Developer's Guide.
This section includes the following parts:
The following returns values through the Java API:
The following are the basic steps to search on values:
Create a QueryManager. The manager deals with interaction between the client and the database.
QueryManager queryMgr = client.newQueryManager();
Create a ValuesDefinition object using the query manager. In the following example, the parameters define a named values constraint (myvalue) defined in previously persisted query options (valuesoptions):
// build a search definition
ValuesDefinition vdef = queryMgr.newValuesDefinition("myvalue", "valuesoptions");
Optionally, call setAggregate() to set the name of the aggregate function to be applied as part of the query.
vdef.setAggregate("correlation", "covariance");
Perform the search by calling values() with the ValuesDefinition object as an argument, returning a ValuesHandle object. Note that the values search method is called values(), not search().
ValuesHandle results = queryMgr.values(vdef, new ValuesHandle());
You can retrieve results one page at a time by defining a page length and starting position with the QueryManager interface. For example, the following code snippet retrieves a page of 5 values beginning with the 10th value.
queryMgr.setPageLength(5);
ValuesHandle result = queryMgr.values(vdef, new ValuesHandle(), 10);
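When you page through all the values rather than jumping to one position, you need the starting position of each page. The following is a minimal sketch of that arithmetic, assuming the start position is 1-based, as the phrase "beginning with the 10th value" suggests. ValuesPaging and startOfPage are hypothetical names used only for illustration; they are not part of the Java API.

```java
// Hypothetical helper, not part of the MarkLogic Java API: computes the
// start position to pass as the third argument of values() or tuples().
final class ValuesPaging {
    // Return the position of the first item on the given page.
    // Assumption: pages and start positions are both numbered from 1.
    static long startOfPage(long pageNumber, long pageLength) {
        if (pageNumber < 1 || pageLength < 1) {
            throw new IllegalArgumentException(
                "page number and page length must be at least 1");
        }
        return (pageNumber - 1) * pageLength + 1;
    }

    public static void main(String[] args) {
        // With a page length of 5, pages start at positions 1, 6, and 11.
        for (long page = 1; page <= 3; page++) {
            System.out.println(
                "page " + page + " starts at " + startOfPage(page, 5));
        }
    }
}
```

The computed position would be passed as the start argument, for example queryMgr.values(vdef, new ValuesHandle(), ValuesPaging.startOfPage(3, 5)).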
For more information on values search concepts, see Returning Lexicon Values With search:values and Browsing With Lexicons in the Search Developer's Guide.
The following returns tuples (co-occurrences) through the Java API:
Create a QueryManager. The manager deals with interaction between the client and the database.
QueryManager queryMgr = client.newQueryManager();
Create a ValuesDefinition object using the query manager. In the following example, the parameters define a named tuples constraint (co) defined in previously persisted query options (tupleoptions):
// build a search definition
ValuesDefinition vdef = queryMgr.newValuesDefinition("co", "tupleoptions");
Perform the search by calling tuples() with the ValuesDefinition object as an argument, returning a TuplesHandle object. Note that the tuples search method is called tuples(), not search().
TuplesHandle results = queryMgr.tuples(vdef, new TuplesHandle());
You can retrieve results one page at a time by defining a page length and starting position with the QueryManager interface. For example, the following code snippet retrieves a page of 5 tuples beginning with the 10th one.
queryMgr.setPageLength(5);
TuplesHandle result = queryMgr.tuples(vdef, new TuplesHandle(), 10);
For more information on tuples search concepts, see Returning Lexicon Values With search:values and Browsing With Lexicons in the Search Developer's Guide.
You can constrain the results of a values or tuples query to only return values in documents matching the constraining query. The constraining query can be a string, structured, combined, or cts query.
To add a constraining query to a values or tuples query, construct the query definition as usual and then attach it to the values or tuples query using the ValuesDefinition.setQueryDefinition method.
The following example adds a constraining cts:query to a values query, assuming a set of query options are installed under the name valopts that defines a values option named title. Only values in documents matching the cts:element-word-query will be returned.
QueryManager qm = client.newQueryManager();

// Create a cts:query with which to constrain the values query result
String serializedQuery =
    "<cts:element-word-query xmlns:cts=\"http://marklogic.com/cts\">" +
    "<cts:element>TITLE</cts:element>" +
    "<cts:text xml:lang=\"en\">fourth</cts:text>" +
    "</cts:element-word-query>";
RawCtsQueryDefinition ctsquery = qm.newRawCtsQueryDefinition(
    new StringHandle(serializedQuery).withFormat(Format.XML));

// Create a values query and evaluate it
ValuesDefinition vdef = qm.newValuesDefinition("title", "valopts");
vdef.setQueryDefinition(ctsquery);
ValuesHandle results = qm.values(vdef, new ValuesHandle());
All query definition interfaces have setCollections() and setDirectory() methods. By calling setDirectory(directory_URI_string) on your query definition, you limit your search to that directory. By calling setCollections(list_of_collection_name_strings) on your query definition, you limit your search to those collections. You can call both and limit your search to collections and a single directory.
Values metadata, sometimes called key-value metadata, can only be searched if you define a metadata field on the keys you want to search. Once you define a field on a metadata key, use the normal field search capabilities to include a metadata field in your search. For example, you can use a cts:field-word-query or a structured query word-query on a metadata field, or define a constraint on the field and use the constraint in a string query.
For more details, see Metadata Fields in the Administrator's Guide. For some examples, see Example: Structured Search on Key-Value Metadata Fields or Searching Key-Value Metadata Fields in the Search Developer's Guide.
You can make arbitrary changes to the results of a search or values query by applying a server-side transformation function to the results. This section covers the following topics:
Search response transforms use the same interface and framework as content transformations applied during document ingestion, described in Writing Transformations in the REST Application Developer's Guide.
Your transform function receives the XML or JSON search response prepared by MarkLogic Server in the content parameter. For example, if the response is XML, then the content passed to your transform is a document node with a <search:response/> root element. Any customizations made by the transform-results query option or result decorators are applied before calling your transform function.
You can probe the document type to test whether the input to your transform is JSON or XML. For example, in server-side JavaScript, you can test the documentFormat property of a document node:
function myTransform(context, params, content) {
  if (content.documentFormat == "JSON") {
    // handle as JSON or a JavaScript object
  } else {
    // handle as XML
  }
  ...
}
In XQuery and XSLT, you can test the node kind of the root of the document, which will be element for XML and object for JSON.
declare function dumper:transform(
  $context as map:map,
  $params as map:map,
  $content as document-node()
) as document-node()
{
  if (xdmp:node-kind($content/node()) eq "element")
  then (: process as XML :)
  else (: process as JSON :)
};
As with read and write transforms, the content object is immutable in JavaScript, so you must call toObject to create a mutable copy:
var output = content.toObject(); ...modify output... return output;
The type of document you return must be consistent with the output-type (outputType) context value. If you do not return the same type of document as was passed to you, set the new output type on the context parameter.
To use a server transform function:
Attach the transform to your QueryDefinition by calling setResponseTransform(). For example:
QueryManager queryMgr = client.newQueryManager();
StringQueryDefinition query = queryMgr.newStringDefinition();
query.setCriteria("cat AND dog");
query.setResponseTransform(new ServerTransform("example"));
You are responsible for specifying a handle type capable of interpreting the results produced by your transform function. The SearchHandle implementation provided by the Java API only understands the search results structure that MarkLogic Server produces by default.
Use com.marklogic.client.query.QueryManager.suggest() to generate search term completion suggestions that match a wildcard terminated string. For example, if the user enters the text doc into a search box, you can use suggest() with doc as string criteria to retrieve a list of terms matching doc*, and then display them to the user. This service is analogous to calling the XQuery function search:suggest or the REST API method GET /version/suggest.
The following topics are covered:
Use the following procedure to retrieve search term completion suggestions:
Create and install persistent query options that include a default-suggestion-source or suggestion-source option. For details, see Search Term Completion Using search:suggest in the Search Developer's Guide and Creating Persistent Query Options From Raw JSON or XML.
Create a QueryManager. The manager deals with interaction between the client and the database.
QueryManager queryMgr = client.newQueryManager();
Create a SuggestDefinition object.
SuggestDefinition sd = queryMgr.newSuggestDefinition();
Set the string criteria for which to retrieve suggestions.
sd.setStringCriteria("doc");
You do not need to name query options if your default query options include suggestion-source or default-suggestion-source options. Otherwise, specify the name of previously installed query options that include suggestion-source and/or default-suggestion-source settings.
sd.setOptionsName("opt-suggest");
Optionally, limit the number of suggestions returned and add queries that filter the suggestions.
sd.setLimit(5);
sd.setQueryStrings("prefix:xdmp");
Retrieve the suggestions by calling suggest().
String[] results = queryMgr.suggest(sd);
This example walks you through configuring your database and REST instance to try retrieving search suggestions. The Documents database is assumed in this example, but you can use any database. This example has the following parts:
Run the following query in Query Console to load the sample data into your database, or use a DocumentManager to insert equivalent documents into the database. The example will retrieve suggestions for the <name/> element, with and without a constraint based on the <prefix/> element.
xdmp:document-insert("/suggest/load.xml",
  <function>
    <prefix>xdmp</prefix>
    <name>document-load</name>
  </function>
);
xdmp:document-insert("/suggest/insert.xml",
  <function>
    <prefix>xdmp</prefix>
    <name>document-insert</name>
  </function>
);
xdmp:document-insert("/suggest/query.xml",
  <function>
    <prefix>cts</prefix>
    <name>document-query</name>
  </function>
);
xdmp:document-insert("/suggest/search.xml",
  <function>
    <prefix>cts</prefix>
    <name>search</name>
  </function>
);
declareUpdate();
xdmp.documentInsert("/suggest/load.json",
  {function: {prefix: "xdmp", name: "document-load"} });
xdmp.documentInsert("/suggest/insert.json",
  {function: {prefix: "xdmp", name: "document-insert"} });
xdmp.documentInsert("/suggest/query.json",
  {function: {prefix: "cts", name: "document-query"} });
xdmp.documentInsert("/suggest/search.json",
  {function: {prefix: "cts", name: "search"} });
To create the range index used by the example, run the following query in Query Console, or use the Admin Interface to create an equivalent index on the name element. The following query assumes you are using the Documents database; modify as needed.
xquery version "1.0-ml";
import module namespace admin = "http://marklogic.com/xdmp/admin"
    at "/MarkLogic/admin.xqy";
admin:save-configuration(
  admin:database-add-range-element-index(
    admin:get-configuration(),
    xdmp:database("Documents"),
    admin:database-range-element-index(
      "string", "http://marklogic.com/example", "name",
      "http://marklogic.com/collation/", fn:false())
  )
);
declareUpdate();
const admin = require("/MarkLogic/admin.xqy");
admin.saveConfiguration(
  admin.databaseAddRangeElementIndex(
    admin.getConfiguration(),
    xdmp.database("Documents"),
    admin.databaseRangeElementIndex(
      "string",
      "http://marklogic.com/example",
      "name",
      "http://marklogic.com/collation/",
      fn.false())));
The example relies on the following query options. These options use the <name/> element as the default suggestion source. The value constraint named prefix is included only to illustrate how to use an additional query to filter suggestions. It is not required to get suggestions.
<options xmlns="http://marklogic.com/appservices/search">
  <default-suggestion-source>
    <range type="xs:string" facet="true">
      <element ns="http://marklogic.com/example" name="name"/>
    </range>
  </default-suggestion-source>
  <constraint name="prefix">
    <value>
      <element ns="http://marklogic.com/example" name="prefix"/>
    </value>
  </constraint>
</options>
{"options":{
"default-suggestion-source": {
"range": {
"facet": "true",
"element": {
"ns": "http://marklogic.com/example",
"name": "name"
}
}
},
"constraint": {
"name": "prefix",
"value": {
"element": {
"ns": "http://marklogic.com/example",
"name": "prefix"
}
}
}
}
}
Install the options under the name "opt-suggest" using QueryOptionsManager, as described in Creating Persistent Query Options From Raw JSON or XML. For example, to configure the options using a string literal, do the following:
String options =
"<options xmlns=\"http://marklogic.com/appservices/search\">" +
"<default-suggestion-source>" +
"<range type=\"xs:string\" facet=\"true\">" +
"<element ns=\"http://marklogic.com/example\" name=\"name\"/>" +
"</range>" +
"</default-suggestion-source>" +
"<constraint name=\"prefix\">" +
"<value>" +
"<element ns=\"http://marklogic.com/example\" name=\"prefix\"/>" +
"</value>" +
"</constraint>" +
"</options>";
// Or the JSON equivalent:
String optionsJson =
"{\"options\":{" +
" \"default-suggestion-source\": {" +
" \"range\": {" +
" \"facet\": \"true\"," +
" \"element\": {" +
" \"ns\": \"http://marklogic.com/example\"," +
" \"name\": \"name\"" +
" }" +
" }" +
" }," +
" \"constraint\": {" +
" \"name\": \"prefix\"," +
" \"value\": {" +
" \"element\": {" +
" \"ns\": \"http://marklogic.com/example\"," +
" \"name\": \"prefix\"" +
" }" +
" }" +
" }" +
"}" +
"}";
StringHandle handle =
new StringHandle(options).withFormat(Format.XML);
QueryManager queryMgr = client.newQueryManager();
QueryOptionsManager optMgr =
client.newServerConfigManager().newQueryOptionsManager();
optMgr.writeOptions("opt-suggest", handle);
To retrieve search suggestions, use QueryManager.suggest(). For example:
QueryManager queryMgr = client.newQueryManager();
SuggestDefinition sd = queryMgr.newSuggestDefinition();
sd.setStringCriteria("doc");
// Use the options installed above unless they are your default options
sd.setOptionsName("opt-suggest");
String[] results = queryMgr.suggest(sd);
The results contain the following suggestions derived from the sample input documents:
document-insert document-load document-query
Recall that the query options include a value constraint on the prefix element. You can use this constraint with the string query prefix:xdmp as a filter so that the operation returns only suggestions occurring in documents with a prefix value of xdmp. For example:
sd.setStringCriteria("doc");
sd.setQueryStrings("prefix:xdmp");
String[] results = queryMgr.suggest(sd);
Now, the results contain only document-insert and document-load. The function named document-query is excluded because the prefix value for this document is not xdmp.
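To make the filtering behavior concrete, the following sketch reproduces the two result sets locally in plain Java. It does not call MarkLogic and does not show how the server computes suggestions; it only applies a prefix match on name and an optional equality test on prefix to the four sample documents. SuggestSketch and its members are hypothetical names used for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Local illustration only; it does not call MarkLogic.
final class SuggestSketch {
    // The prefix and name values of the four sample documents.
    static final String[][] FUNCTIONS = {
        {"xdmp", "document-load"},
        {"xdmp", "document-insert"},
        {"cts", "document-query"},
        {"cts", "search"},
    };

    // Return the sorted names that start with criteria. If requiredPrefix
    // is not null, only consider documents with that prefix value.
    static List<String> suggest(String criteria, String requiredPrefix) {
        List<String> matches = new ArrayList<>();
        for (String[] fn : FUNCTIONS) {
            boolean prefixOk =
                requiredPrefix == null || requiredPrefix.equals(fn[0]);
            if (prefixOk && fn[1].startsWith(criteria)) {
                matches.add(fn[1]);
            }
        }
        Collections.sort(matches);
        return matches;
    }

    public static void main(String[] args) {
        // prints [document-insert, document-load, document-query]
        System.out.println(suggest("doc", null));
        // prints [document-insert, document-load]
        System.out.println(suggest("doc", "xdmp"));
    }
}
```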
For more details on using search suggestions, including performance recommendations and additional examples, see the following:
This section describes how to use the extract-document-data query option with QueryManager.search to extract a subset of each matching document and return it in your search results.
This section covers the following related topics:
You can also use this option with a multi-document read (DocumentManager.search) to retrieve the extracted subset instead of the complete document; for details, see Extracting a Portion of Each Matching Document.
By default, QueryManager.search returns a search result summary. When you perform a search that includes the extract-document-data query option, you can embed selected portions of each matching document in the search results and access them through the returned handle.
The projected contents are specified through absolute XPath expressions in extract-document-data and a selected attribute that specifies how to treat the selected content.
The extract-document-data option has the following general form. For details, see extract-document-data in the Search Developer's Guide and Extracting a Portion of Matching Documents in the Search Developer's Guide.
<extract-document-data selected="howMuchToInclude">
  <extract-path>/path/to/content</extract-path>
</extract-document-data>
{"extract-document-data":{
"selected": "howMuchToInclude",
"extract-path": "/path/to/content"
}
}
The path expression in extract-path is limited to the subset of XPath described in The extract-document-data Query Option in the XQuery and XSLT Reference Guide.
Use the selected attribute to control what to include in each result. This attribute can take on the following values: all, include, include-with-ancestors, and exclude. For details, see Search Developer's Guide.
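Because selected accepts only four values, it can be convenient to validate the value when you build the option text in code. The following is a minimal sketch of such a builder. ExtractOption is a hypothetical helper, not part of the Java API, and it assumes that the paths contain no characters that require XML escaping. The generated element carries no namespace declaration, so it is meant to be embedded in an options element that declares the search namespace.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the MarkLogic Java API: builds the XML
// text of an extract-document-data option.
final class ExtractOption {
    // The values the selected attribute can take, as listed above.
    static final List<String> SELECTED_VALUES = Arrays.asList(
        "all", "include", "include-with-ancestors", "exclude");

    // Assumption: the paths contain no characters that need XML escaping.
    static String xml(String selected, String... paths) {
        if (!SELECTED_VALUES.contains(selected)) {
            throw new IllegalArgumentException(
                "unsupported selected value: " + selected);
        }
        StringBuilder sb = new StringBuilder();
        sb.append("<extract-document-data selected=\"")
          .append(selected).append("\">");
        for (String path : paths) {
            sb.append("<extract-path>").append(path).append("</extract-path>");
        }
        sb.append("</extract-document-data>");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(xml("include", "/parent/body/target"));
    }
}
```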
The document projections created with extract-document-data are accessible in the following way. For a complete example, see Example: Extracting a Portion of Each Matching Document.
QueryManager qm = client.newQueryManager();
SearchHandle results = qm.search(query, new SearchHandle());
MatchDocumentSummary matches[] = results.getMatchResults();
for (MatchDocumentSummary match : matches) {
    ExtractedResult extracts = match.getExtracted();
    for (ExtractedItem extract: extracts) {
        // do something with each projection
    }
}
The ExtractedItem interface includes get and getAs methods for manipulating the extracted content through either a handle (ExtractedItem.get) or an object (ExtractedItem.getAs). For example, the following statement uses getAs to access the extracted content as a String:
String content = extract.getAs(String.class);
You can use ExtractedResult.getFormat with ExtractedItem.get to detect the type of data returned and access the content with a type-specific handle. For example:
for (MatchDocumentSummary match : matches) {
    ExtractedResult extracts = match.getExtracted();
    for (ExtractedItem extract: extracts) {
        if (extracts.getFormat() == Format.JSON) {
            JacksonHandle handle = extract.get(new JacksonHandle());
            // use the handle contents
        } else if (extracts.getFormat() == Format.XML) {
            DOMHandle handle = extract.get(new DOMHandle());
            // use the handle contents
        }
    }
}
The search returns an ExtractedItem for each match to a path in a given document when you set selected to include. For example, if your extract-document-data option includes multiple extraction paths, you can get an ExtractedItem for each path. Similarly, if a single document contains more than one match for a single path, you get an ExtractedItem for each match.
By contrast, when you set selected to all, include-with-ancestors, or exclude, you get a single ExtractedItem per document that contains a match.
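To make the one-item-per-match behavior of include concrete, the following sketch mimics it locally with the XPath support in the JDK. It does not call MarkLogic: it evaluates each path against an inline sample document and collects one entry per matched node, in path order, which may differ from the order in which the server returns items. IncludeSketch and matches are hypothetical names used for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Local illustration only; it does not call MarkLogic.
final class IncludeSketch {
    // Return the text of every node matched by any path, one list entry
    // per match, mirroring one extracted item per match.
    static List<String> matches(String xml, String... paths) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            List<String> items = new ArrayList<>();
            for (String path : paths) {
                NodeList nodes = (NodeList) xpath.evaluate(
                    path, doc, XPathConstants.NODESET);
                for (int i = 0; i < nodes.getLength(); i++) {
                    items.add(nodes.item(i).getTextContent());
                }
            }
            return items;
        } catch (Exception e) {
            throw new IllegalStateException("cannot evaluate paths", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<parent><a>foo</a>"
            + "<body><target>content2</target></body>"
            + "<b>bar</b></parent>";
        // One path with one match yields one item: [content2]
        System.out.println(matches(xml, "/parent/body/target"));
        // Two paths with one match each yield two items: [content2, bar]
        System.out.println(matches(xml, "/parent/body/target", "//b"));
    }
}
```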
Use the following technique to perform a search that includes extracted data in the search results. For a complete example of applying this pattern, see Example: Extracting a Portion of Each Matching Document.
Create a QueryManager. The manager deals with interaction between the client and the database.
QueryManager queryMgr = client.newQueryManager();
Create query options that include the extract-document-data option. Make the option available to your search by embedding it in the options of a combined query or installing it as part of a named persistent query options set. The following example uses the option in a String that can be used to construct a RawCombinedQueryDefinition:
:String rawQuery =
"<search xmlns=\"http://marklogic.com/appservices/search\">" +
" <query><directory-query><uri>/extract/</uri></directory-query></query>" +
" <options xmlns=\"http://marklogic.com/appservices/search\">" +
" <extract-document-data selected=\"include\">" +
" <extract-path>/parent/body/target</extract-path>" +
" </extract-document-data>" +
" </options>" +
"</search>";
//The equivalent in JSON:
String rawQueryJson =
"{\"search\":{" +
" \"query\": {" +
" \"directory-query\": {" +
" \"uri\": \"/extract/\"" +
" }" +
" }," +
" \"options\": {" +
" \"extract-document-data\": {" +
" \"selected\": \"include\"," +
" \"extract-path\": \"/parent/body/target\"" +
" }" +
" }" +
"}" +
"}";
For details, see Prototype a Query Using Query By Example or Using QueryOptionsManager To Delete, Write, and Read Options.
Create a query definition from the raw query:
StringHandle qh = new StringHandle(rawQuery).withFormat(Format.XML);
// Or with rawQueryJson:
// StringHandle qh = new StringHandle(rawQueryJson).withFormat(Format.JSON);
QueryManager qm = client.newQueryManager();
RawCombinedQueryDefinition query = qm.newRawCombinedQueryDefinition(qh);
Perform a search using the query and options that include extract-document-data.
SearchHandle results = qm.search(query, new SearchHandle());
Access the extracted content through the search handle:
MatchDocumentSummary matches[] = results.getMatchResults();
for (MatchDocumentSummary match : matches) {
    ExtractedResult extracts = match.getExtracted();
    for (ExtractedItem extract: extracts) {
        // do something with each projection
    }
}
If you do not use a SearchHandle to capture your search results, you must access the extracted content from the raw search results. For details on the layout, see Extracting a Portion of Matching Documents in the Search Developer's Guide.
This example demonstrates the use of the extract-document-data query option to embed a selected subset of data from matched documents in the search results. For an example of using extract-document-data as part of a multi-document read, see Extracting a Portion of Each Matching Document.
The example documents are inserted into the /extract/ directory in the database to make them easy to manage in the example. The example data includes one XML document and one JSON document, structured such that a single XPath expression can be used to demonstrate using extract-document-data on both types of document.
The example documents have the following contents. The XPath expression /parent/body/target selects the target property of the JSON document and the target element of the XML document.
JSON:
{"parent": {
  "a": "foo",
  "body": { "target": "content1" },
  "b": "bar"
}}
XML:
<parent>
  <a>foo</a>
  <body>
    <target>content2</target>
  </body>
  <b>bar</b>
</parent>
The example uses a RawCombinedQueryDefinition that contains a directory-query structured query and query options that include the extract-document-data option. The example creates the combined query from a string literal, but you can also use StructuredQueryBuilder to create the query portion of the combined query. For details, see Creating a Combined Query Using StructuredQueryBuilder.
The following example program inserts some documents into the database, performs a search that uses the extract-document-data query option, and then deletes the documents. Before running the example, modify the values of HOST, PORT, USER, and PASSWORD to match your environment.
package com.marklogic.examples;
import org.w3c.dom.Document;
import com.marklogic.client.document.DocumentWriteSet;
import com.marklogic.client.document.GenericDocumentManager;
import com.marklogic.client.io.*;
import com.marklogic.client.query.DeleteQueryDefinition;
import com.marklogic.client.query.ExtractedItem;
import com.marklogic.client.query.ExtractedResult;
import com.marklogic.client.query.MatchDocumentSummary;
import com.marklogic.client.query.QueryManager;
import com.marklogic.client.query.RawCombinedQueryDefinition;
import com.marklogic.client.DatabaseClientFactory;
import com.marklogic.client.DatabaseClient;
import com.marklogic.client.DatabaseClientFactory.DigestAuthContext;
public class ExtractExample {
// replace with your MarkLogic Server connection information
static String HOST = "localhost";
static int PORT = 8000;
static String USER = "username";
static String PASSWORD = "password";
static DatabaseClient client = DatabaseClientFactory.newClient(
HOST, PORT,
new DigestAuthContext(USER, PASSWORD));
static String DIR = "/extract/";
// Insert some example documents in the database.
public static void setup() {
StringHandle jsonContent = new StringHandle(
"{\"parent\": {" +
"\"a\": \"foo\"," +
"\"body\": {" +
"\"target\": \"content1\"" +
"}," +
"\"b\": \"bar\"" +
"}}").withFormat(Format.JSON);
StringHandle xmlContent = new StringHandle(
"<parent>" +
"<a>foo</a>" +
"<body><target>content2</target></body>" +
"<b>bar</b>" +
"</parent>").withFormat(Format.XML);
GenericDocumentManager gdm = client.newDocumentManager();
DocumentWriteSet batch = gdm.newWriteSet();
batch.add(DIR + "doc1.json", jsonContent);
batch.add(DIR + "doc2.xml", xmlContent);
gdm.write(batch);
}
// Perform a search with RawCombinedQueryDefinition that extracts
// just the "target" element or property of docs in DIR.
public static void example() {
String rawQuery =
"<search xmlns=\"http://marklogic.com/appservices/search\">" +
" <query>" +
" <directory-query><uri>" + DIR + "</uri></directory-query>" +
" </query>" +
" <options>" +
" <extract-document-data selected=\"include\">" +
" <extract-path>/parent/body/target</extract-path>" +
" </extract-document-data>" +
" </options>" +
"</search>";
//The equivalent in JSON:
String rawQueryJson =
"{\"search\":{" +
" \"query\": {" +
" \"directory-query\": {" +
" \"uri\": \"/extract/\"" +
" }" +
" }," +
" \"options\": {" +
" \"extract-document-data\": {" +
" \"selected\": \"include\"," +
" \"extract-path\": \"/parent/body/target\"" +
" }" +
" }" +
"}" +
"}";
StringHandle qh =
new StringHandle(rawQuery).withFormat(Format.XML);
// Or with rawQueryJson:
// new StringHandle(rawQueryJson).withFormat(Format.JSON);
QueryManager qm = client.newQueryManager();
RawCombinedQueryDefinition query =
qm.newRawCombinedQueryDefinition(qh);
SearchHandle results = qm.search(query, new SearchHandle());
System.out.println(
"Total matches: " + results.getTotalResults());
MatchDocumentSummary matches[] = results.getMatchResults();
for (MatchDocumentSummary match : matches) {
System.out.println("Extracted from uri: " + match.getUri());
ExtractedResult extracts = match.getExtracted();
for (ExtractedItem extract: extracts) {
System.out.println(" extracted content: " +
extract.getAs(String.class));
}
}
}
// Delete the documents inserted by setup.
public static void teardown() {
QueryManager qm = client.newQueryManager();
DeleteQueryDefinition byDir = qm.newDeleteDefinition();
byDir.setDirectory(DIR);
qm.delete(byDir);
}
public static void main(String[] args) {
setup();
example();
teardown();
}
}
When you run the example, you should see output similar to the following:
Total matches: 2
Extracted from uri: /extract/doc1.json
  extracted content: {"target":"content1"}
Extracted from uri: /extract/doc2.xml
  extracted content: <target xmlns="">content2</target>
If you add a second extract path, such as //b, then you get multiple extracted items for each matched document:
Extracted from uri: /extract/doc1.json
  extracted content: {"target":"content1"}
  extracted content: {"b":"bar"}
Extracted from uri: /extract/doc2.xml
  extracted content: <target xmlns="">content2</target>
  extracted content: <b xmlns="">bar</b>
By varying the value of the selected attribute of extract-document-data, you further control how much of the matching content is returned in each ExtractedItem. For example, if you modify the original example to set the value of selected to include-with-ancestors, then the output is similar to the following:
Extracted from uri: /extract/doc1.json
  extracted content: {"parent":{"body":{"target":"content1"}}}
Extracted from uri: /extract/doc2.xml
  extracted content: <parent xmlns=""><body><target>content2</target></body></parent>
For more examples of how selected affects the results, see Extracting a Portion of Matching Documents in the Search Developer's Guide.