To Find Phrases Near Each Other
We want to find only active, full-time employees. We know “active” and “full time” are near each other in the status property of the employee collection documents. But a simple AND-type function would give us false matches wherever “active” was not near “full time”. We need to eliminate that issue.
An Optic query like this one finds two specified phrases in any order within 10 words of each other anywhere within a document in a specified collection. It is similar to the previous query except that it uses cts.nearQuery(), which takes the two phrases as separate items in an array, instead of cts.wordQuery(), which takes its one phrase as a single parameter:
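// Load the Optic API; cts is a built-in global in Server-Side JavaScript.
const op = require('/MarkLogic/optic');

op.fromSearchDocs(
    cts.andQuery([
      cts.collectionQuery('https://example.com/content/employee'),
      cts.nearQuery(['active', 'full time'])
    ]))
  .offsetLimit(0, 100)
  .result();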
We used this query to retrieve a 3-column row sequence of all data from each employee-collection document containing some form of the phrase "full time" 10 words or fewer from the word "active", along with each document's URI and score, limited to 100 results:
The Data Accessor Function fromSearchDocs() pulls data from the documents that match the cts.collectionQuery() parameter, narrowed down by the other parameters, into a row sequence with a unique row of these 3 columns for each matching document:
- uri: Contains the document URI.
- doc: Contains the document itself.
- score: Contains the document’s search score, a measure of how relevant this result is with respect to other results. The higher the score, the higher the relevance.
The CTS Function cts.andQuery() returns the intersection of the matches that each of its parameter functions finds:
- cts.collectionQuery() finds all the data from documents in the specified collection, our employee collection.
- cts.nearQuery() finds the first word or phrase in its list within 10 words before or after the second word or phrase in its list:
  - Each phrase has the same defaults and settings as it would in cts.wordQuery().
  - By default, “near” is within 10 words, and the phrases can be found in either order.
  - Other cts.nearQuery() parameters provide other possibilities.
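As one illustration of those other parameters, the sketch below passes an explicit distance and the "ordered" option to cts.nearQuery(). The function name orderedNearPlan and the distance of 5 are hypothetical choices made for this example. The plan is wrapped in a function that receives op and cts as arguments only so that the snippet is self-contained; in Server-Side JavaScript, cts is a built-in global and op comes from require('/MarkLogic/optic').

```javascript
// Hypothetical helper: builds (but does not execute) an Optic plan that
// requires "active" to appear before "full time", at most 5 words apart.
function orderedNearPlan(op, cts) {
  return op.fromSearchDocs(
      cts.andQuery([
        cts.collectionQuery('https://example.com/content/employee'),
        // Second argument: distance in words (5 is an illustrative value).
        // Third argument: options; 'ordered' enforces the listed phrase order.
        cts.nearQuery(['active', 'full time'], 5, ['ordered'])
      ]))
    .offsetLimit(0, 100);
}

// In MarkLogic, this would be run as:
//   orderedNearPlan(require('/MarkLogic/optic'), cts).result();
```

With 'ordered' in place, a status value in which “Full-time” came before “Active” would presumably no longer match, which is the opposite of the default behavior.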
The Operator Function offsetLimit() restricts the results returned. The first parameter specifies the number of results to skip; the second, the number of results to return. So, (0, 100) returns the first 100 results.
The Executor Function result() executes the query and returns the results as a row sequence.
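The skip-then-take behavior of offsetLimit() can be pictured with a plain JavaScript array. This is only a model of the paging arithmetic, not Optic code: the rows array and the page() helper are hypothetical stand-ins for a row sequence and for offsetLimit().

```javascript
// Hypothetical stand-in for a row sequence: 250 numbered rows.
const rows = Array.from({ length: 250 }, (_, i) => ({ row: i + 1 }));

// Model of offsetLimit(offset, limit): skip `offset` rows,
// then keep at most `limit` of the rows that remain.
function page(allRows, offset, limit) {
  return allRows.slice(offset, offset + limit);
}

const first = page(rows, 0, 100);    // rows 1 to 100
const second = page(rows, 100, 100); // rows 101 to 200
const last = page(rows, 200, 100);   // rows 201 to 250; only 50 remain
```

Under this model, stepping the first parameter by the value of the second walks through the results one page at a time.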
Here is row 1 of the 3-column x 100-row result:
{
  "uri": "/data/employees/cb31aeaa-e708-4034-a77c-ceead02ca644.json",
  "doc": {
    "GUID": "cb31aeaa-e708-4034-a77c-ceead02ca644",
    "Gender": "female",
    "Title": "Mrs.",
    "GivenName": "Elaine",
    "MiddleInitial": "D",
    "Surname": "Perrone",
    "StreetAddress": "1223 Frederick Street",
    "City": "Sacramento",
    "State": "CA",
    "ZipCode": "94260",
    "Country": "US",
    "EmailAddress": "ElaineDPerrone@teleworm.us",
    "TelephoneNumber": "916-230-4803",
    "TelephoneCountryCode": "1",
    "Birthday": "4/1/44",
    "NationalID": "626-03-3604",
    "BaseSalary": "44042",
    "Bonus": "4404",
    "Department": "Engineering",
    "Status": "Active - Regular Exempt (Full-time)", // Found!
    "ManagerGUID": "3ad0ffbc-3ade-4897-902b-718417a721f5",
    "point": {
      "lat": 38.574274,
      "long": -121.374583
    },
    "HiredDate": "2018-09-12"
  },
  "score": 4096
}
- This query returned the first 100 results, as we specified in offsetLimit().
- Using the completely lowercase parameter active in cts.nearQuery() defaulted the search to case insensitive, so it also found the title-cased instance of “Active”.
- With no parameter specifying order in cts.nearQuery(), the search defaults to unordered, so it would also have matched if “Full-time” had come before “Active”.
- Only one result is returned per document, no matter how many times the phrases occur within that document.
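To make these observations concrete, here is a simplified model of proximity matching in plain JavaScript. It is not how MarkLogic implements cts.nearQuery(): the tokenizer, the way the gap between phrases is counted, and the helper names tokenize, phrasePositions, and isNear are all assumptions made for illustration. It only mirrors the three behaviors described above: case-insensitive matching, either order, and a default distance of 10 words.

```javascript
// Split text into lowercase word tokens, so "Full-time" becomes ["full", "time"].
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Return the start index of every occurrence of phraseTokens within tokens.
function phrasePositions(tokens, phraseTokens) {
  const positions = [];
  for (let i = 0; i + phraseTokens.length <= tokens.length; i++) {
    if (phraseTokens.every((t, j) => tokens[i + j] === t)) positions.push(i);
  }
  return positions;
}

// True when some occurrence of phraseA and some occurrence of phraseB lie
// within `distance` words of each other, in either order.
function isNear(text, phraseA, phraseB, distance = 10) {
  const tokens = tokenize(text);
  const a = tokenize(phraseA);
  const b = tokenize(phraseB);
  const posA = phrasePositions(tokens, a);
  const posB = phrasePositions(tokens, b);
  return posA.some(pa => posB.some(pb => {
    // Gap: words from the last word of the earlier phrase
    // to the first word of the later phrase (an assumed measure).
    const gap = pa <= pb ? pb - (pa + a.length - 1) : pa - (pb + b.length - 1);
    return gap <= distance;
  }));
}

const status = 'Active - Regular Exempt (Full-time)';
console.log(isNear(status, 'active', 'full time'));              // true
console.log(isNear('Full-time, Active', 'active', 'full time')); // true (order ignored)
console.log(isNear(status, 'active', 'full time', 2));           // false (gap is 3)
```

Under this model, the status value in row 1 matches because only two words separate “Active” from “Full-time”, well inside the default distance.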