To Find a Phrase within Documents in a Collection
So, using fromSearchDocs(), we could have found our full-time employees by diving straight into our documents.
An Optic query like this one finds a case-, punctuation-, and whitespace-insensitive phrase anywhere within a document of a specified collection:
op.fromSearchDocs(
    cts.andQuery([
      cts.collectionQuery('https://example.com/content/employee'),
      cts.wordQuery('full time')
    ]))
  .offsetLimit(0, 100)
  .result();
We used this query to retrieve a 3-column row sequence of all data from each employee-collection document containing some form of the phrase "full time", along with each document's URI and score, limited to 100 results:
The Data Accessor Function fromSearchDocs() pulls data from documents matching the cts.collectionQuery() parameter, narrowed down by the other parameters, into a row sequence with a unique row of these 3 columns for each matching document:
uri: Contains the document URI.
doc: Contains the document itself.
score: Contains the document’s search score, a measure of how relevant this result is with respect to other results. The higher the score, the higher the relevance.
The CTS Function cts.andQuery() returns the intersection of documents matching each of its CTS-type parameters:
Its first parameter, cts.collectionQuery(), finds all the data from documents in the specified collection: our employee collection.
Its second parameter, cts.wordQuery(), finds the word or phrase provided: full time.
So, cts.andQuery() returns all the data from documents from our employee collection that contain the phrase full time.
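The intersection that cts.andQuery() performs can be pictured with ordinary sets. The sketch below is only a model in plain JavaScript, not MarkLogic code: the two sets and the URIs in them are hypothetical stand-ins for the documents each sub-query would match.

```javascript
// Model only: each set holds the URIs that one sub-query is assumed to match.
const inEmployeeCollection = new Set(['/e/1.json', '/e/2.json', '/e/3.json']);
const containsFullTime = new Set(['/e/2.json', '/e/3.json', '/other/9.json']);

// An AND of the two queries keeps only the URIs present in both sets.
const matches = [...inEmployeeCollection].filter(uri => containsFullTime.has(uri));
// matches: ['/e/2.json', '/e/3.json']
```

A document outside the employee collection is dropped even if it contains the phrase, and an employee document without the phrase is dropped as well.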
The Operator Function offsetLimit() restricts the results returned. The first parameter specifies the number of results to skip; the second, the number of results to return. So, (0, 100) returns the first 100 results.
The Executor Function result() executes the query and returns the results as a row sequence.
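To make the skip-then-take behavior of offsetLimit() concrete, here is a minimal model in plain JavaScript. It is a sketch of the semantics described above, not the Optic implementation; the local offsetLimit helper and the 250 sample rows are hypothetical.

```javascript
// Plain-JavaScript model of offsetLimit(offset, limit); not the Optic API itself.
function offsetLimit(rows, offset, limit) {
  // Skip `offset` rows, then keep at most `limit` rows.
  return rows.slice(offset, offset + limit);
}

// Hypothetical stand-in for 250 matching rows.
const rows = Array.from({ length: 250 }, (_, i) => ({ uri: `/doc/${i}.json` }));

const firstPage = offsetLimit(rows, 0, 100);    // rows 0..99
const secondPage = offsetLimit(rows, 100, 100); // rows 100..199
const lastPage = offsetLimit(rows, 200, 100);   // rows 200..249 (only 50 remain)
```

Raising the first argument by the page size on each call pages through the results; the final page may hold fewer rows than the limit.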
Here is row 1 of the 100-row x 3-column result:
{
  "uri": "/data/employees/5899d871-1261-4057-ab3e-7fea1577ba61.json",
  "doc": {
    "GUID": "5899d871-1261-4057-ab3e-7fea1577ba61",
    "Gender": "male",
    "Title": "Mr.",
    "GivenName": "Scott",
    "MiddleInitial": "M",
    "Surname": "Schaaf",
    "StreetAddress": "3586 Paradise Lane",
    "City": "Pomona",
    "State": "CA",
    "ZipCode": "91766",
    "Country": "US",
    "EmailAddress": "ScottMSchaaf@rhyta.com",
    "TelephoneNumber": "909-629-3047",
    "TelephoneCountryCode": "1",
    "Birthday": "9/25/45",
    "NationalID": "561-42-6126",
    "BaseSalary": "79460",
    "Bonus": "7946",
    "Department": "Engineering",
    "Status": "Active - Regular Exempt (Full-time)", // Found!
    "ManagerGUID": "3ad0ffbc-3ade-4897-902b-718417a721f5",
    "point": {
      "lat": 34.014225,
      "long": -117.843894
    },
    "HiredDate": "2021-11-19"
  },
  "score": 2048
}
This query returned the first 100 results, as we specified in offsetLimit(). Only one result is returned per document, no matter how many times the phrase occurs within that document.
A common practice is to add orderBy(op.desc('score')) to order the results by score, from most to least relevant.
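As a sketch of how the ordering step might be added to the query from this section, the function below places orderBy() before offsetLimit(), so that the 100 rows kept are the 100 highest-scoring ones. The wrapper function fullTimeEmployeesByScore and its op and cts parameters are illustrative assumptions; in MarkLogic server-side JavaScript, op would normally come from require('/MarkLogic/optic') and cts is a built-in global.

```javascript
// Sketch for MarkLogic server-side JavaScript. `op` and `cts` are passed in as
// parameters (an assumption made for illustration) rather than taken from globals.
function fullTimeEmployeesByScore(op, cts) {
  return op.fromSearchDocs(
      cts.andQuery([
        cts.collectionQuery('https://example.com/content/employee'),
        cts.wordQuery('full time')
      ]))
    .orderBy(op.desc('score')) // most relevant first
    .offsetLimit(0, 100)       // then keep the top 100
    .result();
}
```

If offsetLimit() came first, the query would sort only whichever 100 rows happened to be returned, rather than selecting the 100 most relevant.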