This chapter describes the new features in MarkLogic 9.
MarkLogic 9 enables you to define a relational lens over your document data, so you can query parts of your data using SQL or the new Optic API. Templates let you specify which parts of documents make up rows in a view. You can also use templates to define a semantic lens, specifying which values from a document make up triples in the triple index.
For more details, see Template Driven Extraction (TDE) in the Application Developer's Guide.
MarkLogic is a NoSQL database, where the unit of storage and indexing is a document. The document model makes it possible to express rich, related, varying structures -- anything from a scientific journal article to a complex derivative trade. Many users want to view parts of these rich structures as though they were simple tables -- to see the data in those documents through a relational lens. While it is possible to create and query SQL views with MarkLogic 8, MarkLogic 9 further enhances the SQL capabilities with a number of new features, including Templates, the new Optic API, and an updated ODBC driver.
In MarkLogic 9, you can define a template that specifies which parts of the document make up a row in a view, and then query that view from a server-side program with xdmp:sql() or xdmp.sql(), or via ODBC. You can also query that view server-side from the new MarkLogic Optic API -- a fluent JavaScript interface with the ability to perform joins and aggregates on views over documents.
The Optic API blends the relational world with rich NoSQL document features by providing the capability to perform joins and aggregates over documents. One of the enabling features, Template Driven Extraction, makes it possible to create a relational lens over documents stored in MarkLogic by using templates to specify the parts of a document that make up a row in a view. You can access that data using the Optic API in XQuery, JavaScript, or Java.
For more details, see Optic API for Multi-Model Data Access in the Application Developer's Guide.
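The relational lens idea can be illustrated with a small, self-contained JavaScript sketch. It is not TDE template syntax and does not call the Optic API. The `template` shape, the `valueAt`, `extractRows`, and `sumBy` helpers, and the sample documents are all hypothetical and exist only to show how parts of rich documents can be projected into rows and then aggregated.

```javascript
// Hypothetical "template": each column names a path into the document.
// This models the idea only; it is not MarkLogic TDE syntax.
const template = {
  view: 'trades',
  columns: { id: ['trade', 'id'], desk: ['trade', 'desk'], amount: ['trade', 'amount'] },
};

// Walk a path of property names; returns undefined when a step is missing.
function valueAt(doc, path) {
  return path.reduce((node, key) => (node == null ? undefined : node[key]), doc);
}

// Project one row per document, keeping only the templated columns.
function extractRows(docs, tpl) {
  return docs.map((doc) => {
    const row = {};
    for (const [column, path] of Object.entries(tpl.columns)) {
      row[column] = valueAt(doc, path);
    }
    return row;
  });
}

// Group rows by one column and sum another (a simple aggregate).
function sumBy(rows, groupColumn, valueColumn) {
  const totals = {};
  for (const row of rows) {
    totals[row[groupColumn]] = (totals[row[groupColumn]] || 0) + row[valueColumn];
  }
  return totals;
}

// Sample documents with varying structure (illustrative data only).
const docs = [
  { trade: { id: 1, desk: 'rates', amount: 100, notes: { text: 'rich structure' } } },
  { trade: { id: 2, desk: 'fx', amount: 250 } },
  { trade: { id: 3, desk: 'rates', amount: 50 } },
];

const rows = extractRows(docs, template);
const totals = sumBy(rows, 'desk', 'amount');
```

The documents keep their full structure; only the templated columns appear in `rows`, which is what makes a simple aggregate such as `totals` possible.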
MarkLogic 9 adds more flexibility to the Tiered Storage feature. First, documents can now be matched to a tier based on a query, in addition to the existing approach that is based on a range index and a pair of boundary values. Any query, however complex, can therefore be used to create a policy that matches data or metadata about documents in order to decide in which tier to save or move a document. Second, tiered storage queries can match dates using age. For instance, you can write a query for a high-end tier that matches documents created in the last 30 days, and another for a mid-end tier that matches documents created between 1 and 3 years ago, rather than relying on the fixed pair of boundary values in the existing approach. Third, super databases are no longer needed to create tiers. This simplifies life for developers, who were previously required to understand the tiering policy in order to write high-performance queries or updates.
MarkLogic 9 also improves the performance of the Tiered Storage feature. Queries that use elements that are part of the tiering policy are now optimized to run against the nodes that are most likely to hold that data. If you have a query that uses the same element as the tiering policy, for instance an element range query on create date > 30 days, the query engine directs the search only to the data nodes that store that data, in this example the high-end tier nodes.
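As a minimal sketch of age-based tier matching, the following JavaScript assigns a document to a tier from its creation date. It does not use any MarkLogic API. The `tierPolicies` structure and the `tierFor` function are hypothetical, and the age windows simply mirror the 30-day and 1-to-3-year example above.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical policies, checked in order. Ages are in days.
const tierPolicies = [
  { tier: 'high-end', minAgeDays: 0, maxAgeDays: 30 },
  { tier: 'mid-end', minAgeDays: 365, maxAgeDays: 3 * 365 },
];

// Return the first tier whose age window contains the document, or null.
function tierFor(createdAt, now, policies = tierPolicies) {
  const ageDays = (now.getTime() - createdAt.getTime()) / DAY_MS;
  const match = policies.find((p) => ageDays >= p.minAgeDays && ageDays <= p.maxAgeDays);
  return match ? match.tier : null;
}

const now = new Date('2017-06-01T00:00:00Z');
const recentTier = tierFor(new Date('2017-05-20T00:00:00Z'), now);    // 12 days old
const olderTier = tierFor(new Date('2015-06-01T00:00:00Z'), now);     // 731 days old
const unmatchedTier = tierFor(new Date('2017-01-01T00:00:00Z'), now); // 151 days old
```

Because the age is computed relative to `now`, the same document falls into a different tier as time passes, which a fixed pair of boundary values cannot express.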
MarkLogic 9 introduces the ability to encrypt data at rest - data that is on media (on disk or in the cloud), as opposed to data that is being used in a process. Encryption can be applied to newly created files, configuration files, or log files. Existing data files can be encrypted by triggering a merge or re-index of the data.
For more information about using Encryption at Rest, see Encryption at Rest in the MarkLogic Security Guide.
If a node in your cluster is offline for any reason, wait until the host comes back online to make any changes to your encryption at rest settings. Do not change your encryption settings while a host is offline.
Element Level Security is an addition to the MarkLogic security model that allows you to specify more complex security rules for specific elements within documents. Element Level Security can be applied to either JSON or XML documents. Users without the appropriate permissions cannot view the secured element or JSON property.
Element Level Security can be used in addition to the existing document level security and compartment security. For details about using Element Level Security, see Element Level Security in the Security Guide.
MarkLogic 9 introduces a new library module and mlcp command line option that enable you to redact sensitive data when extracting documents from the database. Redaction enables you to obscure or hide portions of a document using a rule-based read transformation.
Redaction is available through the following interfaces:
The mlcp -redaction command line option.
For more details, see Redacting Document Content in the Application Developer's Guide and Redacting Content During Export or Copy Operations in the mlcp User Guide.
MarkLogic 9 includes the following enhancements to geospatial operations:
You can take advantage of these enhancements using all of the MarkLogic geospatial and search related interfaces, including cts:search, search:search, JSearch, and the REST, Java, and Node.js Client APIs.
The performance of geospatial region queries for geographic coordinate systems (WGS84 and ETRS89) has not yet been fully optimized. Performance will be improved in a future release.
For details, see Geospatial Search Applications in the Search Developer's Guide.
The Entity Services API enables you to quickly and easily model your business entities and relationships between them and then generate code and configuration artifacts that provide a framework for an entity based application.
The artifacts you generate with the Entity Services API make it easier to create entities from raw source data, query entities and the relationships between them, and manage your entities and models.
For details, see the Entity Services Developer's Guide.
In MarkLogic 9 the default tokenization and stemming code has been changed for all languages (except English tokenization). Some tokenization and stemming behavior will change between MarkLogic 8 and MarkLogic 9. We expect that in most cases results will be better in MarkLogic 9.
You can now customize stemming and tokenization for a language by selecting one of several built-in stemmer and lexer plug-ins, or by creating your own stemmer and lexer plug-ins in C++.
You can also configure a custom stemming and/or tokenization dictionary for any language. Previously, you could not distinguish between stemming and tokenization dictionaries, and you could not install a dictionary for an unsupported language.
For more details, see the following topics:
MarkLogic 9 adds three new cts:query constructor options for controlling wildcard expansion.
When evaluating a wildcarded search term, MarkLogic must sometimes make a tradeoff between speed and accuracy. MarkLogic attempts to expand a wildcard term from the lexicon. To prevent this expansion from taking too long, there is a limit on how many words MarkLogic will extract from the lexicon. When the limit is reached, MarkLogic falls back on alternative strategies that may not be accurate for all possible terms containing wildcard characters.
For many searches, these inaccuracies are an acceptable tradeoff for the fast response of an unfiltered search. Where the inaccuracies are not acceptable, you can use the following new options to change the default behavior: -lexicon-expansion-limit=N, -limit-check, and -no-limit-check.
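The idea of a capped lexicon expansion can be sketched in plain JavaScript. This models only the limit itself. It does not reproduce MarkLogic's fallback strategies or the exact semantics of the options above. The `wildcardToRegExp` and `expandWildcard` helpers and the sample word list are hypothetical.

```javascript
// Convert a wildcard pattern (* = any run, ? = one character) to a RegExp.
function wildcardToRegExp(pattern) {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, '\\$&');
  return new RegExp('^' + escaped.replace(/\*/g, '.*').replace(/\?/g, '.') + '$');
}

// Expand a wildcard against a word list, giving up once the limit is exceeded.
function expandWildcard(lexicon, pattern, limit) {
  const re = wildcardToRegExp(pattern);
  const words = [];
  for (const word of lexicon) {
    if (!re.test(word)) continue;
    if (words.length === limit) {
      // Too many matches: signal that the caller must use a fallback strategy.
      return { limitReached: true, words: [] };
    }
    words.push(word);
  }
  return { limitReached: false, words };
}

const lexicon = ['mark', 'marker', 'market', 'marklogic', 'merge'];
const small = expandWildcard(lexicon, 'mark*', 10); // four matches, under the limit
const capped = expandWildcard(lexicon, 'mar*', 2);  // a third match exceeds the limit
```

A caller that sees `limitReached` set to true has to choose between a faster, possibly inaccurate strategy and a slower, exact one, which is the tradeoff described above.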
For more details, see the function reference documentation for the following functions:
The following capabilities have been added to the mlcp command line tool:
In addition the mlcp source code is now available for download through the GitHub marklogic-contentpump project. For details, see Accessing the mlcp Source Code in the mlcp User Guide.
The mlcp command line tool can now connect to MarkLogic via an SSL (Secure Sockets Layer) connection. For details, see Connecting to MarkLogic Using SSL in the mlcp User Guide.
You can now specify multiple hosts for mlcp to connect to during import, export, and copy jobs. Used by itself, this feature enables mlcp to fall back to an alternative host if the initial host is not available.
You can also use this capability in conjunction with the new -restrict_hosts option to prevent mlcp from connecting to any hosts except the ones on the initial host list.
For more details, see Controlling How mlcp Connects to MarkLogic in the mlcp User Guide.
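The fallback and restriction behavior can be pictured with a short JavaScript model. This is not mlcp code. The `pickHost` and `allowedHosts` functions, the `tryConnect` callback, and the notion of "discovered" hosts are assumptions made for illustration only.

```javascript
// Try each host in order and return the first one that accepts a connection.
// `tryConnect` is a hypothetical callback standing in for a real network call.
function pickHost(hosts, tryConnect) {
  for (const host of hosts) {
    if (tryConnect(host)) return host;
  }
  throw new Error('No host available: ' + hosts.join(', '));
}

// Model of restricting connections to the initial host list: hosts that are
// discovered later but are not on the list are ignored when `restrict` is true.
function allowedHosts(initialHosts, discoveredHosts, restrict) {
  if (!restrict) return [...new Set([...initialHosts, ...discoveredHosts])];
  return discoveredHosts.filter((h) => initialHosts.includes(h));
}

const initial = ['host-a', 'host-b'];
const up = new Set(['host-b', 'host-c']); // host-a is offline in this example
const chosen = pickHost(initial, (h) => up.has(h));
const restricted = allowedHosts(initial, ['host-a', 'host-b', 'host-c'], true);
const unrestricted = allowedHosts(initial, ['host-a', 'host-b', 'host-c'], false);
```

In this model, `chosen` falls back to the second host because the first is offline, and the restricted list never grows beyond the hosts named up front.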
MarkLogic 9 introduces the ability to redact sensitive data when extracting documents from the database with the export or copy commands by specifying redaction rule collections in the -redaction option.
For more details, see Redacting Content During Export or Copy Operations in the mlcp User Guide and Redacting Document Content in the Application Developer's Guide.
Previously, the batch size was always one when applying a server-side transformation during an mlcp import or copy job. This restriction has been lifted in MarkLogic 9. All documents in a batch are now transformed and inserted into the database as a single statement, greatly improving performance when using a transformation.
Collections, permissions, document quality, and temporal collection specified by the client are now available to a transformation function via the context parameter. In addition, a transformation function can set collections, permissions, quality, temporal collection, and values metadata for its output document(s).
For details, see Transforming Content During Ingestion in the mlcp User Guide.
The following capabilities have been added to the Java Client API:
The new Data Movement SDK feature of the Java Client API enables you to move large amounts of data into, out of, or within a MarkLogic cluster asynchronously. These interfaces leverage your entire cluster for scale-out performance. The feature supports streaming, asynchronous, long running data movement tasks.
For details, see the com.marklogic.client.datamovement package in the Java Client API Documentation and Asynchronous Multi-Document Operations in the Java Application Developer's Guide.
You can now pass a temporal document URI when creating, updating, and patching temporal documents. This is the logical document URI in a temporal collection. For more details, see the methods of com.marklogic.client.bitemporal.TemporalDocumentManager that accept a temporalDocumentURI parameter and Working with Temporal Documents in the Developing Applications With the Java Client API.
If you have sufficient privileges, you can now wipe (completely remove) a temporal document using TemporalDocumentManager.wipe.
You can use the Java Client API to protect a temporal document against update, deletion, or wipe for a specified time. See TemporalDocumentManager.protect.
You can now use Kerberos or certificate-based authentication to authenticate with MarkLogic. You can also connect to MarkLogic via an SSL (Secure Sockets Layer) connection.
For more details, see Authentication and Connection Security in the Developing Applications With the Java Client API.
MarkLogic 9 adds the ability to associate key-value metadata with a document. You can use the Java Client API to add, update, delete, and search values metadata. For more details, see Values Metadata in the Developing Applications With the Java Client API.
You can use the Java Client API to execute a plan produced by the Optic API and receive row-based results in your Java application. For details, see Optic Java API for Relational Operations in the Developing Applications With the Java Client API.
MarkLogic 9 includes enhancements to geospatial search such as region searches, double precision coordinates, and additional coordinate systems. These features are exposed through the Java Client API. For more details, see Geospatial Enhancements, Creating Region Queries Using the Client APIs in the Search Developer's Guide, and StructuredQueryBuilder.geospatial in the Java Client API Documentation.
The following capabilities have been added to the Node.js Client API:
You can now use Kerberos or certificate-based authentication to authenticate with MarkLogic. You can also connect to MarkLogic via an SSL (Secure Sockets Layer) connection.
For more details, see Authentication and Connection Security in the Node.js Application Developer's Guide.
You can now pass a temporal document URI when creating, updating, and patching temporal documents. This is the logical document URI in a temporal collection. For more details, see the methods in the documents namespace that accept a temporalDocument parameter and Working with Temporal Documents in the Node.js Application Developer's Guide.
If you have sufficient privileges, you can now wipe (completely remove) a temporal document using TemporalDocumentManager.wipe.
MarkLogic 9 adds the ability to associate key-value metadata with a document. You can use the Node.js Client API to add, update, delete, and search values metadata. For more details, see the metadataValues metadata category in Working with Metadata in the Node.js Application Developer's Guide.
MarkLogic 9 includes enhancements to geospatial search such as region searches, double precision coordinates, and additional coordinate systems. These features are exposed through the Node.js Client API. For more details, see Geospatial Enhancements, Creating Region Queries Using the Client APIs in the Search Developer's Guide, and queryBuilder.geospatialRegion in the Node.js Client API Reference.
The queryBuilder.near method now accepts a minimum distance for near queries. Previously, you could only specify a maximum distance. For details, see the Node.js Client API Reference and cts:near-query.
The following capabilities have been added to the REST Client API:
You can now pass a temporal document URI when creating, updating, and patching temporal documents. This is the logical document URI in a temporal collection. For more details, see Working with Temporal Documents in the REST Application Developer's Guide and the /v1/documents methods that accept a temporal-document parameter.
If you have sufficient privileges, you can now wipe (completely remove) a temporal document using DELETE /v1/documents?result=wiped. For more details, see DELETE /v1/documents in the MarkLogic REST API Reference.
The new POST /v1/documents/protection method enables you to protect a temporal document against update, deletion, or wipe for a specified time period. For more details, see the MarkLogic REST API Reference.
The new /rows service enables you to execute a plan produced by the Optic API and receive results in a variety of row-based formats. For more details, see:
Most read and search operations now accept a timestamp request parameter that enables you to make successive requests that are evaluated against the state of the database at a single point in time. The timestamp can be obtained from the ML-Effective-Timestamp header returned by the same methods.
For more details, see Performing Point-in-Time Operations in the REST Application Developer's Guide.
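The point-in-time flow described above can be sketched as plain request plumbing in JavaScript, without any network calls. The `effectiveTimestamp` and `withTimestamp` helpers, the header value, and the example search path are illustrative assumptions. Only the ML-Effective-Timestamp header and the timestamp parameter come from the description above.

```javascript
// Read the timestamp header from a response's headers (case-insensitive lookup).
function effectiveTimestamp(headers) {
  for (const [name, value] of Object.entries(headers)) {
    if (name.toLowerCase() === 'ml-effective-timestamp') return value;
  }
  return null;
}

// Append the timestamp request parameter to a follow-up request path.
function withTimestamp(path, timestamp) {
  const separator = path.includes('?') ? '&' : '?';
  return path + separator + 'timestamp=' + encodeURIComponent(timestamp);
}

// Hypothetical header value; a real value comes from the server response.
const firstResponseHeaders = { 'ML-Effective-Timestamp': '14901234567890' };
const ts = effectiveTimestamp(firstResponseHeaders);
const nextPath = withTimestamp('/v1/search?q=cat', ts);
```

Reusing the same timestamp on every follow-up request is what keeps a sequence of reads consistent with one another, even if the database changes in between.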
You can use a serialized cts:query in place of a structured query in GET /v1/search and POST /v1/search. For more details, see Searching With cts:query in the REST Application Developer's Guide and the MarkLogic REST API Reference.
MarkLogic 9 adds the ability to associate key-value metadata with a document. You can use the REST Client API to add, update, delete, and search values metadata. You can use the new metadata category metadata-values when working with this type of metadata.
For more details, see Working with Metadata in the REST Application Developer's Guide.
Telemetry is part of our continuous effort to provide better and faster support by automating the data collection process required on most support tickets. When Telemetry is enabled, it collects, encrypts, and sends diagnostic and anonymized system-level information about a MarkLogic cluster to a secure MarkLogic destination.
The Telemetry feature collects only system level information, and sends it to a protected and secure location where it can only be accessed by the MarkLogic technical teams, to be used to facilitate troubleshooting and monitor performance. Telemetry does not collect any personally identifiable information, user data or application logs. See Telemetry in the Monitoring MarkLogic Guide for more information.
MarkLogic now supports selected features from the XQuery 3.0 and XQuery 3.1 specifications. These features are only available when using the 1.0-ml XQuery dialect.
For more details, see XQuery 3.x Features in the XQuery and XSLT Reference Guide.
MarkLogic 9 includes the following new features and enhancements in the Query Console application. For more details, see the Query Console User Guide.
Administrators can now customize the display environment of MarkLogic applications such as Query Console and the Monitoring Dashboard in the following ways:
For more details, see Configuring a MarkLogic Application Message and Banner in the Administrator's Guide.
A rolling upgrade is one way to address the need for highly available clusters under heavy transaction loads to upgrade to a newer version of MarkLogic in a seamless manner. Hosts in a cluster are upgraded one by one, without incurring any downtime in availability or interruption of transactions.
For more details, see Rolling Upgrades in the Administrator's Guide.
The MarkLogic bi-temporal data management feature has been enhanced to provide the option to store valid and system axes and archival information outside of temporal documents in metadata, rather than directly in the documents. Storing the axes times in metadata enables MarkLogic to update the axes timestamps without changing the documents and invoking reindexing.
The ability to update nodes in temporal documents has been added.
Temporal documents can be protected from certain temporal operations, such as update, delete or wipe for a specified period of time and then automatically archived to a WORM (Write Once Read Many) device.
MarkLogic also supports uni-temporal documents that have only a system axis.
Secure credentials enable a security administrator to manage credentials and make them available to less privileged users for authentication to other systems, without giving those users access to the credentials themselves.
Certificate-based authentication requires internal and external users and HTTPS clients to authenticate themselves to MarkLogic via a client certificate, either in addition to or instead of a password.
In MarkLogic 9, a number of REST Management APIs have been added. A few of the APIs have been updated as well. This table summarizes the changes.
MarkLogic version 9 supports IPv6 networking, including client connectivity to MarkLogic endpoints. All MarkLogic features and capabilities work seamlessly with IPv6 in addition to working with the IPv4 protocol.
The IPv6 protocol provides virtually limitless address space for use by government, universities, and private sector companies in accessing Internet services.
MarkLogic version 9.0-2 contains the following new features:
MarkLogic 9.0-2 contains the following enhancements related to geospatial region support:
You can specify additional units (feet, kilometers, meters) when creating a geospatial region index. Previously, you could only specify miles. The units are only meaningful when using a geodetic coordinate system such as WGS84.
For more details, see admin:database-geospatial-region-path-index in the MarkLogic XQuery and XSLT Function Reference.
You can now use the crosses, equals, and touches operators for geospatial region queries and other region computation functions that previously accepted region comparison operators.
For example, you can use crosses as an operator name in functions such as cts:geospatial-region-query (XQuery) or cts.geospatialRegionQuery (JavaScript). You can also use DE9IM_CROSSES in query text passed to cts:parse (XQuery) or cts.parse (JavaScript).
The following new functions are available for comparing geospatial region values:
For more details, see the function reference documentation in the MarkLogic XQuery and XSLT Function Reference or the MarkLogic Server-Side JavaScript Function Reference.
Tolerance is the largest allowable variation in geometry comparisons. If the distance between two points is less than tolerance, then the two points are considered equal. For more details, see Understanding Tolerance in the Search Developer's Guide.
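The definition of tolerance can be made concrete with a small JavaScript sketch. It uses planar (Euclidean) distance purely for illustration. MarkLogic's own geometry comparisons, especially in geodetic coordinate systems, may compute distance differently. The `distance` and `pointsEqual` helpers are hypothetical.

```javascript
// Euclidean distance between two points given as [x, y].
function distance(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1]);
}

// Two points are treated as equal when they are closer than the tolerance.
function pointsEqual(a, b, tolerance) {
  return distance(a, b) < tolerance;
}

const p = [10.0, 20.0];
const q = [10.003, 20.004]; // about 0.005 away from p
const equalLoose = pointsEqual(p, q, 0.01);  // distance is below the tolerance
const equalTight = pointsEqual(p, q, 0.001); // distance is above the tolerance
```

The same pair of points compares as equal or unequal depending only on the tolerance, which is why the tolerance option matters for functions that compare geometries.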
The following functions now support a tolerance option. For more details, see the function reference documentation.
You can use a serialized cts:query in place of a structured query in the following methods:
On the GET methods, specify the cts:query as the value of the structuredQuery request parameter. On the POST methods, put the cts:query in the POST body.
In addition, you can include a cts:query serialized as XML or JSON in place of a structured query in a combined query. This is applicable to any method that accepts a combined query input, such as a POST request to /v1/search or /v1/values/{name}.
For more details, see Searching With cts:query in the REST Application Developer's Guide and the MarkLogic REST API Reference.
You can now include a Query By Example (QBE) in a combined query when using the REST Client API. For more details, see Specifying Dynamic Query Options with Combined Query in the REST Application Developer's Guide.
You can now configure the commit mode (auto or explicit) and transaction type (query, update, or auto) independently when configuring a new transaction. This change manifests in the following ways:
The xdmp:update XQuery prolog option accepts a new value, auto, which specifies that MarkLogic should determine the transaction/statement type (query or update) through static analysis. The pre-existing value false now means the transaction/statement type is query. Use this option plus xdmp:commit instead of the now-deprecated xdmp:transaction-mode prolog option.
New commit and update options have been added to the functions listed in the table below. Use these in preference to the transaction-mode option, which has been deprecated.
The following functions support the new commit and update options. For more details, see the function reference documentation for xdmp:eval (XQuery) or xdmp.eval (JavaScript).
XQuery | JavaScript |
---|---|
xdmp:eval | xdmp.eval |
xdmp:javascript-eval | xdmp.xqueryEval |
xdmp:invoke | xdmp.invoke |
xdmp:invoke-function | xdmp.invokeFunction |
xdmp:spawn | xdmp.spawn |
xdmp:spawn-function |
For more details on the new capabilities, see Understanding Transactions in MarkLogic Server in the Application Developer's Guide.
For details on transitioning from the old transaction controls to the new ones, see the following topics:
The following methods have been added to the Session class for configuring transactions and querying transaction configuration:
You should use these methods rather than Session.setTransactionMode, which has been deprecated. For details, see XCC Session.setTransactionMode is Deprecated.
Session.setAutoCommit controls whether requests submitted during the session run in a transaction with auto-commit semantics (the default) or explicit commit semantics. Executing a request with commit set to explicit starts a multi-statement transaction.
Session.setUpdate controls whether requests submitted during the session run in a query transaction or an update transaction, or whether MarkLogic should detect the transaction type automatically through analysis of the submitted code. Auto detection is the default behavior.
Note that if you override the Session transaction configuration in an ad hoc query, the behavior differs depending on whether you configure the session using setTransactionMode or setAutoCommit and setUpdate. With setAutoCommit and setUpdate, the transaction configuration reverts to the Session settings once the transaction involving the override completes. With setTransactionMode, the override persists and affects future transactions unless you explicitly change it.
MarkLogic version 9.0-3 contains the following new features:
You can use the following new method of the REST Management API to advance LSQT on a temporal collection:
POST /manage/v2/databases/{id|name}/temporal/collections?collection=collname
For more details, see POST /manage/v2/databases/{id|name}/temporal/collections?collection={name} in the MarkLogic REST API Reference.
You can now use the following new method of the REST Client API to advance LSQT on a temporal collection:
POST /v1/temporal/collections/{name}
For more details, see POST /v1/temporal/collections/{name} in the MarkLogic REST API Reference.
Ops Director presents a consolidated view of your MarkLogic infrastructure via dashboards that streamline monitoring and troubleshooting of clusters with alerting, performance, and log data. For details, see the Ops Director Guide.
MarkLogic is now available for deployment on Amazon Web Services (AWS) by means of AWS 1-Click. For details, see http://developer.marklogic.com/products/cloud/aws.
The Entity Services API offers the following additional capabilities as of MarkLogic 9.0-3:
You can now work with either XML or JSON envelope documents. Previously, only XML was supported.
As part of this change, some of the functions generated in instance converter modules have been extended to accept a format parameter. If you omit the format, these functions generate XML instances and envelopes, as they did prior to MarkLogic 9.0-3.
The following generated function interfaces now accept an optional format parameter, for some entity type T.
If you want to use JSON envelopes, you should regenerate your instance converter and version translator code.
For more details, see Entity Instance Concepts in the Entity Services Developer's Guide.
An entity type can now include specifications for entity properties to be backed by an element range index. Previously, you could only specify a path range index.
In an XML model descriptor, use the path-range-index and element-range-index elements to specify entity properties to be backed by an index. The path-range-index element is identical to the pre-existing range-index element.
In a JSON model descriptor, use the pathRangeIndex and elementRangeIndex properties to specify entity properties to be backed by an index. The pathRangeIndex property is identical to the pre-existing rangeIndex property.
For more details, see Identifying Entity Properties for Indexing in the Entity Services Developer's Guide.
An entity type definition can now include a namespace URI and namespace prefix for XML instances of that type. The namespace binding is used when generating entity instances in XML, and when generating index configuration and query option artifacts.
For more details, see Defining a Namespace URI for an Entity Type in the Entity Services Developer's Guide.
The instance converter and version translator code you can generate using Entity Services has been refactored to improve readability and ease of customization, and to accommodate both XML and JSON envelopes. Previously generated code will continue to work as-is.
For more details, see Generating Code and Other Artifacts in the Entity Services Developer's Guide.
The Search API and REST, Java, and Node.js Client APIs now support additional search result ordering choices through the sort-order query option.
Previously, you could only specify score ordering. You can now sort based on score, fitness, quality, and more. For details, see sort-order in the Search Developer's Guide.
A new built-in redaction function, redact-number, is available as of MarkLogic 9.0-3. This function provides fine-grained control over the type, range, and format of masking values for numeric data. You can use redact-number with mlcp's redaction capability, the rdt:redact XQuery function, or the rdt.redact Server-Side JavaScript function.
For details, see redact-number in the Application Developer's Guide.
Performance has been improved for Client API server-side transformations and extensions implemented in Server-Side JavaScript. This affects applications using transformations or extensions with the REST, Java, or Node.js Client APIs.
Your applications will only benefit from the improved performance if you reinstall your Server-Side JavaScript transforms and extensions.
As of Java Client API version 4.0.3, you can use a serialized cts:query as an additional constraint on a values or tuples query.
MarkLogic 9.0-4 introduces the following new features:
As of MarkLogic 9.0-4, the mask-deterministic built-in redaction function supports two new options for specifying a salt value for masking value generation. Salting can provide a higher level of security. For more details, see the discussion of the salt and extend-salt options of mask-deterministic in the Application Developer's Guide.
Note that the introduction of these options changes the default behavior of deterministic masking value generation. For details, see Redaction: Deterministic Masking Values Differ.
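To show why a salt changes deterministic masking output, here is a minimal JavaScript sketch. It is not the mask-deterministic implementation. It substitutes a simple non-cryptographic FNV-1a hash for whatever MarkLogic uses internally. The `fnv1a` and `maskDeterministic` functions and the sample values are hypothetical.

```javascript
// FNV-1a 32-bit hash: a simple, non-cryptographic stand-in for illustration.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Deterministic mask: equal (salt, value) pairs always give the same output.
function maskDeterministic(value, salt = '') {
  return fnv1a(salt + value).toString(16).padStart(8, '0');
}

const a1 = maskDeterministic('123-45-6789', 'salt-1');
const a2 = maskDeterministic('123-45-6789', 'salt-1');
const b = maskDeterministic('123-45-6789', 'salt-2');
```

The first two masks are identical because the salt and the value are the same, while the third differs because the salt differs. This is also why introducing a salt changes the masking values an existing rule produces.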
As of MarkLogic 9.0-4, you can use the redact-datetime built-in redaction function in your redaction rules. You can mask a dateTime value with a random dateTime value or using a picture string.
For more details, see redact-datetime in the Application Developer's Guide.
As of MarkLogic release 9.0-4, MarkLogic converters/filters are offered as a package (the MarkLogic Converters package) separate from the MarkLogic Server package.
This change provides better flexibility and enables you to install/uninstall MarkLogic converters/filters separately from MarkLogic Server.
For more details, see MarkLogic Converters Installation Changes Starting at Release 9.0-4 in the Installation Guide.
The following new features are available as of Node.js Client API v2.1.1. Unless otherwise noted, MarkLogic 9 is required to take advantage of these changes.
The partial update or patch feature of the REST and Java Client APIs now supports replacement content constructor functions implemented in Server-Side JavaScript. Versions of MarkLogic prior to 9.0-4 only support XQuery implementations.
For more details, see Writing an XQuery User-Defined Replacement Constructor in the REST Application Developer's Guide.
As of MarkLogic release 9.0-4, you can restore a database from a backup even if the number of forests in the database differs from the number of forests in the backup. As a result, the Admin Interface and related API database restore functions have been changed.
For details, see Restoring a Reconfigured Database in the Administrator's Guide.
MarkLogic now supports the 1-click launch option in AWS Marketplace. Because of this, the published MarkLogic AMIs will have a data volume predefined.
The following features impose restrictions on XPath expressions used in their configuration. The existence of the restrictions is unchanged. However, as of MarkLogic 9.0-4, the restrictions have been more formally defined for each feature and somewhat relaxed for some features.
The following list summarizes how the restrictions have changed for affected features. In all cases, the changes enable a larger subset of XPath and do not introduce backward incompatibilities.
For more details, see Restricted XPath in the XQuery and XSLT Reference Guide.
As part of MarkLogic 9.0-4, Element Level Security now includes the protected path set feature. A protected path set is a way to allow multiple protected paths to cover and secure the same element, with both AND and OR relationships between the permissions. The information (the name of the protected path set) is simply a tag on the protected path definition. This enables multiple arbitrary security markings for an element.
For more details, see Protected Path Sets in the Security Guide.
MarkLogic 9.0-5 introduces the following new features:
MarkLogic 9.0-5 introduces full support for MarkLogic Data Hub Framework (DHF) version 3.0. DHF is an Open Source data integration framework and set of tools that enable you to quickly integrate data from multiple sources into a single MarkLogic database, and then expose that integrated data through MarkLogic.
DHF includes both client-side libraries and tools for integration and modeling, and server-side framework support. To learn more about DHF, see the following:
To use DHF, you will need DHF version 3.0 or later, MarkLogic 9.0-5 or later, and Oracle Java 8 JRE (client-side). For details, see the DHF documentation.
MarkLogic 9.0-5 introduces new built-in and library functions for entity enrichment and extraction. The new interfaces use entity dictionaries to identify entities. You can create an entity dictionary in several ways, including from a graph created from a SKOS ontology. For more details, see Entity Extraction and Enrichment in the Search Developer's Guide.
Use the following new functions for entity enrichment and extraction:
XQuery | Server-Side JavaScript |
---|---|
entity:enrich | entity.enrich |
entity:extract | entity.extract |
cts:entity-highlight | cts.entityHighlight |
cts:entity-walk | cts.entityWalk |
Use the following new functions for creating and managing entity dictionaries:
You can now apply a URI filter to the documents listed in the Explorer view. You can specify a single URI (/my/interesting/document.xml) or a wildcard expression (/my/interesting/*.xml). For more details, see Filtering the Explorer View by URI in the Query Console User Guide.
MarkLogic 9.0-5 introduces new library functions for configuration management.
The configuration management functions can be used to:
Use the following new functions for configuration management:
XQuery | Server-Side JavaScript |
---|---|
cma:generate-config | cma.generateConfig |
cma:apply-config | cma.applyConfig |
For more details, see MarkLogic XQuery and XSLT Function Reference and MarkLogic Server-Side JavaScript Function Reference.
MarkLogic 9.0-5 introduces a new REST API for configuration management: Configuration Management API (CMA).
The Configuration Management API is a RESTful API that allows retrieving, generating, and applying configurations for MarkLogic clusters, databases, and application servers.
Use the following new endpoints for configuration management:
For more details, see MarkLogic REST API Reference.
MarkLogic 9.0-5 includes the new version of the Ops Director application: Ops Director 1.1-1.
Ops Director 1.1-1 has the following new features and enhancements:
For more details, see the Ops Director Guide.
The Monitoring History application in MarkLogic 9.0-5 was enhanced with new metrics for XDQP Server Requests performance. For details, see XDQP Server Requests Performance Data in the Monitoring MarkLogic Guide.
As of MarkLogic 9.0-5, geospatial region queries support a tolerance option.
Tolerance is the largest allowable variation in geometry comparisons. If the distance between two points is less than tolerance, then the two points are considered equal. For more details, see Understanding Tolerance in the Search Developer's Guide.
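The equality rule above can be sketched in plain JavaScript. This is an illustration only, not MarkLogic's implementation: the pointsEqual helper is hypothetical, and the use of planar coordinates with Euclidean distance is an assumption made to keep the example short.

```javascript
// Hypothetical helper, not a MarkLogic API: treats two points as equal
// when the distance between them is less than the tolerance.
// Assumes planar coordinates and Euclidean distance for simplicity.
function pointsEqual(a, b, tolerance) {
  const distance = Math.hypot(a.x - b.x, a.y - b.y);
  return distance < tolerance;
}

const p = { x: 10.0, y: 20.0 };
const q = { x: 10.0003, y: 20.0004 }; // about 0.0005 away from p

console.log(pointsEqual(p, q, 0.001));  // within tolerance
console.log(pointsEqual(p, q, 0.0001)); // outside tolerance
```

A larger tolerance makes more nearby points compare as equal, so the value should be chosen with the precision of the underlying data in mind.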
This change affects the functions cts:geospatial-region-query (XQuery), cts.geospatialRegionQuery (JavaScript), the geo-option portion of a Search API geo-region-path constraint option, the geo-option portion of a geo-region-path-query or geo-region-constraint-query, and the options you can specify in a region query using the Java Client API and Node.js Client API.
Due to the improved tolerance support, when upgrading to MarkLogic 9.0-5 or later from an earlier version of MarkLogic 9, you might not get accurate results for geospatial region queries until you reindex.
MarkLogic 9.0-5 and later versions enable you to more easily enable/disable or suspend/resume database replication.
Disabling database replication changes the database configuration, so the disabled state persists across restarts and failovers. Use enable/disable when you want to stop replication for a long period, such as when you move a replica site.
Suspending database replication for a forest does not change the configuration, so replication resumes after a restart or failover. Use suspend/resume when lag is high and you need to stop database replication quickly.
This capability is exposed in the following ways:
xdmp.forestDatabaseReplicationSuspend and xdmp.forestDatabaseReplicationResume.
If one or more forests on a host are configured for local-disk or shared-disk failover, you now have the option to fail over those forests when you shut down the host. A new option, Immediately fail over forests to replica hosts, has been added to the Host Shutdown confirmation page to enable you to fail over the forests to replica hosts.
MarkLogic 9.0-6 introduces the following new features:
In MarkLogic 9.0-6, when switching from an external KMS to an internal KMS, encryption at rest does not limit the files that can be decrypted to only those encrypted with keys that are in the internal KMS. MarkLogic can decrypt any file encrypted with the internal MarkLogic KMS or an external KMS.
It is no longer necessary to decrypt your data before transitioning files encrypted with an external KMS to the internal MarkLogic KMS. In MarkLogic 9.0-5, you needed to decrypt all your data before switching from an external KMS back to the internal KMS.
In MarkLogic 9.0-6, you can now specify multiple hosts, multiple ports, and multiple KMIP credentials to connect to KMIP servers. MarkLogic can store more than one set of external KMIP server credentials, to be used in encrypting and decrypting data. If the first specified KMIP server is unavailable or stops responding, MarkLogic tries to connect to each of the other hosts in the user-specified list until it succeeds.
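The try-each-host behavior can be pictured with a small plain JavaScript sketch. Everything here is hypothetical: connectToFirstAvailable and fakeConnect are illustration-only names, the host names and ports are placeholders, and the connect callback stands in for a real KMIP client, which this example does not implement.

```javascript
// Hypothetical sketch: try each configured KMIP endpoint in order and
// return the first connection that succeeds. The connect callback stands
// in for a real KMIP client.
function connectToFirstAvailable(servers, connect) {
  const errors = [];
  for (const server of servers) {
    try {
      return { server, connection: connect(server) };
    } catch (err) {
      // Remember why this host failed, then move on to the next one.
      errors.push(`${server.host}:${server.port}: ${err.message}`);
    }
  }
  throw new Error(`No KMIP server reachable: ${errors.join("; ")}`);
}

// Placeholder hosts and ports, in the order the user listed them.
const servers = [
  { host: "kmip1.example.com", port: 5696 },
  { host: "kmip2.example.com", port: 5696 },
];

// Simulated connector: the first host is down, the second responds.
function fakeConnect(server) {
  if (server.host === "kmip1.example.com") {
    throw new Error("connection timed out");
  }
  return { connectedTo: server.host };
}

const result = connectToFirstAvailable(servers, fakeConnect);
console.log(result.server.host);
```

Because hosts are tried in list order, the order in which servers are configured determines which one is preferred.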
MarkLogic 9.0-6 includes refinements to tolerance support in geospatial region queries. Some tolerance-related edge cases now produce more accurate results. You benefit from this change only if you reindex.
If your region queries do not involve polygons that fall into these edge cases, you will not see any change in geospatial region query results even if you do reindex.
Starting with MarkLogic 9.0-6, forms submitted with blank fields result in field entries with blank string content (for example: ""). Previously, these empty form fields would not result in any field entry.
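Code that used the mere presence of a field as a signal that the user supplied a value may need an extra check after this change. The following plain JavaScript sketch illustrates the idea with the standard URLSearchParams class. It is not MarkLogic code; parseForm and hasValue are hypothetical helpers, and the form body is made up.

```javascript
// Parses a URL-encoded form body into an object. Blank fields are kept
// as empty strings, mirroring the behavior described above.
function parseForm(body) {
  const fields = {};
  for (const [name, value] of new URLSearchParams(body)) {
    fields[name] = value;
  }
  return fields;
}

// Hypothetical guard: true only when the field exists and is non-empty.
function hasValue(fields, name) {
  return typeof fields[name] === "string" && fields[name].length > 0;
}

const fields = parseForm("first=Ada&middle=&last=Lovelace");

console.log("middle" in fields);         // the blank field is present
console.log(hasValue(fields, "middle")); // but it carries no value
```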
MarkLogic 9.0-7 introduces the following new features:
In MarkLogic 9.0-7, you can now enable a user to create, reindex, update properties, run operations on, or delete a database if they have the manage role and one of the following granular privileges:
http://marklogic.com/xdmp/privileges/admin/database/database-ID
http://marklogic.com/xdmp/privileges/admin/database/activity/database-ID
In MarkLogic 9.0-7, you can now enable a user to create, update properties, run operations on, or delete a forest if they have the manage role and one of the following granular privileges:
The database Explorer in Query Console now enables you to perform the following operations without writing any code:
For more details, see Editing Database Content in the Query Console User Guide.
The Request Monitoring feature enables you to configure logging of information related to requests, including metrics collected during request execution. This feature lets you enable logging of internal preset metrics for requests on specific endpoints. You can also log custom request data by calling the provided Request Logging APIs. This logged information may help you evaluate server performance.
For more details, see Endpoints and Request Monitoring in the Query Performance and Tuning Guide.
In MarkLogic 9.0-7, you can now access both Query Service and Data Hub Service as services on AWS. Query Service enables you to respond to varying query workloads for your MarkLogic cluster. This elastic query service eliminates the need to set up and manage the underlying infrastructure required to scale for capacity. MarkLogic Query Service works by attaching MarkLogic e-nodes to your existing MarkLogic cluster and auto-scaling as your workload requires. For more information, see https://cloudservices.marklogic.com/ and https://www.marklogic.com/product/marklogic-database-overview/query-service/.
MarkLogic Data Hub Service is a cloud service providing the features of the Data Hub Framework in a managed instance available on AWS. The Data Hub Service enables you to create an Operational Data Hub in the cloud. The service is designed to be transparent, scalable, and easy to deploy. For more details, see https://www.marklogic.com/product/marklogic-database-overview/data-hub-service/. To learn more about Data Hub Framework, see the Data Hub Framework website.
As of Java Client API 4.1.1, you can specify a connection type when creating a DatabaseClient. The connection type defaults to DIRECT, meaning that direct connections to hosts in your MarkLogic cluster are possible. Specify the GATEWAY connection type when connecting to MarkLogic through a load balancer. A GATEWAY connection is required when using the Data Movement SDK with a load balancer.
For more details, see Connecting Through a Load Balancer in the Java Application Developer's Guide.
MarkLogic 9.0-8 introduces the following new features:
The following features have been added to Request Monitoring:
The Request Monitoring feature now enables you to set up and enable request monitoring for an endpoint, a main module, or globally on an XDBC Server.
The Request Cancelling feature enables you to set up and enable request cancelling for an endpoint, a main module, or globally on an App Server or XDBC Server.
The following functions have been added to the Request Monitoring API:
For more details, see Endpoints and Request Monitoring in the Query Performance and Tuning Guide.
The following meters are collected on the server as of MarkLogic Server 9.0-8 and are available in *.api files, request logs, and in the output of xdmp:query-meters():
Customers who use MarkLogic's Encryption at Rest feature can now synchronize the enveloped keys on MarkLogic with their KMS. For more information, see Synchronizing the KMS Keys in the Security Guide or documentation for the function xdmp:keystore-synchronize.
The following XQuery functions have been added to MarkLogic Server:
The following JavaScript functions have been added to MarkLogic Server:
See the online documentation provided in the links for details.
MarkLogic 9.0-8 has the parameter ldap-remove-domain, enabling you to choose whether or not to remove the domain name. The default value is false.
MarkLogic 9.0-8 now has a Logout button in the upper right corner of the Admin Interface.
MarkLogic 9.0-9 introduces the following new features:
In MarkLogic 9.0-9, the ldap-remove-domain parameter default value has been changed to true to keep behavior consistent with MarkLogic 9.0-6 and earlier.
MarkLogic 9.0-9 supports endpoint monitoring at the Task Server level. For more information, see the Query Performance and Tuning Guide.
Starting in MarkLogic 9.0-9, ORDER-BY ordering on NULL is set to NULLS LAST. This changes default behavior for nulls in query results and therefore may cause backwards incompatibility for certain queries (unless the Optic Nulls Smallest On trace event is turned on).
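As a rough picture of what NULLS LAST means for an ascending sort, the following plain JavaScript sketch sorts a column of values so that nulls come after all non-null values. The ascendingNullsLast comparator and the sample values are hypothetical. The sketch illustrates the ordering rule only and does not call the Optic API or SQL.

```javascript
// Hypothetical comparator illustrating ascending order with NULLS LAST:
// non-null values sort normally and nulls are placed after them.
function ascendingNullsLast(a, b) {
  if (a === null && b === null) return 0;
  if (a === null) return 1;  // a is null: sort it after b
  if (b === null) return -1; // b is null: sort it after a
  return a < b ? -1 : a > b ? 1 : 0;
}

const prices = [30, null, 10, 20, null];
const sorted = [...prices].sort(ascendingNullsLast);

console.log(sorted); // [ 10, 20, 30, null, null ]
```

Queries that relied on nulls appearing first in their results are the ones most likely to be affected by the change.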
An ldapSearch call in JavaScript now returns an array if there are multiple values associated with a key. Previously, it returned only a single value for every key, even if there were multiple values associated with that key.
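Code that reads these results may now see either a single value or an array for a given key. One way to handle both shapes is to normalize before use, as in the plain JavaScript sketch below. The asArray helper is hypothetical, and the entry object is a made-up stand-in for a search result rather than actual ldapSearch output.

```javascript
// Hypothetical helper: normalize an attribute that may be a single value
// or an array of values into an array, so callers handle both shapes.
function asArray(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Made-up entry for illustration: one single-valued key and one
// multi-valued key.
const entry = {
  cn: "Jane Doe",
  memberOf: ["cn=admins,dc=example,dc=com", "cn=staff,dc=example,dc=com"],
};

console.log(asArray(entry.cn).length);       // 1
console.log(asArray(entry.memberOf).length); // 2
```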
MarkLogic Amazon Web Services (AWS) AMIs are now supported on Amazon Linux 2. This is the recommended base image on which to deploy customized MarkLogic images.
With MarkLogic 9.0-9, MarkLogic supports the Thales nCipher nShield Connect for Encryption at Rest on both Windows and Linux platforms. The Thales HSM (Hardware Security Module) implements PKCS #11 (the OASIS standard) and, when configured, replaces our internal keystore (SoftHSM) for all encryption operations.
As of MarkLogic 9.0-9.1, ECDH is a supported cipher for SSL/TLS communication. SSL/TLS works if an ECDH cipher is specified.
MarkLogic 9.0-10 introduces the following new features:
OpenSSL has been upgraded to version 1.0.2s. For more information, please see the list of changes here.
Updates to the BiText Dictionaries:
In most cases, new entries/words come from neologisms or newly coined words that have a significant number of occurrences in the corpus (always including any necessary inflection). In some cases, new entries/words come from previously missing words (normally because they occurred too rarely in the corpus in previous iterations).
For English we have made a wider change. In the previous version, comparative and superlative forms of adjectives were their own lemmas (so the lemma for nicest was nicest, and the lemma for nicer was nicer). Now those forms are linked to the positive form, with the positive form as lemma (so in the new version, the lemma for both nicest and nicer is nice). Because we were not sure how this change might affect your operations, we have prepared two versions of the updated English data:
oldCS.txt, containing the comparatives and superlatives of adjectives with the comparative or superlative as lemma.
newCS.txt, containing the comparatives and superlatives of adjectives with the positive form as lemma.
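The practical difference between the two variants can be pictured with toy lookup tables in plain JavaScript. This is only an illustration: the tables contain just the example words from the text, not real dictionary contents, and lemmaOf and sameLemma are hypothetical helpers.

```javascript
// Toy lemma tables for the two English variants described above.
const oldStyle = new Map([["nicer", "nicer"], ["nicest", "nicest"]]);
const newStyle = new Map([["nicer", "nice"], ["nicest", "nice"]]);

// Returns the lemma for a word, or the word itself if it is not listed.
function lemmaOf(table, word) {
  return table.has(word) ? table.get(word) : word;
}

// Two words are grouped together when they share a lemma.
function sameLemma(table, a, b) {
  return lemmaOf(table, a) === lemmaOf(table, b);
}

console.log(sameLemma(oldStyle, "nice", "nicest")); // separate lemmas
console.log(sameLemma(newStyle, "nice", "nicest")); // shared lemma
```

With the old-style data, nice and nicest keep separate lemmas. With the new-style data, they share the lemma nice.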
Below are the details of the changes per language (amount of newly added words):
English (ENG): Around 4,500 new entries added (89,248 → 93,827).
Arabic (ARB): Around 5,500 new entries added (24,060,893 → 24,066,479).
Dutch (NLD): Around 25,000 new entries added (376,185 → 401,741).
French (FRA): Around 700 new entries added to the non-clitics dictionary (375,713 → 376,449). Around 250 new entries added to the clitics dictionary (985,407 → 985,676).
German (DEU): Around 26,000 new entries added (1,132,410 → 1,158,531).
Italian (ITA): Around 900 new entries added (3,320,525 → 3,321,417).
Korean (KOR): Around 20,000 new entries added (5,013,103 → 5,032,966).
Norwegian (NOR): Around 400 new entries added to the Bokmål dictionary (NOB) (378,738 → 379,112). Around 28 entries added to the Nynorsk dictionary (NNO) (328,872 → 329,004).
Persian (FAS): Around 1,200 new entries added (403,038 → 404,283).
Portuguese (POR): Around 2,000 new entries added (10,861,645 → 10,863,756).
Russian (RUS): Around 1,200 new entries added to the standard dictionary (1,447,917 → 1,449,175). 16 entries added to the orthographic yo/ye dictionary (61,236 → 61,252).
Spanish (ESP): Around 7,500 new entries added (3,317,323 → 3,324,939).
Swedish (SWE): Around 1,000 new entries added (424,824 → 425,862).
MarkLogic 9.0-11 introduces the following new features:
The logging for each request can be controlled by adding a threshold. The request information is logged only when the threshold conditions are met.
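The idea of threshold-gated logging can be sketched in plain JavaScript. This sketch is hypothetical throughout: the shouldLog helper and the metric names are invented for the example and are not MarkLogic settings, and it assumes a request qualifies when any single threshold is reached, which the text above does not specify.

```javascript
// Hypothetical sketch: a request is logged only if at least one of its
// metrics reaches the configured threshold for that metric.
function shouldLog(metrics, thresholds) {
  return Object.entries(thresholds).some(
    ([name, limit]) => metrics[name] !== undefined && metrics[name] >= limit
  );
}

// Invented threshold names and values, for illustration only.
const thresholds = { elapsedMs: 500, listCacheMisses: 100 };

const fastRequest = { elapsedMs: 42, listCacheMisses: 3 };
const slowRequest = { elapsedMs: 1250, listCacheMisses: 3 };

console.log(shouldLog(fastRequest, thresholds)); // below every threshold
console.log(shouldLog(slowRequest, thresholds)); // elapsed time too high
```

Gating on thresholds keeps the request log focused on the requests that are worth investigating.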
Request monitoring is now enabled by default on some MarkLogic application servers:
For more information, see the MarkLogic Query Performance and Tuning Guide.
Microsoft Azure Key Vault can encrypt your data in MarkLogic. Azure Key Vault is supported for customers running their cluster on Microsoft Azure. For more information, see the MarkLogic Security Guide.
MarkLogic 9.0-12 introduces the following new features:
MarkLogic 9.0-12 now has an Upgrade tab in the Admin Interface. During an upgrade, click the Upgrade tab to view the upgrade status of each host in the cluster. For more details, see Rolling Upgrade Status in Admin UI in the Administrator's Guide.
Swap space is automatically configured when running MarkLogic Server on Amazon Web Services (AWS). Swap space is configured during the system startup process with the MARKLOGIC_AWS_SWAP_SIZE configuration variable. For more details, see AWS Configuration Variables and Deployment and Startup in the MarkLogic Server on Amazon Web Services (AWS) Guide.
The Managed Cluster feature supports SSL-enabled clusters. For details, see The Managed Cluster Feature in the MarkLogic Server on Amazon Web Services (AWS) Guide.
MarkLogic 9.0-13 introduces the following new features:
Updated the list of packages required for each supported Linux platform. For more details, see Supported Platforms and Appendix: Packages by Linux Platform in the Installation Guide for All Platforms.
Updated the minimum required IAM permissions to create and delete a stack. For more details, see Creating an IAM Role in the MarkLogic Server on Amazon Web Services (AWS) Guide.