MarkLogic Server allows you to group multiple databases into a super-database so that a single query can run across all of them. Databases contained in a super-database are called sub-databases. Sub-databases can be distributed on different storage tiers and on different clusters (collectively called super-clusters). A sub-database can be either active (online) or archive (offline), as specified by the kind element.
Updates are made on the sub-databases and are then made visible for reads in the super-database. Below is an illustration of a super-database and its sub-databases configured on a single cluster.
Below is a super-database configured with sub-databases on different clusters. The cluster hosting the super-database must be coupled with the foreign clusters hosting the sub-databases. For details on how to couple clusters, see Coupling Clusters in the Administrator's Guide.
Each foreign cluster should have multiple bootstrap hosts so that, if one bootstrap host goes down, the super-database can use another bootstrap host to query the sub-databases on that cluster.
The following describes the characteristics of super-databases and sub-databases:
request-timestamp moves past the commit timestamp of the insert. Typically, this takes a few seconds.
You can call the POST:/manage/v2/databases resource address to create a super-database. To create a super-database, simply specify which databases are to be its sub-databases.
For example, to define the mySuperDatabase database as a super-database containing the subDB1, subDB2, and subDB3 sub-databases on the same cluster, do the following:
$ curl --anyauth --user user:password -X POST \
    -d '{"database-name": "mySuperDatabase",
         "subdatabase": [
           {"cluster-name": "localhost", "database-name": "subDB1"},
           {"cluster-name": "localhost", "database-name": "subDB2"},
           {"cluster-name": "localhost", "database-name": "subDB3"}]}' \
    -H 'Content-type: application/json' \
    http://MyHost:8002/manage/v2/databases
Before creating a super-cluster, you must couple the clusters as described in Coupling Clusters in the Administrator's Guide.
For example, to define the mySuperCluster database as a super-cluster containing the subDB1, subDB2, and subDB3 sub-databases on different clusters, do the following:
$ curl --anyauth --user user:password -X POST \
    -d '{"database-name": "mySuperCluster",
         "subdatabase": [
           {"cluster-name": "cluster1", "database-name": "subDB1"},
           {"cluster-name": "cluster2", "database-name": "subDB2"},
           {"cluster-name": "cluster3", "database-name": "subDB3"}]}' \
    -H 'Content-type: application/json' \
    http://MyHost:8002/manage/v2/databases
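When the list of sub-databases is long or generated by a script, it can help to build the request body programmatically rather than by hand. The following is a minimal sketch, not an official tool: the function name subdb_payload and the cluster:database argument convention are hypothetical, and it assumes the request body carries the sub-databases as a "subdatabase" array of objects, which should be checked against the Management API reference. Names containing quotes or backslashes are not escaped.

```shell
#!/bin/sh
# Build a JSON body for POST /manage/v2/databases from "cluster:database" pairs.
# Assumption: sub-databases are passed as a "subdatabase" array of objects.
subdb_payload() {
  super_name=$1
  shift
  entries=""
  for pair in "$@"; do
    cluster=${pair%%:*}     # text before the first colon
    database=${pair#*:}     # text after the first colon
    entry=$(printf '{"cluster-name":"%s","database-name":"%s"}' "$cluster" "$database")
    entries="${entries:+$entries,}$entry"   # comma-separate all but the first entry
  done
  printf '{"database-name":"%s","subdatabase":[%s]}\n' "$super_name" "$entries"
}

# Print the body for a super-database spanning three coupled clusters.
subdb_payload mySuperCluster cluster1:subDB1 cluster2:subDB2 cluster3:subDB3
```

The output can be piped to curl with -d @- in place of the inline -d string, keeping the other options from the examples above.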
You can call the GET:/manage/v2/databases/{id|name}/super-databases resource address to return a list of the super-databases associated with a sub-database. For example, to view the super-databases of the subdb1 database, do the following:
$ curl --anyauth --user user:password -X GET \
    -H 'Content-type: application/xml' \
    http://MyHost:8002/manage/v2/databases/subdb1/super-databases
You can call the GET:/manage/v2/databases/{id|name}/sub-databases resource address to return a list of the sub-databases associated with a super-database. For example, to view the sub-databases of the superdb1 database, do the following:
$ curl --anyauth --user user:password -X GET \
    -H 'Content-type: application/xml' \
    http://MyHost:8002/manage/v2/databases/superdb1/sub-databases
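The two GET requests differ only in the final path segment, so a small wrapper can cover both. This is a sketch under stated assumptions: the function name list_related and the DRY_RUN, ML_USER, and ML_PASSWORD variables are hypothetical, port 8002 is taken from the examples above, and the Accept header is used here to ask for an XML response. By default the function only prints the URL it would request.

```shell
#!/bin/sh
# List the databases related to a given database.
# relation must be "super-databases" or "sub-databases".
list_related() {
  host=$1
  database=$2
  relation=$3
  case $relation in
    super-databases|sub-databases) ;;
    *) echo "relation must be super-databases or sub-databases" >&2
       return 2 ;;
  esac
  url="http://$host:8002/manage/v2/databases/$database/$relation"
  if [ "${DRY_RUN:-1}" = 1 ]; then
    # Dry run (the default): show the resource address only.
    printf '%s\n' "$url"
  else
    # Live call: credentials come from the hypothetical ML_USER/ML_PASSWORD variables.
    curl --anyauth --user "$ML_USER:$ML_PASSWORD" -X GET \
      -H 'Accept: application/xml' "$url"
  fi
}

list_related MyHost subdb1 super-databases
list_related MyHost superdb1 sub-databases
```

Setting DRY_RUN=0 before a call sends the request instead of printing the URL.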
Because updates can happen at both the super-database and the sub-database level, duplicate URIs are more likely in super-databases. Some automatically generated URIs may produce duplicates at the super-database level. This applies not only to automatically generated URIs for graph documents, but also to bitemporal LSQT documents and to directory properties fragments created with automatic-directory-creation. Duplicate URIs generate a DUPURI exception.
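One way to look for such collisions ahead of time is to compare URI listings from each sub-database. The sketch below assumes you have already exported one plain-text file per sub-database with one document URI per line. How those files are produced is outside this sketch, and the function name find_duplicate_uris is hypothetical.

```shell
#!/bin/sh
# Report URIs that occur in more than one sub-database URI listing.
# Each argument is a text file with one document URI per line.
find_duplicate_uris() {
  for listing in "$@"; do
    sort -u "$listing"      # drop repeats within a single listing
  done | sort | uniq -d     # keep URIs present in two or more listings
}

# Example usage with hypothetical export files:
# find_duplicate_uris subDB1-uris.txt subDB2-uris.txt subDB3-uris.txt
```

Any URI the function prints exists in at least two of the listings and would be a candidate for a DUPURI exception when queried through the super-database.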