MarkLogic allows you to define a template view that specifies which parts of the document make up a row in a view, and then query that view from a server-side program with xdmp:sql, mlsql
, or ODBC. You can also query that view server-side from the MarkLogic Optic API, which is a fluent JavaScript and XQuery interface with the ability to perform joins and aggregates on views over documents. Template views are a simple, powerful way to specify a relational lens over documents, making parts of your document data accessible via SQL. The Optic API gives developers idiomatic JavaScript and XQuery access to relational operations over rows, combined with rich document search. The Optic API is described in the Optic API for Multi-Model Data Access chapter in the Application Developer's Guide.
This chapter describes how to configure MarkLogic Server and create template views to model your MarkLogic data for access by SQL. Template views can also be created using the TDE API described in MarkLogic XQuery and XSLT Function Reference.
The focus of this chapter is on the template elements that are specific to creating views. The Template Driven Extraction (TDE) chapter in the Application Developer's Guide describes the template elements that are common to all types of data-extraction templates.
This chapter contains the following topics:
A template view contains the following elements and their child elements:
Element | Description |
---|---|
|
Optional description of the template. |
collections collection collections-and collection |
Optional collection scopes. Multiple collection scopes can be ORed or ANDed. For details, see Collections in the Application Developer's Guide. |
directories directory |
Optional directory scopes. Multiple directory scopes are ORed together. For details, see Directories in the Application Developer's Guide. |
vars var |
Optional intermediate variables extracted at the current context level. For details, see Variables in the Application Developer's Guide. |
rows row schema-name view-name view-layout sparse identical columns column name scalar-type val nullable default invalid-values ignore reject reindexing hidden visible collation |
scalar-type is the type for the |
templates template |
Optional sequence of sub-templates. Creating Views from Multiple Templates and Creating Views from Nested Templates. You cannot use tde:template-insert to insert multiple templates created by means of the |
path-namespaces path-namespace |
Optional sequence of namespace bindings. For details, see path-namespaces in the Application Developer's Guide.. |
context |
The lookup node that is used for template activation and data extraction. For details, see Context in the Application Developer's Guide. |
enabled |
A boolean that specifies whether the template is enabled (true ) or disabled (false ). Default value is true . |
The context
, vars
, and columns
identify XQuery elements or JSON properties by means of path expressions. Path expressions are based on XPath, which is described in XPath Quick Reference in the XQuery and XSLT Reference Guide and Traversing JSON Documents Using XPath in the Application Developer's Guide.
schema-name
to which the view belongs.view-name
specifies the target view. Extracted rows under a row
section are added to its target view. Multiple templates can reference the same target view.view-layout
element:identical
(default), the view declaration must be consistent between templates; same column names, column data types, and column nullability. sparse
is useful when you use more than one template to define a view. For example, you may want a view that has multiple contexts and an optional column that matches some, but not all, of the documents in the database.target view declaration can have other nullable columns not listed under the current row
. For example, if a view is referenced in template T1 using columns (A,B,C) and in template T2 using columns (A,B,D), the resulting view will have all 4 columns (A,B,C,D). Column A and B are present in both T1 and T2 and can be declared as non nullable. However, columns C and D must be nullable. For an example, see Creating Views from Multiple Templates.name
, data type (scalar-type
) and data mapping (val
). See Columns. The scalar-type is the type for the val
. See Type Casting in the Application Developer's Guide.<row> <schema-name>main</schema-name> <view-name>expenses</view-name> <columns> <column> <name>EmployeeID</name> <scalar-type>int</scalar-type> <val>EmployeeID</val> </column> <column> <name>Date</name> <scalar-type>date</scalar-type> <val>Date</val> </column> <column> <name>Amount</name> <scalar-type>decimal</scalar-type> <val>Amount</val> </column> </columns> </row>
name
. A column is uniquely identifiable by its schema, view, and column names.val
element. The simplest form of projection is a child node under the current context, like EmployeeID in the example above <val>
EmployeeID</val>
. See Template Dialect and Data Transformation Functions in the Application Developer's Guide for the types of expressions allowed in a val
.scalar-type
. The result of the val
expression is automatically casted to the specified scalar type. Users do not have to explicitly create the result in the target datatype. See Type Casting in the Application Developer's Guide.<nullable>true</nullable>
to the corresponding column element. You can specify a default value for a column by adding <default>value</default>.
A null value will be replaced by the default value. reindexing
element that controls whether a column is visible or hidden while being reindexed:invalid-values
element that controls the behavior when cell values cannot be coerced to their datatype:The following table describes the results from the possible combinations of ignore
and reject
on nullable
and non-nullable (<nullable>false</nullable>
) columns. The Default Value column specifies whether or not a default
value is specified for the column. The Invalid Input column describes what happens when the cell value cannot be coerced to the specified datatype. The Missing Input column describes what happens when there is no value available to populate the column.
<column> <name>EmployeeID</name> <scalar-type>int</scalar-type> <val>EmployeeID</val> </column> <column> <name>EmployeeID</name> <scalar-type>int</scalar-type> <val>EmployeeID</val> <nullable>true</nullable> <invalid-values>ignore</invalid-values> </column> <column> <name>Brand</name> <scalar-type>string</scalar-type> <val>Brand</val> <default>generic</default> <invalid-values>reject</invalid-values> </column> <column> <name>SSN</name> <scalar-type>string</scalar-type> <val>id[@root='2.16.840.1.113883.4.1']/@extension</val> </column> <column> <name>Name</name> <scalar-type>string</scalar-type> <val>concat(patient/name/given[1],' ',patient/name/family)</val> </column>
The scope of the view is used to constrain the view to the documents in particular collections or directories. The scope is optional, so do not specify a scope if you elect not to set the scope of the view.
For details on defining a template view scope, see Collections and Directories in the Template Driven Extraction (TDE) chapter in the Application Developer's Guide
The template views described below are written to extract data from documents, like the XML medical document shown below
In XQuery, insert the following document:
let $med := <Citation Status="Completed"> <ID>69152893</ID> <PMID>5717905</PMID> <Article> <Journal> <ISSN>0043-5341</ISSN> <JournalIssue> <Volume>118</Volume> <Issue>49</Issue> <PubDate> <Year>1968</Year> <Month>12</Month> <Day>7</Day> </PubDate> </JournalIssue> </Journal> <ArticleTitle> The Influence of Calcium on Cholesterol in Human Serum </ArticleTitle> <AuthorList> <Author> <LastName>Doe</LastName> <ForeName>John</ForeName> </Author> <Author> <LastName>Smith</LastName> <ForeName>Jane</ForeName> </Author> </AuthorList> </Article> </Citation> return xdmp:document-insert("med1.xml", $med)
In JavaScript, insert the following document:
declareUpdate(); xdmp.documentInsert( "med2.json", { "Journal": { "Issue": 103, "Title": "Bone Density Studies", "Date": "8/1/2009", "Author": "John Simson" }})
This section shows two templates, one in XML and one in JSON, that define a view on the document in Example Documents. The view templates are:
The following XML view template creates a Publications view in the Medical schema.
xquery version "1.0-ml"; import module namespace tde = "http://marklogic.com/xdmp/tde" at "/MarkLogic/tde.xqy"; let $ClinicalView := <template xmlns="http://marklogic.com/xdmp/tde"> <description>populates patients' data</description> <context>/Citation/Article</context> <rows> <row> <schema-name>Medical</schema-name> <view-name>Publications</view-name> <columns> <column> <name>ID</name> <scalar-type>long</scalar-type> <val>../ID</val> </column> <column> <name>ISSN</name> <scalar-type>string</scalar-type> <val>Journal/ISSN</val> </column> <column> <name>Volume</name> <scalar-type>string</scalar-type> <val>Journal/JournalIssue/Volume</val> <nullable>true</nullable> </column> <column> <name>Date</name> <scalar-type>string</scalar-type> <val>Journal/JournalIssue/PubDate/Year||'-' ||Journal/JournalIssue/PubDate/Month||'-' ||Journal/JournalIssue/PubDate/Day</val> <nullable>true</nullable> </column> </columns> </row> </rows> </template> return tde:template-insert("Template.xml", $ClinicalView)
The following JSON view template creates a Publications view in the Medical schema.
declareUpdate(); var tde = require("/MarkLogic/tde.xqy"); var ClinicalView = xdmp.toJSON( { "template":{ "context":"/Citation/Article", "rows":[ { "schemaName":"Medical", "viewName":"Publications", "columns":[ { "name":"ID", "scalarType":"long", "val":"../ID" }, { "name":"ISSN", "scalarType":"string", "val":"Journal/ISSN" }, { "name":"Volume", "scalarType":"string", "val":"Journal/JournalIssue/Volume" }, { "name":"Date", "scalarType":"string", "val":"Journal/JournalIssue/PubDate/Year||'-' \ ||Journal/JournalIssue/PubDate/Month||'-' \ ||Journal/JournalIssue/PubDate/Day" } ] } ] } } ); tde.templateInsert("Template.json", ClinicalView);
You can create a single view from multiple templates. For example, if you want to create a view to support more than one context or scope. The templates below create a Publications view with columns that are scoped for the two different documents shown in Example Documents. The result is that a single query on the Publications view will populate the relevant columns from each document. For example, the 'Title' column will be populated with the value of the <ArticleTitle>
element in the med1.xml
file and the Title
property in the med2.json
file.
There are security implications when a view is created from multiple templates. If none of the templates grant a user access to the view, querying the view results in an unknown table
error, as expected. However, if a user is granted permission to a view by at least one template, the permissions set by the other templates are ignored and the user is permitted to see the SQL values defined by all templates of the view.
Insert the first version of the Publications view template as follows:
xquery version "1.0-ml"; import module namespace tde = "http://marklogic.com/xdmp/tde" at "/MarkLogic/tde.xqy"; let $ClinicalView := <template xmlns="http://marklogic.com/xdmp/tde"> <description>populates patients' data</description> <context>/Citation/Article</context> <rows> <row> <schema-name>Medical</schema-name> <view-name>Publications</view-name> <view-layout>sparse</view-layout> <columns> <column> <name>ISSN</name> <scalar-type>string</scalar-type> <val>Journal/ISSN</val> <nullable>true</nullable> </column> <column> <name>Title</name> <scalar-type>string</scalar-type> <val>ArticleTitle</val> </column> <column> <name>Volume</name> <scalar-type>string</scalar-type> <val>Journal/JournalIssue/Volume</val> </column> <column> <name>Date</name> <scalar-type>string</scalar-type> <val>Journal/JournalIssue/PubDate/Month||'/' ||Journal/JournalIssue/PubDate/Day||'/' ||Journal/JournalIssue/PubDate/Year</val> <nullable>true</nullable> </column> </columns> </row> </rows> </template> return tde:template-insert("Template.xml", $ClinicalView)
Insert the second version of the Pulications view template as follows:
xquery version "1.0-ml"; import module namespace tde = "http://marklogic.com/xdmp/tde" at "/MarkLogic/tde.xqy"; let $ClinicalView := <template xmlns="http://marklogic.com/xdmp/tde"> <description>populates patients' data</description> <context>/Journal</context> <rows> <row> <schema-name>Medical</schema-name> <view-name>Publications</view-name> <view-layout>sparse</view-layout> <columns> <column> <name>Volume</name> <scalar-type>string</scalar-type> <val>Issue</val> </column> <column> <name>Title</name> <scalar-type>string</scalar-type> <val>Title</val> </column> <column> <name>Date</name> <scalar-type>string</scalar-type> <val>Date</val> <nullable>true</nullable> </column> <column> <name>Author</name> <scalar-type>string</scalar-type> <val>Author</val> <nullable>true</nullable> </column> </columns> </row> </rows> </template> return tde:template-insert("Template2.xml", $ClinicalView)
To see the combined results, enter:
select * FROM Medical.Publications
ISSN | Title | Volume | Date | Author |
---|---|---|---|---|
null | Bone Density Studies | 103 | 8/1/2009 | John Simson |
0043-5341 | The Influence of Calcium on Cholesterol in Human Serum | 118 | 12/7/1968 | null |
You can nest template views. The example in this section, though not based on a credible use case, does show how to nest three template views for medical documents so that each child view is within the context of its parent view. The example also shows how variables can be defined in a parent template and then used in child templates. There is no limit to the nesting of template views.
The context for each nested view is as follows:
View | Context |
---|---|
Publication | /Citation |
JournalIssue | /Citation/Article/Journal/JournalIssue |
PubDate | /Citation/Article/Journal/JournalIssue/PubDate |
<template xmlns="http://marklogic.com/xdmp/tde"> <description>Views of the medical data set</description> <context>/Citation</context> <!-- Variables extracted at the current context level --> <vars> <var> <name>ID</name> <val>./ID</val> </var> <var> <name>Status</name> <val>@Status</val> </var> </vars> <rows> <row> <schema-name>medical</schema-name> <view-name>Publication</view-name> <view-layout>sparse</view-layout> <columns> <column> <name>ArticleTitle</name> <scalar-type>string</scalar-type> <val>Article/ArticleTitle</val> </column> <column> <name>ISSN</name> <scalar-type>string</scalar-type> <val>Article/Journal/ISSN</val> </column> </columns> </row> </rows> <templates> <template> <!-- Nested child template --> <!-- context path relative to the parent context: /Citation --> <context>Article/Journal/JournalIssue</context> <rows> <row> <schema-name>medical</schema-name> <view-name>JournalIssue</view-name> <view-layout>sparse</view-layout> <columns> <column> <name>MedID</name> <scalar-type>long</scalar-type> <val>$ID</val> <!-- referencing context var ID --> </column> <column> <name>Volume</name> <scalar-type>long</scalar-type> <val>Volume</val> </column> <column> <name>Issue</name> <scalar-type>long</scalar-type> <val>Issue</val> <nullable>true</nullable> </column> </columns> </row> </rows> <templates> <template> <!-- Nested child template --> <!-- context path relative to the parent context: /Article/Journal/JournalIssue --> <context>PubDate</context> <rows> <row> <schema-name>medical</schema-name> <view-name>PubDate</view-name> <view-layout>sparse</view-layout> <columns> <column> <name>Status</name> <scalar-type>string</scalar-type> <val>$Status</val> <!-- referencing context var Status --> </column> <column> <name>Year</name> <scalar-type>long</scalar-type> <val>Year</val> </column> <column> <name>Month</name> <scalar-type>string</scalar-type> <val>Month</val> </column> <column> <name>Day</name> <scalar-type>long</scalar-type> <val>Day</val> </column> </columns> </row> </rows> </template> </templates> </template> </templates> </template>
As described in Template Driven Extraction (TDE) in the Application Developer's Guide, inserting, enabling, or disabling a template will trigger a reindexing operation on the database. You can add a reindexing
element to a column
definition to control whether that column is visible during a reindex operation. By default, reindexing
is set to hidden
.
This section describes when views and columns can be queried under various circumstances during database reindex operation. TDE is designed to provide view availability during relational schema changes, such as adding or removing columns. In general, if a template change implies that the data in a view/column(s) needs to be refreshed (updated, removed, or added), the view/column will not be available for query during the related reindexing. The columns affected by the template change are taken offline, which means that they are not available for query. A query running on a column that is temporarily unavailable will return a SQL-NOCOLUMN
error. If all the columns in a view are affected by template changes, the entire view is not available for query. In this case, a SQL-TABLENOTFOUND
error is returned if the view is queried. The availability of columns may differ depending on whether you are using single or multiple templates to reference the same view.
Below are some scenarios that illustrate what can be queried during reindexing.
<val>
, <scalar-type>
, <collation>
, <nullability>
, or <invalid-values>
takes the modified column offline. Other columns can be queried.<triples>
section has no effect on any view referenced by the template.When a single template references a view, the following describes the availability of columns under a context
that matches a document during a reindex operation:
When a multiple templates reference a view, the following describes the availability of columns under a context
that matches a document during a reindex operation:
Modify one template with a context
that matches a document:
column
-- column is not available until indexing of the column has completed.column
-- takes the column offline.column
-- takes the column offline.context
-- takes the entire view offline.If you disable one template, the extracted columns will be offline until clean up has completed.