Loading TOC...
SQL Data Modeling Guide (PDF)

SQL Data Modeling Guide — Chapter 1

SQL on MarkLogic Server

The views module is used to create and manage SQL schemas and views.

The main topics in this chapter are:

Terms Used in this Guide

The following are the definitions for the terms used in this guide:

  • A view is a representation of a SQL view. A view is implemented as an XML document in the schemas database and consists of a unique id, a name (which must be unique in the context of a particular schema), a view scope, and a sequence of column specifications. A view may also include fields that allow you to do more precise queries with the MATCH operator, as demonstrated in Using MLSQL and MATCH Operator.
  • A schema is a representation of a SQL schema. A schema is implemented as an XML document in the schemas database and consists of a unique id, a name (which must also be unique), and a collection of views. During SQL execution, the schema provides the naming context for its views, which enables you to have multiple views of the same name in different schemas. The default schema is called 'main.' It is default is the sense that it is always implicitly available and first on the default schema search path for name resolution in SQL. Even though the 'main' schema is a default, you must create this schema.
  • A column in a view has a name and a range index reference that identifies a particular document element or attribute. The range index for each column must be created before creating the view.
  • A view scope is used to constrain the subset of the database to which the view applies. A view scope can either limit rows in the view to documents with a specific element (localname + namespace) or to documents in a particular collection.

    You must have the view-admin role to execute the functions in the View library.

How SQL is Implemented

The SQL supported by the core SQL engine is SQL92 as implemented in SQLITE with the addition of SET, SHOW, and DESCRIBE statements.

For details on SQLITE, see: http://sqlite.org/index.html.

Schemas and Views

Schemas and views are the main SQL data-modeling components used to represent content stored in a MarkLogic Server database to SQL clients. Schemas and views are created in memory from schema and view specifications, which are XML documents stored on MarkLogic Server in the Schemas database in a protected collection.

A schema is a naming context for a set of views and user access to each schema can be controlled with a different set of permissions. Each view in a schema must have a unique name. However, you can have multiple views of the same name in different schemas. For example, you can have three views, named ‘Songs,' each in a different schema with different protection settings.

A view is a virtual read-only table that represents data stored in a MarkLogic Server database. Each column in a view is based on a range index in the content database, as described in Columns and Range Indexes. User access to each view is controlled by a set of permissions.

Each view has a specific scope that defines the documents from which it reads the column data. The view scope constrains the view to a specific element in the documents (localname + namespace) or to documents in a particular collection. The figure below shows a schema called ‘main' that contains four views, each with a different view scope. The view 'My Songs' is constrained to documents that have a song element in the my namespace; the view 'Your Songs' is constrained to documents that have a song element in the your namespace; the view 'Songs' is constrained to documents that are in the http://view/songs collection, and the view 'Names' is constrained to documents that have a name element in the my namespace.

You can set the scope of a view to any element in the documents, whether it is the document root element or a descendant of the root element.

As described above, schemas and views are stored as documents in the schema database associated with the content database for which they are defined. The default schema database is named ‘Schemas.' If multiple content databases share a single schema database, each content database will have access to all of the views in the schema database.

For example, in the figure below, you have two content databases, Database A and Database B, that both make use of the Schemas database. In this example, you create a single schema, named ‘main,' that contains two views, View1 and View2, on Database A. You then create two views, View3 and View4, on Database 3 and place them into the ‘main' schema. In this situation, both Database A and Database B will each have access to all four views in the ‘main' schema.

The range indexes that back the columns defined in views 1, 2, 3, and 4 have to be defined in both content databases A and B for the views to work. You will get a runtime error if you attempt to use a view that contains a column based on a non-existent range index.

A more 'relational' configuration is to assign a separate schema database to each content database. In the figure below, Database A and Database B each have a separate schema database, SchemaA and SchemaB, respectively. In this example, you create a ‘main' schema for each content database, each of which contains the views to be used for its respective content database.

Representation of SQL Components in MarkLogic Server

This section provides an overview of mapping from MarkLogic Server to SQL The table below lists SQL components and how each is represented in MarkLogic Server:

SQLMarkLogicConfiguration
ColumnA value in range indexcolumn spec
RowA sequence of range index values over the same documentview spec columns
Table/viewA document element or collection of documentsview spec
DatabaseA logical databaseschema spec

There are two basic approaches for representing document data stored in MarkLogic Server:

  • Configure range indexes and views so that you can execute SQL queries on unstructured document data. An example of this approach is provided in Data Modeling Example.
  • Make your document data more structured and relational so that SQL queries behave the way they would in an relational database. An example of this approach is provided in SQL on MarkLogic Server Quick Start.

Columns and Range Indexes

Each column in a view is based on a range index in the content database. Range indexes are described in the Range Indexes and Lexicons chapter in the Administrator's Guide. This section provides examples of what type of range index you might use to store your column data.

Consider a document of the following form:

<book>
   <title subject="oceanography">Sea Creatures</title>
   <pubyear>2011</pubyear>
   <keyword>science</keyword>
   <author>
       <name>Jane Smith</name>
       <university>Wossamotta U</university>
   </author>
   <body>
       <name type="cephalopod">Squid</name>
           Fascinating squid facts...
       <name type="scombridae">Tuna</name>
           Fascinating tuna facts...
       <name type="echinoderm">Starfish</name>
           Fascinating starfish facts...
   </body>
</book>

You can create columns based on an element range indexes for the title, pubyear, keyword, author, and university elements without violating any of the 'relational behavior' rules listed in Guidelines for Relational Behavior. Creating a column based on an element range index for the name element would violate the relational rules. However, you could use a path range index to create a column for the /book/author/name element without violating the relational rules. You might also want to create a column based on an attribute range index for the subject attribute in the title element.

You may chose to model your data so that it is not truly relational. In this case, you could create columns based on a path range index for the book/body/name element and book/body/name/@type attribute.

Searchable Fields

In MarkLogic Server, fields allow you to narrow searches to specific elements. Like the range indexes, fields can be bound to a view. Fields allow you to do more precise queries with the MATCH operator, as demonstrated in SQL on MarkLogic Server Quick Start.

« Table of contents
Next chapter »
Powered by MarkLogic Server 7.0-4.1 and rundmc | Terms of Use | Privacy Policy