Search Developer's Guide — Chapter 14

Geospatial Search Applications

This chapter describes how to use the geospatial functions and the types of applications that might use them. It includes the following sections:

Overview of Geospatial Data in MarkLogic Server

In its most basic form, geospatial data is a set of latitude and longitude coordinates. Geospatial data in MarkLogic Server is marked up in XML elements and/or attributes and JSON properties. MarkLogic Server supports several common representations of geospatial data in XML and JSON. This section provides an overview of how geospatial data and queries work in MarkLogic Server, and includes the following topics:

Terminology

The following terms are used to describe the geospatial features in MarkLogic Server:

  • coordinate system

    A geospatial coordinate system is a set of mappings that map places on Earth to a set of numbers. The vertical axis is represented by a latitude coordinate, and the horizontal axis is represented by a longitude coordinate, and together they make up a coordinate system that is used to map places on the Earth. For more details, see Latitude and Longitude Coordinates in MarkLogic Server.

  • point

    A geospatial point is the spot in the geospatial coordinate system representing the intersection of a given latitude and longitude. For more details, see Points in MarkLogic Server.

  • proximity

    The proximity of search results is how close the results are to each other in a document. Proximity can apply to any type of search terms, including geospatial search terms. For example, you might want to find a search term dog that occurs within 10 words of a point in a given zip code.

  • distance

    The distance between two geospatial objects refers to the geographical closeness of those geospatial objects.

Coordinate System

MarkLogic Server supports two types of coordinate systems for geospatial data:

  • WGS84
  • Raw

By default, MarkLogic Server uses the World Geodetic System (WGS84) as the basis for geocoding. WGS84 sets out a coordinate system that assumes a single map projection of the earth. WGS84 is widely used for mapping locations on the earth, and is used by a wide range of services, including many satellite services (notably: Global Positioning System--GPS) and Google Maps. There are other geocoding systems, some of which have advantages or disadvantages over WGS84 (for example, some are more accurate in a given region, some are less popular); MarkLogic Server uses WGS84, which is a widely accepted standard for global point representation. For details on WGS84, see http://en.wikipedia.org/wiki/World_Geodetic_System.

You can use the raw coordinate system when you want your points mapped onto a flat plane instead of onto the geometry of the earth.

Types of Geospatial Queries

You can use the search capabilities of MarkLogic to find documents containing points that match points or regions specified as input search criteria. The following types of geospatial queries are supported in MarkLogic Server:

  • point query--matches a single point
  • box query--any point within a rectangular box
  • radius query--any point within a specified distance around a point
  • polygon query--any point within a specified n-sided polygon

Geospatial query constructors are composable just like any other query constructors. For details, see Composing cts:query Expressions.

In addition to geospatial query constructors, MarkLogic provides built-in functions to perform geospatial operations such as calculating distance. For details, see Summary of Other Geospatial Operations.

Using the geospatial query constructors requires a valid geospatial license key; without a valid license key, searches that include geospatial queries will throw an exception.

XQuery Primitive Types And Constructors for Geospatial Queries

Geospatial queries and other operations are based on the following primitive geospatial types. Each of these geospatial primitive types is an instance of the cts:region base type.

XQuery JavaScript
cts:point cts.point
cts:circle cts.circle
cts:box cts.box
cts:polygon cts.polygon
cts:complex-polygon cts.complexPolygon
cts:linestring cts.linestring

For each primitive type, a constructor of the same name exists for constructing values of that type. The constructors accept either the raw data, such as a pair of float values for constructing a point, or a string representing the serialization of the underlying primitive type.

For example, the following constructs a point from the raw data:

XQuery: cts:point(38.7, -10.3)
JavaScript: cts.point(38.7, -10.3)

For details, see Constructing Geospatial Point and Region Values.

You use these primitive types in geospatial cts:query constructors such as cts:element-geospatial-query and cts:json-property-geospatial-query, or their JavaScript equivalents. A geospatial query matches if the regions contain matching data in the context of a search. For details, see Performing a GeoSpatial Query.

These types are also used in geospatial operations such as the XQuery functions geo:distance, geo:circle-intersects, and geo:polygon-contains, or the JavaScript functions geo.distance, geo.circleIntersects, and geo.polygonContains. For details, see Summary of Other Geospatial Operations.

Well-Known Text (WKT) and Well-Known Binary (WKB) Support

MarkLogic supports the well-known text (WKT) and well-known binary (WKB) representations of geospatial data. You can use the following WKT and WKB objects in MarkLogic: POINT, POLYGON, LINESTRING, TRIANGLE, MULTIPOINT, MULTILINESTRING, MULTIPOLYGON, and GEOMETRYCOLLECTION.

For example, the following code uses geo:parse-wkt (XQuery) or geo.parseWkt (JavaScript) to construct a polygon from its WKT representation:

Language Example
XQuery
geo:parse-wkt(
  'POLYGON((0 0, 0 10, 10 10, 10 0, 0 0),(0 5, 0 7, 5 7, 5 5, 0 5))'
)
JavaScript
geo.parseWkt(
  'POLYGON((0 0, 0 10, 10 10, 10 0, 0 0),(0 5, 0 7, 5 7, 5 5, 0 5))'
)

For details, see Converting To and From Common Geospatial Representations.

Understanding Geospatial Coordinates and Regions

This section describes the rules for geospatial coordinates and the various regions (cts:box, cts:circle, cts:complex-polygon, cts:linestring, cts:point, and cts:polygon), and includes the following topics:

Understanding the Basics of Coordinates and Points

To understand how geospatial regions are defined in MarkLogic Server, you should first understand the basics of coordinates and of points. This section describes the following:

Latitude and Longitude Coordinates in MarkLogic Server

Latitudes have north/south coordinates. They start at 0 degrees for the equator and head north to 90 degrees for the north pole and south to -90 degrees for the south pole. If you specify a coordinate that is greater than 90 degrees or less than -90 degrees, the value is truncated to either 90 or -90, respectively.

Longitudes have east/west coordinates. They start at 0 degrees at the Prime Meridian and head east around the Earth to 180 degrees, and head west around the earth (from the Prime Meridian) to -180 degrees. If you travel 360 degrees, it brings you back to the Prime Meridian. If you go west from the Prime Meridian, the numbers go negative. For example, New York City is west of the Prime Meridian, and its longitude is -73.99 degrees. Adding or subtracting any multiple of 360 to a longitude coordinate gives an equivalent coordinate.
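
The coordinate rules above can be sketched in plain JavaScript. This is an illustrative model only, not MarkLogic code: the helper names clampLatitude and normalizeLongitude are hypothetical, and the choice of the half-open range [-180, 180) for normalized longitudes is an assumption of the sketch.

```javascript
// Illustrative helpers only; these are not MarkLogic functions.

// Clamp a latitude to the valid range [-90, 90], as the text describes
// for values beyond the poles.
function clampLatitude(lat) {
  return Math.max(-90, Math.min(90, lat));
}

// Map any longitude to an equivalent value in [-180, 180).
// Adding or subtracting a multiple of 360 gives an equivalent coordinate.
function normalizeLongitude(lon) {
  return ((((lon + 180) % 360) + 360) % 360) - 180;
}
```

For example, normalizeLongitude(286.01) gives a value equivalent to the -73.99 degrees quoted for New York City, up to floating-point rounding.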

Points in MarkLogic Server

A point is simply a pair of latitude and longitude coordinates. Where the coordinates intersect is a place on the Earth. For example, the coordinates of San Francisco, California are the pair that includes the latitude of 37.655983 and the longitude of -122.425525. The cts:point XQuery type or cts.point JavaScript type represents a point in MarkLogic Server. Use the cts:point or cts.point constructor to construct a point from a set of coordinates. Additionally, points are used to define the other regions in MarkLogic Server, such as cts:box in XQuery or cts.polygon in JavaScript.

Understanding Geospatial Boxes

Geospatial boxes allow you to make a region defined by four coordinates. The four coordinates define a geospatial box which, when projected onto a flat plane, forms a rectangular box. A point is said to be in that geospatial box if it is inside the boundaries of the box. The four coordinates that define a box represent the southern, western, northern, and eastern boundaries of the box. The box is two-dimensional, and is created by taking a projection from the three-dimensional Earth onto a flat surface. On the surface of the Earth, the edges of the box are arcs, but when those arcs are projected into a plane, they become two-dimensional latitude and longitude lines, and the space defined by those lines forms a rectangle (represented by a cts:box in XQuery or a cts.box in JavaScript), as shown in the following figure.

The following are the assumptions and restrictions associated with geospatial boxes:

  • The four points on a box are south, west, north, and east, in that order.
  • Assuming a projection from the Earth onto a two-dimensional plane, boxes are determined by going from the south western limit to south eastern limit (even if it passes the date line), then north to the north eastern limit (border on the poles), then west to the north western limit, then back south to the south western limit where you started.
  • When determining the west/east boundary of the box, you always start at the western longitude and head east toward the eastern longitude. This means that if your western point is east of the date line, and your eastern point is west of the date line, then you will head east around the Earth until you get back to the eastern point.
  • Similarly, when determining the south/north sides of the box, you always start at the southern latitude and head north to the northern latitude. You cannot cross the pole, however, as it does not make sense to have the northern point south of the southern point. If you do cross a pole, a search that uses that box will throw an XDMP-BADBOX runtime error (because you cannot go north from the north pole). Note that the error will happen at search time, not at box creation time.
  • If the eastern coordinate is equal to the western coordinate, then only that longitude is considered. Similarly, if the northern coordinate is equal to the southern coordinate, only that latitude is considered. The consequences of these facts are the following:
    • If the western and eastern coordinates are the same, the box is a vertical line between the southern and northern coordinates passing through that longitude coordinate.
    • If the southern and northern coordinates are the same, the box is a horizontal line between the western and eastern coordinates passing through that latitude coordinate.
    • If the western and eastern coordinates are the same, and if the southern and northern coordinates are the same, then the box is a point specified by those coordinates.
  • The boundaries on the box are either in or out of the box, depending on query options (there are various boundary options on the geospatial cts:query constructors to control this behavior).
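
The box rules above can be modeled on plain numbers as follows. This is a minimal sketch, not MarkLogic's implementation: boxContains is a hypothetical helper, it assumes longitudes are already in the range -180 to 180, and it treats all boundaries as inside the box (in MarkLogic that depends on query options).

```javascript
// Hypothetical helper; models the box rules described above.
// box: { south, west, north, east } in degrees; point: { lat, lon }.
function boxContains(box, point) {
  if (box.south > box.north) {
    // The text says MarkLogic raises XDMP-BADBOX at search time in this
    // case; this sketch simply throws a generic RangeError.
    throw new RangeError('southern latitude is north of northern latitude');
  }
  const latOk = point.lat >= box.south && point.lat <= box.north;
  // Always travel east from the western edge. If west > east, the box
  // crosses the date line, so the longitude range wraps around.
  const lonOk = box.west <= box.east
    ? point.lon >= box.west && point.lon <= box.east
    : point.lon >= box.west || point.lon <= box.east;
  return latOk && lonOk;
}
```

With a box of { south: -10, west: 170, north: 10, east: -170 }, a point at longitude -175 is inside because the box runs east from 170 across the date line, while a point at longitude 0 is outside.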

Understanding Geospatial Polygons: Polygons, Complex Polygons, and Linestrings

Geospatial polygons allow you to make a region with n-sided boundaries for your geospatial queries. These boundaries can represent any area on Earth (with the exceptions described below). For example, you might create a polygon to represent a country or a geographical region. There are three ways to construct these types of geospatial regions in MarkLogic: polygons, complex polygons, and linestrings. This section describes some of the characteristics of polygons, and includes the following topics:

Overview of Polygons

Polygons offer a large degree of flexibility compared with circles or boxes. In exchange for the flexibility, geospatial polygons are not quite as fast and not quite as accurate as geospatial boxes. The efficiency of the polygons is proportional to the number of sides to the polygon. For example, a typical 10-sided polygon will likely perform faster than a typical 1000-sided polygon. The speed is dependent on many factors, including where the polygon is, the nature of your geospatial data, and so on.

The following are the assumptions and restrictions associated with geospatial polygons:

  • Assumes the Earth is a sphere, divided by great circle arcs running through the center of the earth, one great circle dividing the longitude (running through the Greenwich Meridian, sometimes called the Prime Meridian) and the other dividing the latitude (at the equator).
  • Each side of a polygon is a semi-spherical projection from the endpoints onto the spherical Earth surface. Therefore, the lines are not all in a single plane, but instead follow the curve of the Earth (approximated to be a sphere).
  • A polygon cannot include both poles. Therefore, it cannot have both poles as a boundary (regardless of whether the boundaries are included), which means it cannot encompass the full 180 degrees of latitude.
  • A polygon edge must be less than 180 degrees; that is, two adjacent points of a polygon must wrap around less than half of the earth's longitude or latitude. If you need a polygon to wrap around more than 180 degrees, you can still do it, but you must use more than two points. Therefore, adjacent vertices cannot be separated by more than 180 degrees of longitude. As a result, a polygon cannot include the pole, except along one of its edges. Also as a result, if two points that make up a polygon edge are greater than 180 degrees apart, MarkLogic Server will always choose the direction that is less than 180 degrees.
  • Geospatial queries are constrained to XML elements, XML attributes, and JSON properties named in the query constructors. To cross multiple formats in a single query, use cts:or-query in XQuery or cts.orQuery in JavaScript.
  • Some searches will throw a runtime exception if a polygon is not valid for the coordinate system. You specify the coordinate system at search time, not at polygon construction time.
  • The boundaries on the polygon are either in or out of the polygon, depending on query options. You can set a variety of boundary options to control this behavior when constructing geospatial queries.
  • Because of the spherical Earth assumption, and because points are represented by floats, results are not exact; polygons are not as accurate as the other methods because they use a sphere as a model of the Earth. While it may not be that intuitive, floats are used to represent points on the Earth because it turns out that there is no benefit in the accuracy if you use doubles (the Earth is just not that big).

Polygons

You can construct a polygon by specifying the points that make up the vertices of the polygon. All points that are bounded by the resulting region are defined to be contained within the region.

For details, see the cts:polygon XQuery function or the cts.polygon JavaScript function.

Complex Polygons

You can construct a complex polygon by specifying an outer polygon and zero or more inner polygons. The resulting complex polygon is the part within the outer polygon but not within the inner polygon(s). Use the cts:complex-polygon XQuery function or the cts.complexPolygon JavaScript function to construct a complex polygon.

You can also cast a cts:complex-polygon or cts.complexPolygon with no holes (that is, with no inner polygons) to a cts:polygon or cts.polygon. If you specify multiple inner polygons, none of them should overlap each other.
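
The "inside the outer polygon but not inside any inner polygon" rule can be illustrated with a small plain JavaScript sketch. It uses a standard ray-casting test on a flat plane, so it ignores the curvature of the Earth and is not how MarkLogic evaluates polygons. The helper names ringContains and complexPolygonContains are hypothetical, and points that fall exactly on an edge are not handled in any particular way.

```javascript
// Ray-casting point-in-polygon test on a flat plane.
// ring is an array of [x, y] vertices; the closing vertex may repeat the first.
function ringContains(ring, [x, y]) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Does a horizontal ray from the point cross the edge (j -> i)?
    const crosses = (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// A point is in the complex polygon if it is inside the outer ring
// and outside every inner ring (hole).
function complexPolygonContains(outer, holes, point) {
  return ringContains(outer, point) &&
    !holes.some((hole) => ringContains(hole, point));
}
```

Using the outer ring (0 0, 0 10, 10 10, 10 0, 0 0) and the inner ring (0 5, 0 7, 5 7, 5 5, 0 5), the point [2, 2] is contained, while [2, 6] falls in the hole and is not.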

Linestrings

A linestring is a sequence of connected arcs that do not necessarily form a closed loop the way a polygon does (although it is permissible for a linestring to form a closed loop). The 'lines' are actually arcs because they are projected onto the earth's surface. A linestring supports equality and inequality: two linestrings are equal if all of their vertices are equal (or if they are both empty). You can cast a linestring to a polygon, resulting in a 'flat' polygon that traces the same set of linestrings back to close the polygon.

To construct a linestring, use the cts:linestring XQuery function or the cts.linestring JavaScript function.

Understanding Geospatial Circles

Geospatial circles allow you to define a region with boundaries defined by a point with a radius specified in miles. The point and radius define a circle, and anything inside the circle is within the boundaries of the circle. The circle boundaries are either in or out, depending on query options. You can set a variety of boundary options to control this behavior when constructing geospatial queries.

To construct a circle, use the cts:circle XQuery function or the cts.circle JavaScript function.
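
As a rough illustration of what "inside a circle with a radius in miles" means, the following plain JavaScript sketch measures great-circle distance with the haversine formula on a spherical Earth. This is an assumption-laden model, not MarkLogic's calculation: the mean radius of 3958.8 miles, the helper names, and the inclusion of the boundary are all choices made for this sketch.

```javascript
// Mean Earth radius in miles; an assumption of this sketch.
const EARTH_RADIUS_MILES = 3958.8;

function toRadians(degrees) {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance between two { lat, lon } points, in miles,
// using the haversine formula on a sphere.
function haversineMiles(a, b) {
  const dLat = toRadians(b.lat - a.lat);
  const dLon = toRadians(b.lon - a.lon);
  const h = Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a.lat)) * Math.cos(toRadians(b.lat)) *
    Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_MILES * Math.asin(Math.sqrt(h));
}

// circle: { center: { lat, lon }, radiusMiles }. Boundary counts as inside here.
function circleContains(circle, point) {
  return haversineMiles(circle.center, point) <= circle.radiusMiles;
}
```

Under these assumptions, one degree of latitude is roughly 69 miles, so a 70-mile circle centered at latitude 0, longitude 0 contains the point at latitude 1, longitude 0, and a 69-mile circle does not.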

Geospatial Indexes

Because you can store geospatial data as XML or JSON within a document, you can query the content constraining on the geospatial XML or JSON markup. You can create geospatial indexes to speed up geospatial queries and to enable geospatial lexicon queries, allowing you to take full advantage of having the geospatial data in your content. This section describes the different kinds of geospatial indexes and includes the following parts:

Different Kinds of Geospatial Indexes

This section describes the types of geospatial indexes and the structure of geospatial data assumed by each index type. You use the same index interfaces for both JSON and XML, but the structure of the data is different, as described in this section.

Use the Admin Interface to create any of these indexes, under Database > database_name > Geospatial Indexes. To learn more about creating indexes, see Range Indexes and Lexicons in the Administrator's Guide.

This section covers the following topics:

Geospatial Element Indexes

With a geospatial element index, the geospatial data is represented by whitespace or punctuation (except +, -, or .) separated element content:

<element-name>37.52  -122.25</element-name>

For point format, the first entry represents the latitude coordinate, and the second entry represents the longitude coordinate. For long-lat-point format, the first entry represents the longitude coordinate and the second entry represents the latitude coordinate. You can also have other entries, but they are ignored (for example, KML has an additional altitude coordinate, which can be present but is ignored).
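
The following plain JavaScript sketch models how such element content could be read under the two formats. It is illustrative only and does not reflect MarkLogic's actual tokenizer: parsePointContent is a hypothetical helper, and the separator rule used here (any run of characters other than digits, letters, +, -, or .) is an approximation of "whitespace or punctuation".

```javascript
// Hypothetical parser modelling the described coordinate formats.
// format is 'point' (latitude first) or 'long-lat-point' (longitude first).
function parsePointContent(text, format = 'point') {
  // Split on runs of characters that cannot be part of a number token.
  const numbers = text
    .split(/[^0-9A-Za-z+\-.]+/)
    .filter((token) => token !== '')
    .map(Number);
  if (numbers.length < 2 || numbers.slice(0, 2).some(Number.isNaN)) {
    throw new Error('expected at least two numeric entries');
  }
  // Entries after the first two (such as an altitude) are ignored.
  const [first, second] = numbers;
  return format === 'long-lat-point'
    ? { lat: second, lon: first }
    : { lat: first, lon: second };
}
```

For example, parsePointContent('37.52  -122.25') and parsePointContent('-122.25,37.52,100', 'long-lat-point') both produce a latitude of 37.52 and a longitude of -122.25.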

For JSON data requirements, see Geospatial JSON Property Indexes.

Geospatial XML Element Child Indexes

With a geospatial element child index, the geospatial data comes from whitespace or punctuation (except +, -, or .) separated element content, but only for elements that are a specific child of a specific element.

<element-name1>
  <element-name2>37.52  -122.25</element-name2>
</element-name1>

For point format, the first entry represents the latitude coordinate, and the second entry represents the longitude coordinate. For long-lat-point format, the first entry represents the longitude coordinate and the second entry represents the latitude coordinate.

For JSON data requirements, see Geospatial JSON Property Child Indexes.

Geospatial XML Element Pair Indexes

With a geospatial element pair index, the geospatial data comes from a specific pair of elements that are a child of another specific element.

<element-name>
  <latitude>37.52</latitude>
  <longitude>-122.25</longitude>
</element-name>

For JSON data requirements, see Geospatial JSON Property Pair Indexes.

Geospatial XML Attribute Pair Indexes

With a geospatial attribute pair index, the geospatial data comes from a pair of specific attributes of a specific element.

<element-name latitude="37.52" longitude="-122.25"/>

Geospatial Path Range Indexes

With a geospatial path range index, the geospatial data is expressed in the same manner as a geospatial element index and the XML element, XML attribute, or JSON property to index is defined by a path expression.

The following table demonstrates the XPath expression to use when creating a path range index for several forms of example geospatial data.

Format: XML
Example data:
<a:data>
  <a:geo>37.52  -122.25</a:geo>
</a:data>
Indexing path expression: /a:data/a:geo

Format: XML
Example data:
<a:data>
  <a:geo data="37.52  -122.25"/>
</a:data>
Indexing path expression: /a:data/a:geo/@data

Format: JSON
Example data:
{ "geometry" : {
    "type": "Point",
    "coordinates": [37.52, -122.25]
  }
}
Indexing path expression: /geometry[type="Point"]/array-node("coordinates")

Once you have created a geospatial path range index using the Admin Interface, you cannot change the path expression. To change the path, you must remove the existing geospatial path range index and create a new one.

Geospatial JSON Property Indexes

Use a geospatial element index to index geospatial data in JSON documents when the point coordinates are contained in a single JSON property. The geospatial data must be represented in the property value as either whitespace/punctuation separated values in a string, or as an array of values. For example:

"prop-name": "37.52 -122.25"
"prop-name": [37.52, -122.25]

For point format, the first entry represents the latitude coordinate, and the second entry represents the longitude coordinate. For long-lat-point format, the first entry represents the longitude coordinate and the second entry represents the latitude coordinate. The value can include other entries, but they are ignored (for example, KML has an additional altitude coordinate, which can be present but is ignored).

Geospatial JSON Property Child Indexes

Use a geospatial element child index to index geospatial data in JSON when you want to limit the index to coordinate properties contained in a specific property. The geospatial data must be represented in the child property value as either whitespace/punctuation separated values in a string, or as an array of values.

For example, if your data looks like one of the following, you could create a geospatial element child index specifying "theParent" as the parent element (property) and "theChild" as the child element (property).

"theParent": {
  "theChild": "37.52 -122.25"
}
"theParent": {
  "theChild": [37.52, -122.25]
}

For point format, the first entry represents the latitude coordinate, and the second entry represents the longitude coordinate. For long-lat-point format, the first entry represents the longitude coordinate and the second entry represents the latitude coordinate.

Geospatial JSON Property Pair Indexes

Use a geospatial element pair index to index geospatial data in JSON when the point coordinates are contained in sibling JSON properties. For example, use this type of index when working with data similar to the following:

"theParent" : {
  "latitude": 37.52,
  "longitude": -122.25
}

Geospatial Index Positions

Each geospatial index has a range value positions option. Enabling range value positions speeds up queries that constrain a search by the distance between geospatial data and other search terms in the document, such as when using cts:near-query in XQuery or cts.nearQuery in JavaScript.

Additionally, enabling element positions improves index resolution (more accurate estimates) for XML element and JSON property queries that involve geospatial queries (with a geospatial index with positions enabled for the geospatial data).

Geospatial Lexicons

Geospatial indexes enable geospatial lexicon lookups. The lexicon lookups enable very fast retrieval of geospatial values. For details on geospatial lexicons, see Geospatial Lexicons.

Performing a GeoSpatial Query

This section provides an overview of the Geospatial API, and includes the following parts:

Basic Procedure for Performing a Geospatial Query

Using the geospatial API is just like using any cts:query constructors, where you use the cts:query as the second parameter (or a building block of the second parameter) of cts:search. The basic procedure involves the following steps:

  1. Load geospatial data into a database.
  2. Create geospatial indexes, if needed.
  3. Construct primitive types to use in geospatial cts:query constructors.
  4. Construct geospatial queries that use the geospatial primitive types.
  5. Use the geospatial queries with the cts:search XQuery function or the cts.search JavaScript function.

Creating appropriate geospatial indexes in Step 2 can improve both speed and accuracy. Indexes are required for certain kinds of queries, such as range queries. Indexes are optional for queries such as value queries, but only if you use filtered search, which slows your search down. For details, see Fast Pagination and Unfiltered Searches in the Query Performance and Tuning Guide.

You can also use the non-search geospatial operations, such as the cts:contains XQuery function or the cts.contains JavaScript function. You can use such geospatial functions whether or not the input geospatial data is in the database.

Geospatial Query Constructors

The following geospatial query constructors are available. You can use these constructors with each other and with other cts:query constructors to build up complex queries.

XQuery JavaScript
cts:element-attribute-pair-geospatial-query cts.elementAttributePairGeospatialQuery
cts:element-child-geospatial-query cts.elementChildGeospatialQuery
cts:element-geospatial-query cts.elementGeospatialQuery
cts:element-pair-geospatial-query cts.elementPairGeospatialQuery
cts:json-property-child-geospatial-query cts.jsonPropertyChildGeospatialQuery
cts:json-property-geospatial-query cts.jsonPropertyGeospatialQuery
cts:json-property-pair-geospatial-query cts.jsonPropertyPairGeospatialQuery
cts:path-geospatial-query cts.pathGeospatialQuery

For an example, see Example: Simple XQuery Geospatial Search.

You can also create geospatial queries from query text using the XQuery cts:parse function or the Server-Side JavaScript cts.parse function. For details, see Creating a Query From Search Text With cts:parse.

Geospatial Format Conversion Functions

MarkLogic provides XQuery and JavaScript library modules to translate Metacarta, GML, KML, GeoRSS, and GeoJSON formats to MarkLogic primitive geospatial types such as cts:box, cts:circle, cts:point, and cts:polygon.

The functions in these libraries are designed to take geospatial data in supported formats and construct cts:region primitive types to pass into the geospatial cts:query constructors and other geospatial operations.

For the signatures of these functions, see the Library Module section of the MarkLogic XQuery and XSLT Function Reference or the MarkLogic Server-Side JavaScript Function Reference.

Example: Simple XQuery Geospatial Search

This section provides an example showing a cts:search that uses a geospatial query.

Assume a document with the URI /geo/zip_labels.xml with the following form:

<labels>
  <label id="1">96044</label>
  ...
  <label id="589">95616</label>
  <label id="712">95616</label>
  <label id="715">95616</label>
  ...
</labels>

Assume you have polygon data in a document with the URI /geo/zip.xml with the following form:

<polygon id="712">
       0.383337584506173E+02,       -0.121659014798844E+03
       0.383133840000000E+02,       -0.121656011000000E+03
       0.383135090000000E+02,       -0.121666647000000E+03
       0.383135090000000E+02,       -0.121666647000000E+03
       0.383135120000000E+02,       -0.121666875000000E+03
       0.383349030000000E+02,       -0.121667035000000E+03
       0.383353510000000E+02,       -0.121657355000000E+03
       0.383496550000000E+02,       -0.121656811000000E+03
       0.383495590000000E+02,       -0.121646955000000E+03
       0.383494950000000E+02,       -0.121645323000000E+03
       0.383473190000000E+02,       -0.121645691000000E+03
       0.383370790000000E+02,       -0.121650187000000E+03
       0.383133840000000E+02,       -0.121656011000000E+03
</polygon>

You can then take the contents of the polygon element and cast it to a cts:polygon using the cts:polygon constructor. For example, the following returns a cts:polygon for the above data:

cts:polygon(fn:data(fn:doc("/geo/zip.xml")//polygon[@id eq "712"]))

Further assume you have content of the following form:

<feature id="1703188" class="School">
  <name>Ralph Waldo Emerson Junior High School</name>
  <state id="06">CA</state>
  <county id="113">Yolo</county>
  <lat dms="383306N">38.5515731</lat>
  <long dms="1214639W">-121.7774624</long>
  <elevation>17</elevation>
  <map>Merritt</map>
</feature>

Now consider the following XQuery:

let $searchterms := ("school", "junior")
let $zip := "95616"
let $ziplabel := fn:doc("/geo/zip_labels.xml")//label[contains(.,$zip)]
let $polygons := 
   for $p in fn:doc("/geo/zip.xml")//polygon[@id=$ziplabel/@id]
   return cts:polygon(fn:data($p))
let $query := 
   cts:and-query((
       for $term in $searchterms return cts:word-query($term),
       cts:element-pair-geospatial-query(xs:QName("feature"), 
             xs:QName("lat"), xs:QName("long"), $polygons) ))
return  (
<h2>{fn:concat("Places with the term '", 
               fn:string-join($searchterms, "' and the term '"), 
               "' in the zipcode ", fn:data($zip), ":")}</h2>,
  <ol>{for $feature in cts:search(//feature, $query)
  order by $feature/name
  return (
  <li><h3>{fn:data($feature/name)," "}   
  <code>({fn:data($feature/lat)},{fn:data($feature/long)})</code></h3>
  <p>{fn:data($feature/@class)} in {fn:data($feature/county)},   
  {fn:data($feature/state)} from map {fn:data($feature/map)}</p></li> )
  }</ol> )

This returns results similar to the following (shown rendered in a browser):

Converting To and From Common Geospatial Representations

MarkLogic provides interfaces for converting between MarkLogic geospatial primitive types and several common geospatial text, XML, and JSON representations. This section covers the following topics:

Conversion Overview

You can use MarkLogic APIs to convert to and from the following common geospatial representations:

  • Well-Known Text (WKT)
  • Well-Known Binary (WKB)
  • GML
  • KML
  • GeoJSON
  • GeoRSS

For example, the following XQuery code uses geogml:parse-gml to convert a GML region into a cts:region (a polygon in this case). This function determines the output cts:region type from the kind of input GML region.

import module namespace geogml ="http://marklogic.com/geospatial/gml"
  at "/MarkLogic/geospatial/gml.xqy";

geogml:parse-gml(
  <gml:Polygon srsName="ML:wgs84"
      xmlns:gml="http://www.opengis.net/gml/3.2">
    <gml:exterior>
      <gml:LinearRing>
        <gml:posList srsDimension="2">
          5.0 1.0 8.0 1.0 8.0 6.0 5.0 7.0 5.0 1.0
        </gml:posList>
      </gml:LinearRing>
    </gml:exterior>
  </gml:Polygon>
)

If you know the input region type, you can also use one of the region-specific constructors to perform the equivalent conversion. For example, the above code could use geogml:polygon instead of geogml:parse-gml. For details, see Constructing Geospatial Point and Region Values.

The following Server-Side JavaScript code converts a GeoJSON polygon into a cts.polygon:

// Create a cts.polygon from a GeoJSON polygon
var geojson = require('/MarkLogic/geospatial/geojson.xqy');

geojson.polygon(
  { type: 'Polygon', 
    coordinates: [
      [[100.0,0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]]
  ] }
)

For each format, the XQuery API includes a parse-format function for converting from the common representation to a MarkLogic geospatial primitive type, and the JavaScript API includes a parseFormat function for the same purpose. This operation is equivalent to calling the geo:parse XQuery function or the geo.parse JavaScript function with input of the same format. The API also includes a to-format XQuery function and toFormat JavaScript function for converting from a MarkLogic primitive type to the target format.

For example, the GeoJSON library module includes the following functions that can be used to convert data between GeoJSON and cts:region.

| XQuery | JavaScript |
|---|---|
| geojson:parse-geojson | geojson.parseGeojson |
| geojson:to-geojson | geojson.toGeojson |
| geojson:box | geojson.box |
| geojson:circle | geojson.circle |
| geojson:complex-polygon | geojson.complexPolygon |
| geojson:linestring | geojson.linestring |
| geojson:multi-linestring | geojson.multiLinestring |
| geojson:point | geojson.point |
| geojson:polygon | geojson.polygon |

You can use the built-in geo:parse XQuery function or geo.parse JavaScript function to convert nodes in any of the supported formats into an equivalent MarkLogic geospatial primitive type, without regard to the input format or region type. For best performance, if you know the format, use the equivalent format-specific functions.

WKT and WKB Conversions in XQuery

MarkLogic represents geospatial data using the cts:region type and types derived from it, such as cts:point, cts:polygon, and cts:circle. You can convert from WKT or WKB into cts:region items and from cts:region into WKT or WKB.

Use the geo:parse-wkt function to convert WKT data into a sequence of cts:region items. Similarly, use geo:parse-wkb to convert WKB data into a sequence of cts:region items. You can use the resulting items in geospatial cts:query constructors or geospatial operations.

For example, the following call converts a WKT polygon with an inner and outer boundary into a cts:complex-polygon:

geo:parse-wkt("
 POLYGON(
  (0 0, 0 10, 10 10, 10 0, 0 0),
  (0 5, 0 7, 5 7, 5 5, 0 5) )" )

The input to geo:parse-wkb is a binary node that contains a WKB byte sequence. For example, the following code converts a WKB byte sequence representing the coordinates (-73.700380647, 40.739754168) into a cts:point:

geo:parse-wkb(
  binary { "010100000072675909D36C52C0E151BB43B05E4440" }
)

To convert from cts:region to WKT, use geo:to-wkt. For example, the following code returns a WKT POINT:

geo:to-wkt(cts:point(1, 2))

Similarly, the following code returns a WKB POINT:

geo:to-wkb(cts:point(1, 2))

You cannot convert a cts:circle or a cts:box to WKT. For more details on WKT, see http://en.wikipedia.org/wiki/Well-known_text.

WKT and WKB Conversions in JavaScript

MarkLogic represents geospatial data using the cts.region type and types derived from it, such as cts.point, cts.polygon, and cts.circle. You can use the cts.region representation in cts.query constructors or geospatial operations. MarkLogic provides the following conversions between WKT or WKB and cts.region:

  • Explicit conversion from WKT or WKB to cts.region using the geo.parseWkt or geo.parseWkb function. For example:
    // Convert WKT polygon into a cts.complexPolygon
    geo.parseWkt(
      'POLYGON((0 0, 0 10, 10 10, 10 0, 0 0),(0 5, 0 7, 5 7, 5 5, 0 5))'
    )
    
    // Convert WKB byte sequence representing the coordinates
    // (-73.700380647, 40.739754168) into a cts.point
    geo.parseWkb(
      new NodeBuilder()
        .addBinary('010100000072675909D36C52C0E151BB43B05E4440')
        .toNode()
    )
  • Explicit conversion from cts.region to WKT or WKB using the geo.toWkt or geo.toWkb function. For example:
    // cts.point to WKT
    geo.toWkt(cts.point(1, 2))
    
    // cts.point to WKB
    geo.toWkb(cts.point(1, 2))
  • Implicit conversion from WKT to cts.region where the expected type is a cts.region. For example:
    // create a cts.polygon from WKT via implicit conversion
    cts.polygon('POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))')

Note that geo.parseWkt and geo.toWkt return a ValueIterator rather than an array or a single value. The input to geo.parseWkb is a binary node that contains a WKB byte sequence.
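To make the structure of the WKB byte sequence used above more concrete, the following is a minimal sketch in plain JavaScript that decodes a WKB POINT from its hex string. It does not call any MarkLogic API. The helper name decodeWkbPoint is hypothetical, and the sketch assumes the standard WKB layout: one byte-order byte, a four-byte geometry type, then two eight-byte doubles (X, then Y).

```javascript
// Hypothetical helper, for illustration only: decode a WKB POINT given as hex.
// Assumes the standard WKB layout; handles 2D points only.
function decodeWkbPoint(hex) {
  const bytes = new Uint8Array(hex.match(/../g).map((b) => parseInt(b, 16)));
  const view = new DataView(bytes.buffer);
  const littleEndian = view.getUint8(0) === 1;   // byte 0: byte-order flag
  const type = view.getUint32(1, littleEndian);  // bytes 1-4: geometry type
  if (type !== 1) {
    throw new Error('Not a WKB POINT (geometry type ' + type + ')');
  }
  return {
    x: view.getFloat64(5, littleEndian),   // bytes 5-12: X ordinate
    y: view.getFloat64(13, littleEndian),  // bytes 13-20: Y ordinate
  };
}

const decoded = decodeWkbPoint('010100000072675909D36C52C0E151BB43B05E4440');
console.log(decoded.x.toFixed(6), decoded.y.toFixed(6));
```

The two printed values should correspond to the coordinate pair quoted in the comment of the geo.parseWkb example. In an application, geo.parseWkb remains the way to obtain a cts.point; this sketch only shows what the bytes encode.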

Because of the supported conversions from WKT to cts.region, all of the following calls pass the same cts.polygon value to geo.polygonContains, and each call returns true:

// Use a cts.polygon created from a set of cts.point values
geo.polygonContains(
  cts.polygon([
    cts.point(30,10), cts.point(40,40), cts.point(20,40),
    cts.point(10,20), cts.point(30,10)]),
  cts.point(25,25))

// Use a cts.polygon created by explicitly converting from WKT
geo.polygonContains(
  geo.parseWkt('POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))'),
  cts.point(25,25))

// Use a cts.polygon created by implicitly converting from WKT
geo.polygonContains(
  cts.polygon('POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))'),
  cts.point(25,25))
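As a rough illustration of why the test point falls inside this polygon, the following plain JavaScript sketch parses the same WKT string and applies a planar even-odd (ray casting) test. This is not how MarkLogic evaluates containment: it ignores the coordinate system and treats the coordinates as flat X/Y values. The function names parseWktPolygon and ringContains are hypothetical.

```javascript
// Planar sketch only; this does not reproduce MarkLogic's geospatial
// evaluation. Both function names are hypothetical.

// Parse the rings of a WKT POLYGON into arrays of [x, y] pairs.
function parseWktPolygon(wkt) {
  const body = wkt.trim().match(/^POLYGON\s*\((.*)\)$/is);
  if (!body) {
    throw new Error('Not a WKT POLYGON: ' + wkt);
  }
  const rings = body[1].match(/\(([^()]*)\)/g) || [];
  return rings.map((ring) =>
    ring.slice(1, -1).trim().split(/\s*,\s*/)
      .map((pair) => pair.split(/\s+/).map(Number)));
}

// Even-odd (ray casting) test: is the point inside a single closed ring?
function ringContains(ring, [px, py]) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses = (yi > py) !== (yj > py) &&
      px < ((xj - xi) * (py - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

const [outer] = parseWktPolygon('POLYGON ((30 10, 40 40, 20 40, 10 20, 30 10))');
console.log(ringContains(outer, [25, 25])); // true
```

The test point (25, 25) is symmetric, so the question of whether a WKT pair is read as X/Y or latitude/longitude does not affect this particular result. For other points the coordinate order matters.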

You cannot convert a cts.circle or a cts.box to WKT. For more details on WKT, see http://en.wikipedia.org/wiki/Well-known_text.

Mapping of WKT and WKB Types to MarkLogic Types

The following table shows how the WKT and WKB types map to the MarkLogic geospatial types. Each row gives the value type that results from calling the geo:parse-wkt XQuery function or the geo.parseWkt JavaScript function, or the WKB equivalents, on input of that geometry type.

| WKT/WKB Geometry | MarkLogic XQuery Type | MarkLogic JavaScript Type |
|---|---|---|
| POINT | cts:point | cts.point |
| POINT EMPTY (WKT only) | cts:point (flagged as empty) | cts.point (flagged as empty) |
| POLYGON | cts:complex-polygon or cts:polygon | cts.complexPolygon or cts.polygon |
| POLYGON EMPTY | cts:complex-polygon (flagged as empty) | cts.complexPolygon (flagged as empty) |
| LINESTRING | cts:linestring | cts.linestring |
| LINESTRING EMPTY | cts:linestring (flagged as empty) | cts.linestring (flagged as empty) |
| TRIANGLE | cts:polygon | cts.polygon |
| TRIANGLE EMPTY | cts:complex-polygon (flagged as empty) | cts.complexPolygon (flagged as empty) |
| MULTIPOINT | cts:point* | zero or more cts.point nodes |
| MULTIPOINT EMPTY | () | an empty ValueIterator |
| MULTILINESTRING | cts:linestring* | zero or more cts.linestring |
| MULTILINESTRING EMPTY | () | null, empty array, or empty ValueIterator |
| MULTIPOLYGON | (cts:polygon or cts:complex-polygon)* | (cts.polygon or cts.complexPolygon)* |
| MULTIPOLYGON EMPTY | () | null, empty array, or empty ValueIterator |
| GEOMETRYCOLLECTION | cts:region* | zero or more cts.region nodes |
| GEOMETRYCOLLECTION EMPTY | () | null, empty array, or empty ValueIterator |
| others | throws XDMP-BADWKT | throws XDMP-BADWKT |

Constructing Geospatial Point and Region Values

Use the following APIs to construct geospatial regions. You can use the resulting region values in geospatial queries and other geospatial operations.

| XQuery | JavaScript |
|---|---|
| cts:box | cts.box |
| cts:circle | cts.circle |
| cts:complex-polygon | cts.complexPolygon |
| cts:linestring | cts.linestring |
| cts:point | cts.point |
| cts:polygon | cts.polygon |

These constructors accept either the raw data, such as a pair of float values for constructing a point, or a string representing the serialization of the underlying primitive type. The serialized representation can be either the MarkLogic internal representation, such as a serialized cts:point, or a WKT serialization. If the primitive is not constructible from the string input, an exception is thrown.

For example, the following call constructs a cts:polygon from a string of serialized cts:point values (space-separated points):

| XQuery | JavaScript |
|---|---|
| cts:polygon("38,-10 40,-10 39, -15") | cts.polygon('38,-10 40,-10 39, -15') |
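To show how the serialized string above breaks down into individual points, here is a minimal plain JavaScript sketch. It does not call any MarkLogic API. The function name parsePointList is hypothetical, and the sketch assumes each pair is written latitude first, then longitude, separated by a comma with optional whitespace.

```javascript
// Hypothetical helper, for illustration only: split a space-separated list
// of "latitude,longitude" pairs into objects. Assumes plain decimal numbers.
function parsePointList(serialized) {
  const pair = /(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)/g;
  return Array.from(serialized.matchAll(pair), (m) => ({
    latitude: Number(m[1]),
    longitude: Number(m[2]),
  }));
}

const vertices = parsePointList('38,-10 40,-10 39, -15');
console.log(vertices.length);  // 3
console.log(vertices[2]);      // { latitude: 39, longitude: -15 }
```

The whitespace after the comma in the last pair is tolerated by the pattern, which is why the string yields three vertices rather than two.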

You can also construct the primitive types from XML or JSON nodes that contain geospatial data in the supported formats. For example, the following XQuery code uses the geokml:box function to construct a cts:box from an XML element containing a KML LatLongBox.

xquery version "1.0-ml";
import module namespace geokml = "http://marklogic.com/geospatial/kml"
  at "/MarkLogic/geospatial/kml.xqy";

geokml:box(
  <LatLongBox xmlns="http://www.opengis.net/kml/2.2">
    <north>30</north>
    <south>12.5</south>
    <east>-122.24</east>
    <west>-127.24</west>
  </LatLongBox>)

Similarly, the following example uses the geojson.box JavaScript function to construct a cts.box from a JSON node that contains a suitable GeoJSON polygon:

var geojson = require('/MarkLogic/geospatial/geojson.xqy');

geojson.box(
  { type: 'Feature',
    bbox: [-180.0, -90.0, 180.0, 90.0],
    geometry: {
      type: 'Polygon',
      coordinates: [[
        [-180.0, 10.0], [20.0, 90.0], [180.0, -5.0], [-30.0, -90.0]
      ]]
  }}
)
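When reading an example like this one, it helps to keep the ordering of the bounds straight. A GeoJSON bbox for two-dimensional data lists west, south, east, north, while a box in MarkLogic is described by south, west, north, and east bounds. The following plain JavaScript sketch shows only that reordering. The function name bboxToBounds is hypothetical, and the sketch makes no claim about how geojson.box derives its result internally.

```javascript
// Hypothetical helper, for illustration only: reorder a 2D GeoJSON bbox
// ([west, south, east, north]) into named south/west/north/east bounds.
function bboxToBounds(bbox) {
  if (!Array.isArray(bbox) || bbox.length !== 4) {
    throw new Error('Expected a 2D bbox of four numbers');
  }
  const [west, south, east, north] = bbox;
  return { south, west, north, east };
}

const bounds = bboxToBounds([-180.0, -90.0, 180.0, 90.0]);
console.log(bounds); // { south: -90, west: -180, north: 90, east: 180 }
```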

For details and examples on these functions, see the MarkLogic XQuery and XSLT Function Reference or the MarkLogic Server-Side JavaScript Function Reference. These functions are complementary to the type constructors with the same names, which are described in XQuery Primitive Types And Constructors for Geospatial Queries.

Geospatial Query Support in Other APIs

The Search API enables geospatial queries through the following features:

For information on specific geospatial query options, see Appendix: Query Options Reference.

The Client APIs for REST, Java and Node.js applications provide similar support:
