xdmp:http-get( $uri as xs:string, [$options as (element()|map:map)?] ) as item()+
Sends an HTTP GET request to the specified URI. Returns the HTTP response header as well as whatever content is identified by the specified URI (for example, an HTML document).
Parameters | |
---|---|
uri | The URI of the requested document. |
options | Options with which to customize this operation. You can specify options as either an XML element in the "xdmp:http" namespace, or as a map:map. The option names below are XML element localnames. When using a map, replace the hyphens with camel casing. For example, "an-option" becomes "anOption" when used as a map:map key. This function supports the following options, plus certain options from the xdmp:document-load and xdmp:document-get functions. For example, you can use the repair and encoding options from these functions. When including an option from another function in an XML options node, use the namespace appropriate to that function in the option element. |
Required privilege: http://marklogic.com/xdmp/privileges/xdmp-http-get
The http functions only operate on URIs that use the http or https schemes; specifying a URI that does not begin with http:// or https:// throws an exception.
If an http function times out, it throws a socket received exception (SVC-SOCRECV).
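A timeout can be handled with a try/catch expression. The following is a minimal sketch, assuming the 1.0-ml try/catch syntax and that the caught error element carries the code in its error:code child; the URL is a placeholder and the timed-out element is a hypothetical fallback value, not part of the API.

```xquery
xquery version "1.0-ml";
declare namespace error = "http://marklogic.com/xdmp/error";

(: http://www.example.com/slow-resource is a placeholder URL :)
try {
  xdmp:http-get("http://www.example.com/slow-resource")
} catch ($e) {
  (: handle only the timeout case; rethrow every other error unchanged :)
  if ($e/error:code eq "SVC-SOCRECV")
  then <timed-out>{$e/error:message/text()}</timed-out> (: hypothetical fallback :)
  else xdmp:rethrow()
}
```

Checking the code before handling the error keeps unrelated failures, such as an invalid URI scheme, visible to the caller.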
An automatic encoding detector will be used if the value auto is specified for the encoding option (in the xdmp:document-get namespace). If no option is specified, the encoding defaults to the encoding specified in the http header. If there is no encoding in the http header, the encoding defaults to UTF-8.
The first node in the output of this function is the response header from the http server.
The second node in the output of this function is the response from the http server. The response is treated as text, XML, JSON or binary, depending on the content-type header sent from the http server. If the node is html, the header should indicate text/html, which is returned as a text document by default. The type of document is determined by the mimetypes mappings, and you can change the mappings in the Admin Interface as needed.
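The two-node result described above can be taken apart as in the following sketch. It assumes the response header is an element in the "xdmp:http" namespace with code and message children, which this section does not spell out; the URL is a placeholder and HTTP-ERROR is a hypothetical error name.

```xquery
xquery version "1.0-ml";
declare namespace http = "xdmp:http";

(: http://www.example.com/document.xml is a placeholder URL :)
let $response := xdmp:http-get("http://www.example.com/document.xml")
let $header := $response[1]  (: response header from the http server :)
let $body := $response[2]    (: content identified by the URI :)
return
  if (xs:integer($header/http:code) eq 200)
  then $body
  else
    (: HTTP-ERROR is a hypothetical error name chosen for this sketch :)
    fn:error(xs:QName("HTTP-ERROR"),
      fn:concat($header/http:code, " ", $header/http:message))
```

Inspecting the header first avoids treating an error page as if it were the requested document.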
If you happen to know that the response is XML, even if the header does not specify it as XML, and want to process the response as XML, you can wrap the response in an xdmp:unquote call to parse the response as XML. You could also use the <format>xml</format> option (in the xdmp:document-get namespace) to tell the API to treat the document as XML. Also, if you know the response is an HTML document, you can wrap the response in an xdmp:tidy call, which will treat the text as HTML, clean it up, and return an XHTML XML document.
The options argument can also be passed as a map:map. Each map entry represents one option, and the option names follow the same convention used when calling the function from JavaScript.
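The map form of the options might look like the following sketch. The timeout and verify-cert options are assumptions here, since this section does not list the individual options; the URL is a placeholder. The hyphenated name verify-cert becomes the camel-cased key verifyCert.

```xquery
xquery version "1.0-ml";

(: "timeout" and "verify-cert" are assumed option names, not listed in this section;
   as a map key, the hyphenated name verify-cert becomes verifyCert :)
let $options :=
  map:new((
    map:entry("timeout", 10),
    map:entry("verifyCert", fn:true())
  ))
(: https://www.example.com/document.xml is a placeholder URL :)
return xdmp:http-get("https://www.example.com/document.xml", $options)
```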
To use this function with a proxy, you need to translate the URI to the proxy uri. For example:
    declare function local:http-get-proxy($proxy, $uri)
    {
      let $host := fn:tokenize($uri, '/')[3]
      (: you might need to modify the next line based on your proxy server config :)
      let $proxyuri := fn:concat($proxy, $uri)
      return
        xdmp:http-get($proxyuri,
          <options xmlns="xdmp:http">
            <headers>
              <Host>{$host}</Host>
            </headers>
          </options>)
    };

    local:http-get-proxy('http://some.proxy.com:8080', 'http://www.google.com')
If you use the credential-id option, you can use xdmp:credential-id to obtain the id of a previously stored credential. For example:
    xdmp:http-get($someuri,
      <options xmlns="xdmp:http">
        <credential-id>{xdmp:credential-id("my-credential-name")}</credential-id>
      </options>)
    xdmp:http-get("http://www.my.com/document.xhtml",
      <options xmlns="xdmp:http">
        <authentication method="basic">
          <username>myname</username>
          <password>mypassword</password>
        </authentication>
      </options>)

=> the response from the server as well as the specified document
    xdmp:http-get("https://s3.amazonaws.com/marklogic-lambda-us-east-1/",
      <options xmlns="xdmp:http">
        <authentication method="aws4">
          <username>myname</username>
          <password>mypassword</password>
        </authentication>
        <headers>
          <x-amz-content-sha256>{xdmp:sha256("")}</x-amz-content-sha256>
        </headers>
      </options>)

=> the response from the server as well as the specified document
    xdmp:http-get("http://www.my.com/iso8859document.html",
      <options xmlns="xdmp:document-get">
        <encoding>iso-8859-1</encoding>
      </options>)[2]

=> The specified document, transcoded from ISO-8859-1 to UTF-8 encoding. This assumes the document is encoded in ISO-8859-1. Note that the encoding option is in the "xdmp:document-get" namespace.
    xdmp:unquote(
      xdmp:http-get("http://www.my.com/somexml.xml")[2])

=> The specified xml document, parsed as XML by xdmp:unquote. If the header specifies a mimetype that is configured to be treated as XML, the xdmp:unquote call is not needed. Alternatively, you can treat the response as XML by specifying XML in the options node as follows (note that the format option is in the "xdmp:document-get" namespace):

    xdmp:http-get("http://www.my.com/somexml.xml",
      <options xmlns="xdmp:http">
        <format xmlns="xdmp:document-get">xml</format>
      </options>)[2]
    xdmp:tidy(
      xdmp:http-get("http://www.my.com/somehtml.html")[2])[2]

=> The specified html document, cleaned and transformed to xhtml by xdmp:tidy. The second node of the tidy output is the xhtml node (the first node is the status). You could then perform XPath on the output to return portions of the document. Note that the document (and all of its elements) will be in the XHTML namespace, so you need to specify the namespace in the XPath steps. For example:

    xquery version "1.0-ml";
    declare namespace xh="http://www.w3.org/1999/xhtml";

    xdmp:tidy(
      xdmp:http-get("http://www.my.com/somehtml.html")[2])[2]//xh:title