XQuery/Pachube feed

From Wikibooks, open books for an open world
Jump to: navigation, search

Motivation[edit]

You want to create a feed for the Pachube application. A Pachube application allows you to store, share & discover realtime sensor, energy and environment data from objects, devices & buildings around the world. This provides a platform for sensor data integration. History gathered by Pachube can be presented in various formats and used by other applications to mashup feeds.

Modules and concepts[edit]

  • eXist httpclient to GET, POST and PUT
  • eXist scheduler for job scheduling
  • eXist update extension
  • server-side XSLT

Tower Bridge[edit]

The idea of a feed of the open/closed status of Tower Bridge in London was borrowed from @ni.

A Twitter stream provides the base data for a simple status feed. The RSS feed from this stream is read by an XQuery script, the status deduced from the text and an XML file representing the current status updated.

This XML file has an attached XSLT stylesheet so that when the file is pulled on schedule from the eXist database, it is first transformed on the server-side into the EEML format required for Pachube feeds. As configured on the UWE server, this uses Saxon XSLT-2.

XQuery script[edit]

let $rss := httpclient:get(xs:anyURI(
    "http://twitter.com/statuses/user_timeline/14012942.rss"
   ),false(),())/httpclient:body
let $lastChange:= $rss//item[1]
let $bridgeStatus :=  doc("/db/Wiki/Pachube/bridge.xml")/data/status
return 
  if (exists($lastChange) and exists($bridgeStatus))
  then
     let $open := if(contains($lastChange/description,"opening"))  then "1" else "0"
     return update replace $bridgeStatus with  
                element status {
                     attribute bridge {$open},
                     attribute lastChange {$lastChange/pubDate},
                     attribute lastUpdate {current-dateTime()}
                }
     else ()

1. httpclient is used here because doc() throws an error about duplicate namespace declarations - under investigation

Bridge status[edit]

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="http://www.cems.uwe.ac.uk/xmlwiki/Pachube/bridge.xsl"?>
<data>
    <status bridge="0" lastChange="Mon, 14 Dec 2009 12:09:02 +0000" lastUpdate="2009-12-14T16:57:00.679Z"/>
</data>

XSLT[edit]

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output media-type="application/xml" method="xml" indent="yes"/>
    <xsl:template match="/data">
        <eeml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.eeml.org/xsd/005" xsi:schemaLocation="http://www.eeml.org/xsd/005 http://www.eeml.org/xsd/005/005.xsd" version="5">
            <environment updated="{current-dateTime()}">
                <title>Tower Bridge </title>
                <feed>http://www.cems.uwe.ac.uk/xmlwiki/Pachube/bridge.xml</feed>
                <description>The status of the lifting Tower Bridge: 1 is open , 0 is closed. </description>
                <email>kit.wallace@gmail.com</email>
                <location exposure="outdoor" domain="physical" disposition="fixed">
                    <name>Tower Bridge</name>
                    <lat>51.5064186</lat>
                    <lon>-0.074865818</lon>
                </location>
                <data id="0">
                    <tag>bridge open</tag>
                    <value minValue="0" maxValue="1">
                        <xsl:value-of select="status/@bridge"/>
                    </value>
                </data>
            </environment>
        </eeml>
    </xsl:template>
</xsl:stylesheet>

Job scheduling[edit]

The XQuery update script is invoked by the eXist job scheduler every 1 minute:

let $login := xmldb:login( "/db", "user", "password" ) 
let $del := scheduler:delete-scheduled-job("BRIDGE")
let $job := scheduler:schedule-xquery-cron-job("/db/Wiki/Pachube/pollbridgerss.xq" , "0 0/1 * * * ?","BRIDGE")
return $job

Feed view[edit]

There is a public view of the Feed as processed by Pachube : http://www.pachube.com/feeds/3922

Discussion[edit]

The Pachube interface refreshes the automatic feeds every 15 minutes (for the free service). Since typical bridge lifts last 10 minutes, there is a likelihood that a lift will be missed. The alternative is to push changes to Pachube when detected.

Weather Push Feed[edit]

Many amateur weather stations use Weather Display software. This software writes current observations to a space-delimited text file to support interfaces to viewing software, such as the Flash Weather Display Live. The text files are generally web accessible, so that any client has access to this raw data, although it is polite to ask for access.

One such station is run by Martyn Hicks at http://www.martynhicks.co.uk/weather/data.php located in Horfield, Bristol. The raw data file is http://www.martynhicks.co.uk/weather/clientraw.txt

In this Push implementation, a manual feed is defined in Pachube via the API by POSTing a full EEML document. An XML descriptor file defines the mapping between values in the data file and data streams in the feed. A scheduled XQuery script reads the data file and transforms it via the mapping file to EEML format prior to PUTing to the Pachube API.


XQuery feed creation[edit]

The feed is defined in an EEML document which is POSTed to the Pachube API. A return code of 201 indicates that the feed has been created.

let $url := xs:anyURI("http://www.pachube.com/api/feeds/")
let $headers := 
        <headers>
            <header name="X-PachubeApiKey" value="...api key ...."/>
        </headers>
let $feed :=
          <eeml  xmlns="http://www.eeml.org/xsd/005">
            <environment updated="{current-dateTime()}">
                <title>Horfield Weather</title>
                <description>The weather observed by a weather station run by Martyn Hicks.  Public interface is http://www.martynhicks.co.uk/weather/data.php   </description>
                <email>kit.wallace@gmail.com</email>
                <location exposure="outdoor" domain="physical" disposition="fixed">
                    <name>Horfield</name>
                    <lat>51.4900</lat>
                    <lon>-2.5805</lon>
                </location>
            </environment>
        </eeml>
return
     httpclient:post($url,$feed,false(),$headers)

Mapping File[edit]

An XML document defines the origin of the raw data, the Pachube appid and the mapping from data values (1-based) to data streams (numbered from 1 in document order).

<weatherfeed xmlns = "http://www.cems.uwe.ac.uk/xmlwiki/wdl">
    <data>http://www.martynhicks.co.uk/weather/clientraw.txt</data>
    <appid>4013</appid>
    <format>
        <field n="2" unit="kts">Average Wind Speed</field>
        <field n="4" unit="degrees">Wind Direction</field>
        <field n="5" unit="Celcius">Temperature</field>
        <field n="7" unit="hPa">Barometer</field>
    </format>
</weatherfeed>

Update script[edit]

This script is scheduled to run every minute (as above).

The mapping file namespace needs to be declared:

declare namespace wdl = "http://www.cems.uwe.ac.uk/xmlwiki/wdl";

First a function to read the raw data file and tokenize to a sequence of values:

declare function local:client-data ($rawuri) {
 let $headers :=
    element headers{
       element header {
         attribute name {"Cache-Control"},
         attribute value {"no-cache"}
      }
    }
 let $raw := httpclient:get(xs:anyURI($rawuri),false(),$headers )/httpclient:body
 return tokenize($raw,"\+")
};

Then a function to transform from the sequence of values to the Pachube data chanels:

declare function local:data-to-eeml ($data,$format) {
         for $field at $id in $format/wdl:field
         let $name := string($field)
         let $index := xs:integer($field/@n)
         return 
            element data {
                attribute id {$id},
                element tag { string($field)},
                element value {$data[$index] },
                element unit {string($field/@unit)}
            }
};

The main line fetches the feed definition file (here hard-coded but it could be passed in as a parameter). The data values are obtained, the EEML generated and PUT to the Pachube API.

let $feed := doc("/db/Wiki/Pachube/horfieldweather.xml")/wdl:weatherfeed
let $data := local:client-data($feed/wdl:data)
let $appid := $feed/wdl:appid
let $APIKey := "eeda7c27ff8b7c49e8529e4eb4b3f57724c5b609db0d22904df11edd4742e92c"
let $url := xs:anyURI(concat( "http://www.pachube.com/api/",$appid))
let $headers := 
        <headers>
            <header name="X-PachubeApiKey" value="{$APIKey}"/>
        </headers>
let $eeml:=
     <eeml xmlns="http://www.eeml.org/xsd/005">
          <environment updated="{current-dateTime()}">          
           {local:data-to-eeml($data,$feed/wdl:format)}
          </environment>
   </eeml>
return
       httpclient:put($url,$eeml,false(),$headers)

Feed view[edit]

There is a public view of the Pachube feed at http://www.pachube.com/feeds/4013


Weather Pull Feed[edit]

The alternative approach is simpler and relies on Pachube to pull data on their schedule.

In this example, weather station data consolidated by WeatherUnderground and republished as XML is transformed to EEML.

Weatherundgerground Feed[edit]

A typical XML feed for a station in weatherunderground is http://api.wunderground.com/weatherstation/WXCurrentObXML.asp?ID=IBAYOFPL1

XSLT transform[edit]

This XML can be transformed to EEML using XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output media-type="application/xml" method="xml" indent="yes"/>
    <xsl:template match="/current_observation">
        <eeml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns="http://www.eeml.org/xsd/005"
            xsi:schemaLocation="http://www.eeml.org/xsd/005 http://www.eeml.org/xsd/005/005.xsd"
            version="5">
            <environment updated="{current-dateTime()}">
                <title>Weather Report</title>
                <location exposure="outdoor" domain="physical" disposition="fixed">
                    <name>
                        <xsl:value-of select="location/full"/>
                    </name>
                    <lat>
                        <xsl:value-of select="location/latitude"/>
                    </lat>
                    <lon><xsl:value-of select="location/longitude"/>
                    </lon>
                </location>
                <data id="1">
                    <tag>Average Wind speed</tag>
                    <value>
                        <xsl:value-of select="round-half-to-even(wind_mph * 1.15077945,1)"/>
                    </value>
                    <unit>kts</unit>
                </data>
                <data id="2">
                    <tag>Wind Direction</tag>
                    <value>
                        <xsl:value-of select="wind_degrees"/>
                    </value>
                    <unit>degrees</unit>
                </data>
                <data id="3">
                    <tag>Temperature</tag>
                    <value>
                        <xsl:value-of select="temp_c"/>
                    </value>
                    <unit>Celcius</unit>
                </data>
                <data id="4">
                    <tag>Barometric Pressure</tag>
                    <value>
                        <xsl:value-of select="pressure_mb"/>
                    </value>
                    <unit>hPA</unit>
                </data>
            </environment>
        </eeml>
    </xsl:template>
</xsl:stylesheet>

Connecting XML to XSLT[edit]

A simple XQuery script accepts one parameter, the station id, fetches the XML feed and transforms using XSLT to EEML:

let $id := request:get-parameter("id",())
let $ss := doc("/db/Wiki/Pachube/weatherunderground.xsl")
let $data := doc(concat("http://api.wunderground.com/weatherstation/WXCurrentObXML.asp?ID=",$id))
return
  transform:transform($data,$ss,())

The script can be invoked: http://www.cems.uwe.ac.uk/xmlwiki/Pachube/weatherunderground.xq?id=IBAYOFPL1.

Since this script is parameterised, it could be used with any weatherUnderground station.

Pachube Feed[edit]

An automatic feed can be created - http://www.pachube.com/feeds/4037 which uses this feed.


NOAA Feed[edit]

We can adopt a similar approach with the feeds for US ICAO stations. NOAA provide XML feeds such as http://www.weather.gov/xml/current_obs/KEWR.xml . The format is nearly the same as the weatherunderground feed and is documented: http://www.weather.gov/view/current_observation.xsd. Update rate is hourly but there is no way currently to configure Pachube to update at that frequency.

XSLT[edit]

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output media-type="application/xml" method="xml" indent="yes"/>
    <xsl:template match="/current_observation">
        <eeml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns="http://www.eeml.org/xsd/005"
            xsi:schemaLocation="http://www.eeml.org/xsd/005 http://www.eeml.org/xsd/005/005.xsd"
            version="5">
            <environment updated="{current-dateTime()}">
                <title>NOAA Weather Report</title>
                <location exposure="outdoor" domain="physical" disposition="fixed">
                    <name>
                        <xsl:value-of select="location"/>
                    </name>
                    <lat>
                        <xsl:value-of select="latitude"/>
                    </lat>
                    <lon><xsl:value-of select="longitude"/>
                    </lon>
                </location>
                <data id="1">
                    <tag>Average Wind speed</tag>
                    <value>
                       <xsl:value-of select="wind_kt"/>
 
                    </value>
                    <unit>kts</unit>
                </data>
                <data id="2">
                    <tag>Wind Direction</tag>
                    <value>
                        <xsl:value-of select="wind_degrees"/>
                    </value>
                    <unit>degrees</unit>
                </data>
                <data id="3">
                    <tag>Temperature</tag>
                    <value>
                        <xsl:value-of select="temp_c"/>
                    </value>
                    <unit>Celcius</unit>
                </data>
                <data id="4">
                    <tag>Barometric Pressure</tag>
                    <value>
                        <xsl:value-of select="pressure_mb"/>
                    </value>
                    <unit>hPA</unit>
                </data>
            </environment>
        </eeml>
    </xsl:template>
</xsl:stylesheet>

XQuery Script[edit]

let $id := request:get-parameter("id",())
let $ss := doc("/db/Wiki/Pachube/NOAA.xsl")
let $data := doc(concat("http://www.weather.gov/xml/current_obs/",$id,".xml"))
return
  transform:transform($data,$ss,())

Feed[edit]

The transformed XML http://www.cems.uwe.ac.uk/xmlwiki/Pachube/NOAA.xq?id=KEWR is the basis for the manual feed http://www.pachube.com/feeds/4047

XSLT only[edit]

If Pachube supported XSLT on the server side, the whole task could be handled by a single XSLT script. For the sake of generalisation, its helpful to provide an interface which allows parameters to be passed to the script but it is not necessary:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output media-type="application/xml" method="xml" indent="yes"/>
    <xsl:param name="station" select="'KEWR'"/>
    <xsl:template match="/">
        <xsl:variable name="url" select='concat("http://www.weather.gov/xml/current_obs/",$station,".xml")'></xsl:variable>
         <xsl:apply-templates select="doc($url)/current_observation"/>
     </xsl:template>
    <xsl:template match="current_observation">
         <eeml xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
            xmlns="http://www.eeml.org/xsd/005"
            xsi:schemaLocation="http://www.eeml.org/xsd/005 http://www.eeml.org/xsd/005/005.xsd"
            version="5">
            <environment updated="{current-dateTime()}">
                <title>NOAA Current weather for {station_id}</title>
                <location exposure="outdoor" domain="physical" disposition="fixed">
                    <name>
                        <xsl:value-of select="location"/>
                    </name>
                    <lat>
                        <xsl:value-of select="latitude"/>
                    </lat>
                    <lon>
                        <xsl:value-of select="longitude"/>
                    </lon>
                </location>
                <data id="1">
                    <tag>Average Wind speed</tag>
                    <value>
                        <xsl:value-of select="wind_kt"/>
                    </value>
                    <unit>kts</unit>
                </data>
                <data id="2">
                    <tag>Wind Direction</tag>
                    <value>
                        <xsl:value-of select="wind_degrees"/>
                    </value>
                    <unit>degrees</unit>
                </data>
                <data id="3">
                    <tag>Temperature</tag>
                    <value>
                        <xsl:value-of select="temp_c"/>
                    </value>
                    <unit>Celcius</unit>
                </data>
                <data id="4">
                    <tag>Barometric Pressure</tag>
                    <value>
                        <xsl:value-of select="pressure_mb"/>
                    </value>
                    <unit>hPA</unit>
                </data>
            </environment>
        </eeml>        
    </xsl:template>
</xsl:stylesheet>

The server can just run this standalone to generate the EEML feed. This small XQuery script uses the SAXON processor on the eXist platform:

  transform:transform((),doc("/db/Wiki/Pachube/NOAA3.xsl"),())

XSLT Execute (currently fails - under investigation)

XSLT Feed service[edit]

Automatic Feed[edit]

The XSLT conversion could be provided as a service by an eXist db. It would need:

  • an XML database of connection definitions e.g.
<PachubeFeeds>
    <PachubeFeed id="2001">
        <data>http://www.weather.gov/xml/current_obs/KORS.xml</data>
        <xslt>http://www.cems.uwe.ac.uk/xmlwiki/Pachube/NOAA7.xsl</xslt>
        <params/>
    </PachubeFeed>
    <PachubeFeed id="2002">
        <data>http://www.weather.gov/xml/current_obs/KEWR.xml</data>
        <xslt>http://www.cems.uwe.ac.uk/xmlwiki/Pachube/NOAA7.xsl</xslt>
        <params/>
    </PachubeFeed>
</PachubeFeeds>
  • a script to locate and execute a feed:
let $id := request:get-parameter("id",())
let $feed := doc("/db/Wiki/Pachube/feeds.xml")//PachubeFeed[@id=$id]
return 
 transform:transform(
     doc($feed/data),
     doc($feed/xslt),
     $feed/params
     )

  • Pachube automatic feeds can now be created with a URL like
  http://www.cems.uwe.ac.uk/xmlwiki/Pachube/getFeed.xq?id=2001   

e.g. http://www.pachube.com/feeds/4661

  • User interface and database to allow users to register, create and edit feeds

There are issues here with loading and with unsafe code in the stored XSLT.

Output[edit]

Similarly output processing of either the current EEML or a specific datastream's csv history could be provided with a bit of code and XSLT. Since this may require authentication, API keys would have to be stored on this database too. Jobs could be generated and scheduled to implement triggers but this will need a timed pull of the required data.

Code is needed to convert the history feeds provided by Pachube to XML since these are only available in CSV. Once in XML, XSLT can transform to the format required. Of course it would be preferable if Pachube provided XML feeds in addition to the CSV feeds.

Archive[edit]

The full archive is provided as a csv file. We can convert that to XML with the following script:

import module namespace csv = "http://www.cems.uwe.ac.uk/xmlwiki/csv" at "../lib/csv.xqm";

let $feed := request:get-parameter("feed","")
let $stream := request:get-parameter("stream","")
let $archiveurl := concat("http://www.pachube.com/feeds/",$feed,"/datastreams/",$stream,"/archive.csv")
let $data:= csv:get-data($archiveurl)
let $rows :=  tokenize($data,$csv:newline)
let $now := current-dateTime()
return
<history feed="{$feed}" stream="{$stream}" dateTime="{$now}"  count="{count($rows)}">
{for $row in $rows
let $point := tokenize($row,",")
return
  <value dateTime="{$point[1]}">{$point[2]}</value>
}
</history>

http://www.cems.uwe.ac.uk/xmlwiki/Pachube/getArchive.xq?feed=4037&stream=2

24 Hour History[edit]

In the csv stream, these are untimed. The time has to be estimated and calculated using xs:dateTimeDuration:

import module namespace csv = "http://www.cems.uwe.ac.uk/xmlwiki/csv" at "../lib/csv.xqm";

let $feed := request:get-parameter("feed","")
let $stream := request:get-parameter("stream","")
let $historyurl := concat("http://www.pachube.com/feeds/",$feed,"/datastreams/",$stream,"/history.csv")
let $data:= csv:get-data($historyurl)
let $values :=tokenize($data,",")
let $now := current-dateTime()
let $then := $now - xs:dayTimeDuration("P1D")
return
<history feed="{$feed}" stream="{$stream}" dateTime="{$now}"  count="{count($values)}">
{for $value at $i in $values
let $dt := $then + xs:dayTimeDuration(concat("PT",15*$i,"M"))
return
  <value dateTime="{$dt}">{$value}</value>
}
</history>

http://www.cems.uwe.ac.uk/xmlwiki/Pachube/getHistory.xq?feed=4037&stream=2