XQuery/BBC Weather Forecast

From Wikibooks, open books for an open world

BBC Weather forecasts

Some weather data is available from the BBC as RSS feeds. Currently this includes the current conditions and the 3-day forecast. Because there is no standard set of tags for weather properties, the conditions are expressed as a single string, and string parsing is needed to extract the individual values.
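The kind of string parsing involved can be sketched as follows. This is a minimal illustration, not the feed's documented format: the sample string, the function name local:parse-conditions and the output element names are all assumptions made for the example, and the real feed wording may differ.

```xquery
xquery version "1.0";

(: Hypothetical sample in the style of an RSS item title; not taken from the live feed. :)
declare variable $local:sample :=
    "Saturday: sunny intervals, Max Temp: 12C (54F), Min Temp: 5C (41F)";

(: Split a conditions string into a day, a summary and name/value properties.
   Assumes the layout "Day: summary, Name: value, Name: value". :)
declare function local:parse-conditions($title as xs:string) as element(conditions) {
    let $day := substring-before($title, ":")
    let $rest := normalize-space(substring-after($title, ":"))
    let $parts := tokenize($rest, ",\s*")
    return
        element conditions {
            element day {$day},
            element summary {$parts[1]},
            for $part in subsequence($parts, 2)
            let $name := normalize-space(substring-before($part, ":"))
            let $value := normalize-space(substring-after($part, ":"))
            return element property { attribute name {$name}, $value }
        }
};

local:parse-conditions($local:sample)
```

Any change in the wording of the feed breaks a parser of this kind, which is why a set of standard tags would be preferable.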

Other forecasts, such as the 24-hour and 5-day forecasts, are not available as RSS, so we must scrape the HTML page.

One approach to this task is this Yahoo Pipe, which converts the page to an RSS feed. However, the data would be more useful if converted to XML elements.

Dates and times

In all these pages and feeds it is difficult to assign a date to a forecast or observation. Dates are often omitted or expressed only as a day of the week. This complicates the processing of both RSS feeds and HTML pages.
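One way to recover a full date from a day-of-week name is to count forward from a known reference date. The sketch below assumes three-letter English day names and assumes that the named day falls within the next seven days, starting from the given date; the names local:dow-index and local:resolve-day are illustrative only.

```xquery
xquery version "1.0";

declare variable $local:days := ('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat');

(: Position of a date's day in $local:days (1 = Sunday); 1901-01-06 was a Sunday. :)
declare function local:dow-index($date as xs:date) as xs:integer {
    xs:integer(($date - xs:date('1901-01-06')) div xs:dayTimeDuration('P1D')) mod 7 + 1
};

(: First date on or after $today that falls on the named day,
   or the empty sequence if the name is not recognised. :)
declare function local:resolve-day($today as xs:date, $dow as xs:string) as xs:date? {
    let $target := index-of($local:days, $dow)
    return
        if (empty($target)) then ()
        else
            let $offset := ($target - local:dow-index($today) + 7) mod 7
            return $today + xs:dayTimeDuration('P1D') * $offset
};

(: 2024-01-01 was a Monday, so 'Wed' resolves to 2024-01-03 :)
local:resolve-day(xs:date('2024-01-01'), 'Wed')
```

This only works while the forecast horizon is shorter than a week; beyond that a bare day name is ambiguous.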

24-hour forecast

This script uses the eXist module httpclient to get the HTML, parses the HTML and generates an XML file. This XML could then be transformed via XSLT to a viewable page.

Interface

This script has two parameters:

  • region - required - a numeric code unique to the BBC (? code list)
  • area - optional - a sub-region, typically the beginning of the postcode

declare namespace h = "http://www.w3.org/1999/xhtml";

(: eXist extension modules used below :)
import module namespace request = "http://exist-db.org/xquery/request";
import module namespace httpclient = "http://exist-db.org/xquery/httpclient";

(: Return the three-letter day name for a date; 1901-01-06 was a Sunday.
   Sequences are indexed from 1, hence the "+ 1". :)
declare function local:day-of-week($date) {
    ('Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat')
       [ xs:integer(($date - xs:date('1901-01-06'))
          div xs:dayTimeDuration('P1D')) mod 7 + 1 ]
};

let $area := request:get-parameter("area", ())
let $region := request:get-parameter("region", "2")
(: "&" must be written as "&amp;" inside an XQuery string literal :)
let $url := concat("http://news.bbc.co.uk/weather/forecast/", $region, "?state=fo:B",
                   if (exists($area)) then concat("&amp;area=", $area) else ())
let $doc := httpclient:get(xs:anyURI($url), false(), ())
let $currentDate := current-date()
let $currentTime := current-time()
let $dow := local:day-of-week($currentDate)
return
  element forecasts {
      element region {$region},
      if (exists($area)) then element area {$area} else (),
      element source {"BBC"},
      (: one forecast per table row :)
      for $row in $doc/httpclient:body//h:table/h:tbody/h:tr
      let $raw-time := normalize-space($row/h:td[1])
      let $time := if (contains($raw-time, " ")) then substring-before($raw-time, " ") else $raw-time
      let $time := xs:time(concat($time, ":00"))
      (: a day name in parentheses marks a row that belongs to a different day :)
      let $pdow := if (contains($raw-time, "(")) then substring-before(substring-after($raw-time, "("), ")") else $dow
      let $date := if ($pdow ne $dow) then $currentDate + xs:dayTimeDuration("P1D") else $currentDate
      return
        element forecast {
           element date {$date},
           element time {$time},
           element dow {$pdow},
           element summary {string($row/h:td[2]//h:p[@class="sum"])},
           element imageurl {string($row/h:td[2]//h:div[@class="summary"]//h:img/@src)},
           element maxTemp {attribute units {"degc"}, $row/h:td[3]//h:span[@class="cent"]/text()},
           element maxTemp {attribute units {"degf"}, $row/h:td[3]//h:span[contains(@class,"fahr")]/text()},
           element windDirection {string($row/h:td[4]//h:span[contains(@class,"wind")]/@title)},
           element windSpeed {attribute units {"mph"}, substring-before($row/h:td[4]//h:span[contains(@class,"mph")], "mph")},
           element windSpeed {attribute units {"kph"}, substring-before($row/h:td[4]//h:span[contains(@class,"kph")], "km/h")},
           element humidity {attribute units {"%"}, normalize-space(substring-before($row/h:td[5]//h:span[contains(@class,"hum")], "%"))},
           element pressure {attribute units {"mb"}, normalize-space(substring-before($row/h:td[5]//h:span[@class="pres"], "mB"))},
           element visibility {normalize-space($row/h:td[5]//h:span[contains(@class,"vis")])}
        }
  }

24 hour forecast for Bristol