SPARQL/Subqueries

From Wikibooks, open books for an open world
Jump to navigation Jump to search

SPARQL allows one SELECT query to be nested inside another. The inner SELECT query is called a subquery and is evaluated first. The subquery result variable(s) can then be used in the outer SELECT query.

Simplest example:

SELECT ?x ?y WHERE {
  VALUES ?x { 1 2 3 4 }
  {
    SELECT ?y WHERE { VALUES ?y { 5 6 7 8 }  }
  }  # \subQuery
} # \mainQuery

Try it!

The example below calculates the population of each country in the world, expressing the population as a percentage of the world's total population. In order to calculate the world's total population, it uses a subquery.

SELECT ?countryLabel ?population (round(?population/?worldpopulation*1000)/10 AS ?percentage)
WHERE {
  ?country wdt:P31 wd:Q3624078;    # is a sovereign state
           wdt:P1082 ?population.

  { 
    # subquery to determine ?worldpopulation
    SELECT (sum(?population) AS ?worldpopulation)
    WHERE { 
      ?country wdt:P31 wd:Q3624078;    # is a sovereign state
               wdt:P1082 ?population. 
    }
  }

  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}
}
ORDER BY desc(?population)

Try it!

The syntax of a query with a subquery is shown below. A subquery is basically the same as a simple query and is enclosed within { brackets }.

SELECT  ... query result variables ...
WHERE 
{
        ... query pattern ...

        { # subquery
          SELECT  ... subquery result variables ...
          WHERE 
          {
                  ... subquery pattern ...
          }
                  ... optional subquery modifiers ...
        } # end of subquery

}
        ... optional query modifiers ...

Subqueries can be used, often with a LIMIT, to avoid a query timeout by fractioning the task. As an example, this query is timing out :

#100 humans with exactly 6 months between their month of birthday and their month of death.
SELECT DISTINCT ?itemLabel ?item WHERE {
  ?item wdt:P31 wd:Q5 ;
        p:P569/psv:P569 [wikibase:timePrecision ?datePrecision1; wikibase:timeValue ?naissance] ;
        p:P570/psv:P570 [wikibase:timePrecision ?datePrecision2; wikibase:timeValue ?mort ].
  filter(?datePrecision1>10)
  filter(?datePrecision2>10)
  
  bind(month(?mort) - month(?naissance) as ?mois)
  bind(day(?mort) - day(?naissance) as ?jour)
  
  filter(abs(?mois) = 6)
  filter(?jour = 0)

  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}
}
ORDER BY ?itemLabel
LIMIT 100

Try it!

but the same query putting the limit on the selected items in a subquery and the label service outside it didn't timeout :

#100 humans with exactly 6 months between their month of birthday and their month of death.
SELECT DISTINCT ?itemLabel ?item WHERE {
  {
    SELECT DISTINCT ?item WHERE {
      ?item wdt:P31 wd:Q5 ; 
            p:P569/psv:P569 [wikibase:timePrecision ?datePrecision1; wikibase:timeValue ?naissance] ;
            p:P570/psv:P570 [wikibase:timePrecision ?datePrecision2; wikibase:timeValue ?mort ].

      filter(?datePrecision1>10)
      filter(?datePrecision2>10)

      bind(month(?mort) - month(?naissance) as ?mois)
      bind(day(?mort) - day(?naissance) as ?jour)
      filter(abs(?mois) = 6)
      filter(?jour = 0)
    }
    LIMIT 100
  }

  SERVICE wikibase:label {bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".}
}
ORDER BY ?itemLabel

Try it!