Knowledge Engineering for Historians/Primer Basic Elements

The Basic Elements

Knowledge Engineering's basic tools are entities and relations.

Entities

An entity is something like a concept — e.g. the feudal system — or an individual — St. Peter's Cathedral, Charlemagne, etc. Some entities are timeless and self-denoting, like numbers or names; others have complex histories, like the two entities we just mentioned.

Before we look at some examples, a brief comment on notation: for the purposes of the examples presented here, we will write entity names with an initial capital letter and relation names with an initial lowercase letter. We will use a double semicolon (;;) to mark comments in our representation: anything from the double semicolon to the end of that line is a comment and can contain arbitrary characters.

That said, let's look at some examples:

;; a couple of entities from the 4th century AD
Trier-RomanCity
ConstantineTheGreat
HelenaOfConstantinople

Here we have identified three entities: the Roman city of Trier, Germany; the 4th century AD Roman Emperor Constantine the Great; and his mother, the later Augusta Helena of Constantinople.

Relations

Definition and Arity

A relation is an entity in its own right (or at least should be) and is used to connect (usually) two entities together. More precisely, the number of entities that a relation can connect together is called the arity of the relation.

;; something about our entities
yearOfBirth(ConstantineTheGreat,272)
motherOf(ConstantineTheGreat,HelenaOfConstantinople)
residence(ConstantineTheGreat,Trier-RomanCity)
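As a rough illustration, facts like these could be held in a program as plain tuples, with the relation name first and its arguments after it. The sketch below uses Python, which is an assumption of this example and not a notation used elsewhere in this primer; the helper name arity is likewise hypothetical.

```python
# Each fact is a tuple: (relation name, argument 1, argument 2, ...).
facts = [
    ("yearOfBirth", "ConstantineTheGreat", 272),
    ("motherOf", "ConstantineTheGreat", "HelenaOfConstantinople"),
    ("residence", "ConstantineTheGreat", "Trier-RomanCity"),
]

def arity(fact):
    """Return the number of arguments that the fact's relation connects."""
    return len(fact) - 1  # everything after the relation name is an argument

for fact in facts:
    print(fact[0], "has arity", arity(fact))
```

All three relations used so far connect two entities, so each has an arity of two.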

Assigning Type Information

While one might suspect that relations that connect two entities (so-called binary relations) are the most useful, there are indeed interesting uses for relations with an arity of one (unary relations), namely stating that an entity belongs to a specific class or type:

;; some type information about our entities 
male(ConstantineTheGreat)
female(HelenaOfConstantinople)
city(Trier-RomanCity)

Notice that not all ontologies handle type information this way; indeed, we could treat "maleness" as an entity in its own right and then use a binary relation to attribute this to emperor Constantine.

;; for those who don't like unary relations
is-a(ConstantineTheGreat,Male)
is-a(HelenaOfConstantinople,Female)
is-a(Trier-RomanCity,City)
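The two styles carry the same information, so one can be rewritten mechanically into the other. The following Python sketch shows one possible translation; the function name to_is_a and the rule of capitalising the relation name to obtain the type entity are assumptions made for this illustration only.

```python
# Unary type facts: (type relation, entity).
unary_facts = [
    ("male", "ConstantineTheGreat"),
    ("female", "HelenaOfConstantinople"),
    ("city", "Trier-RomanCity"),
]

def to_is_a(fact):
    """Rewrite type(Entity) as is-a(Entity, Type)."""
    relation, entity = fact
    # Entities start with a capital letter in our notation, so capitalise
    # the relation name to obtain the name of the type entity.
    type_entity = relation[0].upper() + relation[1:]
    return ("is-a", entity, type_entity)

binary_facts = [to_is_a(fact) for fact in unary_facts]
for fact in binary_facts:
    print(fact)
```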

The advantage of doing things this way is that concepts like city really feel like they ought to be entities in their own right. However, this is not true for all types. Indeed, some types look odd when treated as an entity:

;; which looks better? many think the first one does
red(MyFerrari)
is-a(MyFerrari,RedObject)

Some representation languages (e.g. frame-based systems) choose not to treat relations as entities in their own right, and this implies a preference for assigning type information with binary predicates. However, as we stated above, there is really no excuse for not treating relations as entities in their own right!

Arity: How Much is Too Much?

For some representation languages, the maximum arity of their relations may be an implementation-dependent choice. This is defensible insofar as, theoretically speaking, it is always possible to construct the equivalent of an n-ary relationship using only binary relations, provided one elevates the relationship itself to the level of an entity.

;; Henry VIII and his wives, using a 7-ary relation
hadAsWives(HenryVIIIofEngland,CatherineOfAragon,AnneBoleyn,JaneSeymour,AnneOfCleves,CatherineHoward,CatherineParr)

;; gets away with just binary relations
memberOf(TheWivesOfHenryVIII,CatherineOfAragon)
memberOf(TheWivesOfHenryVIII,AnneBoleyn)
memberOf(TheWivesOfHenryVIII,JaneSeymour)
memberOf(TheWivesOfHenryVIII,AnneOfCleves)
memberOf(TheWivesOfHenryVIII,CatherineHoward)
memberOf(TheWivesOfHenryVIII,CatherineParr)

This approach is sometimes called "Davidsonian Representation" after the philosopher Donald Davidson; we will discuss it in more detail later. Of course, one might have tried the following, but that does not capture the commonality that these ladies (somewhat inadvertently) shared in quite the same way:

;; uses another binary relation
spouses(HenryVIII,CatherineOfAragon)
spouses(HenryVIII,AnneBoleyn)
spouses(HenryVIII,JaneSeymour)
spouses(HenryVIII,AnneOfCleves)
spouses(HenryVIII,CatherineHoward)
spouses(HenryVIII,CatherineParr)

For it would not allow us to attribute any information to the relation as such:

;; this is only possible by making the relationship itself an entity
nameOfTopic(TheWivesOfHenryVIII,"The Six Wives of Henry VIII")
themeOfTVShow(TheWivesOfHenryVIII,TheSixWivesOfHenryVIII-PBSSpecial)
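The step from one n-ary fact to several binary facts about a new entity can be sketched in a few lines of Python. This is only an illustration under the assumption that facts are stored as tuples; the function name reify is hypothetical.

```python
def reify(group_entity, members):
    """Replace one n-ary fact by binary memberOf facts about a new entity."""
    return [("memberOf", group_entity, member) for member in members]

wives = ["CatherineOfAragon", "AnneBoleyn", "JaneSeymour",
         "AnneOfCleves", "CatherineHoward", "CatherineParr"]

facts = reify("TheWivesOfHenryVIII", wives)

# Because the group is now an entity, further facts can be attached to it.
facts.append(("nameOfTopic", "TheWivesOfHenryVIII", "The Six Wives of Henry VIII"))

for fact in facts:
    print(fact)
```

Every resulting fact has exactly two arguments, yet the group as a whole can still be talked about.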

Types of Relations

Now that we have seen a couple of relations, it is clear that they come in rather different flavors. Let's look at some quite different kinds and discuss what distinguishes them:

;; some place information -- true though maybe not a deep insight
locationOf(Trier-RomanCity,Germany)  
locationOf(Trier-RomanCity,Trier-RomanCity)

Relations like locationOf are called reflexive; they have the property that they also (usually trivially) relate the first argument to itself. Not surprisingly, reflexive relations are always binary (what's the point in relating a thing to itself more than once?).

;; a bit of dynastic knowledge
childOf(HenryVIII,HenryVII)
childOf(MaryI,HenryVIII) 

These dynastic relations are the opposite of reflexive: they are irreflexive, which means that they never relate an entity to itself (nobody is their own child).
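Both properties can be checked mechanically once a binary relation is written down as a set of pairs. The Python sketch below is a minimal illustration; the function names and the choice to pass the relevant entities in explicitly as a domain are assumptions of this example.

```python
def is_reflexive(pairs, domain):
    """True if every entity in the domain is related to itself."""
    return all((x, x) in pairs for x in domain)

def is_irreflexive(pairs):
    """True if no entity is related to itself."""
    return all(x != y for (x, y) in pairs)

location_of = {("Trier-RomanCity", "Germany"),
               ("Trier-RomanCity", "Trier-RomanCity")}
child_of = {("HenryVIII", "HenryVII"), ("MaryI", "HenryVIII")}

print(is_reflexive(location_of, {"Trier-RomanCity"}))  # True
print(is_irreflexive(child_of))                        # True
print(is_irreflexive(location_of))                     # False
```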

;; more generally speaking
descendant(HenryVIII,HenryVII)
descendant(MaryI,HenryVIII) 

With this example, the interesting property of the relation is that, without explicitly saying it, we know that Queen Mary I is also a descendant of King Henry VII (he is, after all, her grandfather). Such relations are called transitive, which means that the property they convey can be carried over from one entity to another via an intermediary to which both are related. childOf, for example, is not transitive, or more technically, it is intransitive.
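The unstated fact about Mary I and Henry VII can be derived by a program. The following Python sketch computes the transitive closure of a set of pairs by repeatedly adding implied pairs until nothing new appears; it is a simple illustration, not an efficient algorithm, and the function name is an assumption.

```python
def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until stable."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

stated = {("HenryVIII", "HenryVII"), ("MaryI", "HenryVIII")}
descendant = transitive_closure(stated)

print(("MaryI", "HenryVII") in descendant)  # True
```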

;; more dynastic details
spouses(HenryVIII,CatherineParr)

The relation spouses is interestingly different from the other relations in that the argument order conveys no information. Such relations are called symmetric and that just means that one can write the arguments in either order without changing what one is saying. This is definitely not true for childOf or parentOf — here the argument order is critical.
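In a program, symmetry means that a query should succeed whichever way round the arguments are given, even if the fact was stored only once. A minimal Python sketch, with the hypothetical helper name holds_symmetric:

```python
def holds_symmetric(pairs, x, y):
    """Look up a symmetric relation without regard to argument order."""
    return (x, y) in pairs or (y, x) in pairs

spouses = {("HenryVIII", "CatherineParr")}  # stored in one order only

print(holds_symmetric(spouses, "HenryVIII", "CatherineParr"))  # True
print(holds_symmetric(spouses, "CatherineParr", "HenryVIII"))  # True
```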

Interestingly, there is a small group of relations that combine all three of these features (reflexivity, symmetry and transitivity). These are called equivalence relations, possibly because equals is a classic example of one.
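To make the definition concrete, the Python sketch below tests all three properties on a relation given as a set of pairs. The function name is_equivalence and the explicit domain argument are assumptions of this example; equals is modelled here simply as every entity paired with itself.

```python
def is_equivalence(pairs, domain):
    """True if the relation is reflexive, symmetric and transitive."""
    reflexive = all((x, x) in pairs for x in domain)
    symmetric = all((y, x) in pairs for (x, y) in pairs)
    transitive = all((a, d) in pairs
                     for (a, b) in pairs
                     for (c, d) in pairs
                     if b == c)
    return reflexive and symmetric and transitive

domain = {"HenryVII", "HenryVIII", "MaryI"}
equals = {(x, x) for x in domain}
child_of = {("HenryVIII", "HenryVII"), ("MaryI", "HenryVIII")}

print(is_equivalence(equals, domain))    # True
print(is_equivalence(child_of, domain))  # False
```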

The features used to classify relations are not restricted to a single relation considered alone; one can also identify features that relate two relations to each other.

;; another dynastic variation
husbandOf(HenryVIII,CatherineParr)
wifeOf(CatherineParr,HenryVIII)

This asymmetric variation on the marriage information is interesting because these two relations convey opposite views of the same information. We call such relations inverses of each other, because one needs to invert the argument position of the entities to say the same thing.
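Given one relation as a set of pairs, its inverse can be obtained by swapping the two argument positions. A short Python sketch, with the function name inverse chosen for this illustration:

```python
def inverse(pairs):
    """Swap the argument positions of every pair."""
    return {(y, x) for (x, y) in pairs}

husband_of = {("HenryVIII", "CatherineParr")}
wife_of = inverse(husband_of)

print(wife_of)  # {('CatherineParr', 'HenryVIII')}
```

Inverting twice gives back the original relation, which is why it suffices to store only one of the two.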