Cognitive Psychology and Cognitive Neuroscience/Imagery
Note: Some figures are not included yet because of issues concerning their copyright.
Introduction & History
Imagine yourself being on vacation. It is already evening and you are sitting at the beach, watching the sun set over the ocean. A warm summer breeze tickles your skin. You look at the horizon and try to imagine what the world was like when people thought that beyond that ocean there was only the rim of the world. Suddenly, Knut walks by and reminds you of your task to find out what imagery is. In sheer surprise you wake up and continue reading this article.
This chapter deals with exactly the phenomenon you just experienced: mental imagery. It resembles perceptual experience but occurs in the absence of external stimuli. Very often, imagery experiences are understood by their subjects as echoes, copies, or reconstructions of actual perceptual experiences from their past, while at other times they may seem to anticipate possible, often desired or feared, future experiences. Though imagery can occur in other sensory modalities, such as acoustic perception, and even in emotional feeling, the majority of research has been done on visual imagery, on which we are going to focus as well.
Mental imagery was already discussed by the early Greek philosophers. Socrates sketched a relation between perception and imagery by assuming that visual sensory experience creates images in the human mind, which are representations of the real world. Later on, Aristotle stated that "thought is impossible without an image". At the beginning of the 18th century, Bishop Berkeley proposed another role of mental images, similar to the ideas of Socrates, in his theory of idealism. He assumed that our whole perception of the external world consists only of mental images.
At the end of the 19th century Wilhelm Wundt, the generally acknowledged founder of experimental psychology and cognitive psychology, called imagery, sensations and feelings the basic elements of consciousness. Furthermore, he had the idea that the study of imagery supports the study of cognition because thinking is often accompanied by images. This remark was taken up by some psychologists and gave rise to the imageless-thought debate, which discussed the same question Aristotle had already asked: Is thought possible without imagery?
In the early 20th century, when behaviourism became the mainstream of psychology, Watson argued that there is no visible evidence of images in human brains and that the study of imagery is therefore worthless. This general attitude towards the value of research on imagery did not change until the birth of cognitive psychology in the 1950s and 1960s.
Since then, imagery has often been believed to play a very large, even pivotal, role in both memory (Yates, 1966; Paivio, 1986) and motivation (McMahon, 1973). It is also commonly believed to be centrally involved in visuo-spatial reasoning and inventive or creative thought.
The Imagery Debate
Imagine yourself back on vacation again. You are now walking along the beach while projecting images of white benzene molecules onto the horizon. All at once you realise that there are two real little white dots under your projection. Curiously you walk towards them, until your visual field is filled by two serious-looking but fiercely debating scientists. As they take notice of your presence, they invite you to take a seat and listen to the still unresolved imagery debate.
Today’s imagery debate is mainly influenced by two opposing theories: on the one hand Zenon Pylyshyn’s (left) propositional theory, and on the other hand Stephen Kosslyn’s (right) spatial representation theory of imagery processing.
Theory of propositional representation
The theory of propositional representation was proposed by Zenon Pylyshyn in 1973. He described mental images as an epiphenomenon, something which accompanies the process of imagery but is not part of it. On this view, mental images do not show us how the mind works; they only show us that something is happening, just like the display of a compact disc player. Its flashing lights indicate that something is happening, and we may even be able to infer what is happening, but the display does not show us how the processes inside the player work. Even if the display were broken, the compact disc player would still continue to play music.
The basic idea of propositional representation is that relationships between objects are represented by symbols and not by spatial mental images of the scene. For example, a bottle under a table would be represented by a formula made of symbols like UNDER(BOTTLE,TABLE). The term proposition is borrowed from logic and linguistics and means the smallest possible unit of information. Each proposition can be either true or false.
A sentence like "Debby donated a big amount of money to Greenpeace, an organisation which protects the environment" can be broken down into the propositions "Debby donated money to Greenpeace", "The amount of money was big" and "Greenpeace protects the environment". The truth value of the whole sentence depends on the truth values of its constituents. Hence, if one of the propositions is false, so is the whole sentence.
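The decomposition above can be sketched in code. This is only a minimal illustration under stated assumptions, not part of Pylyshyn's theory: the predicate names (DONATE, BIG, PROTECT), the `Proposition` type and the helpers `render` and `sentence_is_true` are hypothetical names chosen for this example. The sketch treats a sentence as true only if every one of its propositions is among the known facts.

```python
from typing import NamedTuple


class Proposition(NamedTuple):
    """A predicate applied to one or more arguments (illustrative type)."""
    predicate: str
    arguments: tuple


def render(prop):
    """Write a proposition in the PREDICATE(ARG,ARG) notation used in the text."""
    return f"{prop.predicate}({','.join(prop.arguments)})"


# The bottle-under-the-table example from the text.
UNDER = Proposition("UNDER", ("BOTTLE", "TABLE"))

# The Debby sentence, decomposed into three propositions.
# The predicate names are assumptions made for this sketch.
SENTENCE = [
    Proposition("DONATE", ("DEBBY", "MONEY", "GREENPEACE")),
    Proposition("BIG", ("MONEY",)),
    Proposition("PROTECT", ("GREENPEACE", "ENVIRONMENT")),
]


def sentence_is_true(propositions, facts):
    """A sentence holds only if all of its constituent propositions hold."""
    return all(prop in facts for prop in propositions)
```

If a single proposition, for instance BIG(MONEY), is removed from the set of facts, `sentence_is_true` returns False for the whole sentence, which mirrors the conjunction described above.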
Decomposing a sentence into propositions does not imply that a person remembers the sentence or its single propositions in their exact literal wording. It is rather assumed that the information is stored in memory in a propositional network.
In Figure 1 each circle represents a single proposition. Because some components are connected to more than one proposition, the propositions form a network. Propositional networks can also have a hierarchy, if a component of a proposition is not a single object but a proposition itself. An example of a hierarchical propositional network describing the sentence "John believes that Anna will pass her exam" is illustrated in Figure 2.
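As the figures are not included here, the two ideas (shared components linking propositions, and propositions nested inside other propositions) can be sketched in code. The tuple encoding, the predicate names and the helper functions `shared_components` and `depth` are assumptions made for illustration only.

```python
from collections import defaultdict

# A proposition is written as a tuple: (PREDICATE, argument, argument, ...).
# An argument is either a string (an object) or another proposition.
DONATE = ("DONATE", "DEBBY", "MONEY", "GREENPEACE")
BIG = ("BIG", "MONEY")
PROTECT = ("PROTECT", "GREENPEACE", "ENVIRONMENT")

# "John believes that Anna will pass her exam": the belief contains
# a whole proposition as one of its components.
PASS = ("PASS", "ANNA", "EXAM")
BELIEVE = ("BELIEVE", "JOHN", PASS)


def shared_components(propositions):
    """Return the components that appear in more than one proposition.

    These shared components are what ties single propositions into a network.
    """
    index = defaultdict(set)
    for prop in propositions:
        for argument in prop[1:]:
            if isinstance(argument, str):
                index[argument].add(prop[0])
    return {name: preds for name, preds in index.items() if len(preds) > 1}


def depth(prop):
    """Number of hierarchy levels: 1 for a flat proposition, more if nested."""
    nested = [depth(arg) for arg in prop[1:] if isinstance(arg, tuple)]
    return 1 + max(nested, default=0)
```

In the Debby example, MONEY and GREENPEACE each take part in two propositions and therefore act as the links of the network, while `depth(BELIEVE)` is 2 because the belief embeds the proposition about Anna.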
Complex objects and schemas
Even complex objects can be generated and described by propositional representation. A complex object like a ship would consist of a structure of nodes which represent the ship's properties and the relationships between these properties.
Almost all humans have concepts of commonly known objects like ships or houses in their mind. These concepts are abstractions of complex propositional networks and are called schemas. For example, our concept of a house includes propositions like:
Houses have rooms.
Houses can be made from wood.
Houses have walls.
Houses have windows.
...
Listing all of these propositions does not show the structure of the relationships between them. Instead, a concept can be arranged in a schema consisting of a list of attributes and values, which describe the properties of the object. Attributes describe possible forms of categorisation, while values represent the actual value for each attribute. The schema representation of a house looks like this:
House
Category: building
Material: stone, wood
Contains: rooms
Function: shelter for humans
Shape: rectangular
...
The hierarchical structure of schemas is organised in categories. For example, "house" belongs to the category "building" (which of course has its own schema) and contains all attributes and values of the parent schema plus its own specific values and attributes. This way of organising objects in our environment into hierarchical models enables us to recognise objects we have never seen before, because they can be related to categories we already know.
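The inheritance of attributes from a parent schema can be sketched as follows. The "house" entries are taken from the schema above; the attributes of the "building" schema, the dictionary layout and the `resolve` function are assumptions made only for this illustration.

```python
# Each schema lists its parent category and its own attribute-value pairs.
SCHEMAS = {
    "building": {
        "parent": None,
        # Hypothetical parent attributes, invented for this sketch.
        "attributes": {"Material": "any", "Contains": "rooms"},
    },
    "house": {
        "parent": "building",
        "attributes": {
            "Category": "building",
            "Material": "stone, wood",
            "Function": "shelter for humans",
            "Shape": "rectangular",
        },
    },
}


def resolve(name):
    """Collect a schema's attributes: inherited ones first, own ones on top."""
    schema = SCHEMAS[name]
    inherited = resolve(schema["parent"]) if schema["parent"] else {}
    # Own attributes override inherited attributes with the same name.
    return {**inherited, **schema["attributes"]}
```

Here `resolve("house")` takes "Contains: rooms" from the parent schema, while the house's own "Material: stone, wood" replaces the more general parent value.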
In an experiment performed by Wiseman and Neisser in 1974, people were shown a picture which, at first sight, seems to consist of random black and white shapes. After some time, subjects may realise that there is a Dalmatian dog in it. The results show that people who recognised the dog remembered the picture better than people who did not. A possible explanation is that the picture is stored in memory not as a picture, but as a proposition.
In an experiment by Weisberg in 1969, subjects were asked to memorise sentences like "Children who are slow eat bread that is cold". Then they were given a word from the sentence by the experimenter and asked to name the first other word from the sentence that came to mind. Almost all subjects associated the word "children" with the given word "slow", although the word "bread" is positioned closer to "slow" in the sentence than the word "children". An explanation for this is that the sentence is stored in memory as the three propositions "Children are slow", "Children eat bread" and "Bread is cold". The subjects associated "children" with "slow" because both belong to one proposition, while "bread" and "slow" belong to different ones. Similar evidence was found in another experiment by Ratcliff and McKoon in 1978.
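The contrast between closeness in the sentence and closeness in the propositional structure can be made concrete with a small sketch. It is an illustration only: the set encoding of the three propositions and the function names `propositional_associates` and `surface_distance` are assumptions, not part of Weisberg's procedure.

```python
# The memorised sentence as a sequence of words (surface form).
WORDS = "children who are slow eat bread that is cold".split()

# The three propositions named in the text, each reduced to its content words.
PROPOSITIONS = [
    {"children", "slow"},          # Children are slow
    {"children", "eat", "bread"},  # Children eat bread
    {"bread", "cold"},             # Bread is cold
]


def propositional_associates(cue):
    """Words that share at least one proposition with the cue word."""
    shared = set()
    for prop in PROPOSITIONS:
        if cue in prop:
            shared |= prop
    return shared - {cue}


def surface_distance(first, second):
    """Distance between two words, counted in word positions in the sentence."""
    return abs(WORDS.index(first) - WORDS.index(second))
```

For the cue "slow" the only propositional associate is "children", even though `surface_distance("slow", "bread")` is 2 and `surface_distance("slow", "children")` is 3, so a purely positional account would favour "bread".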
Theory of spatial representation
Stephen Kosslyn's theory, which opposes Pylyshyn's propositional approach, holds that images are not represented by propositions alone. He tried to find evidence for a spatial representation system that constructs mental, analogue, three-dimensional models.
The primary role of this system is to organize spatial information in a general form that can be accessed by either perceptual or linguistic mechanisms. It also provides coordinate frameworks to describe object locations, thus creating a model of a perceived or described environment. The advantage of a coordinate representation is that it is directly analogous to the structure of real space and captures all possible relations between objects encoded in the coordinate space. These frameworks also reflect differences in the salience of objects and locations consistent with the properties of the environment, as well as the ways in which people interact with it. Thus, the representations created are models of physical and functional aspects of the environment.
What, then, can be said about the primary components of cognitive spatial representation? Certainly, the distinction between the external world and our internal view of it is essential, and it is helpful to explore the relationship between the two further from a process-oriented perspective.
The classical approach assumes a complex internal representation in the mind that is constructed through a series of specific perceived stimuli, and that these stimuli generate specific internal responses. Research dealing specifically with geographic-scale space has worked from the perspective that the macro-scale physical environment is extremely complex and essentially beyond the control of the individual. This research, such as that of Lynch and of Golledge (1987) and his colleagues, has shown that there is a complex of behavioural responses generated from corresponding complex external stimuli, which are themselves interrelated. Moreover, the results of this research offer a view of our geographic knowledge as a highly interrelated external/internal system. Using landmarks encountered within the external landscape as navigational cues is the clearest example of this interrelationship.
The rationale is as follows: We gain information about our external environment from different kinds of perceptual experience; by navigating through and interacting directly with geographic space as well as by reading maps, through language, photographs and other communication media. Within all of these different types of experience, we encounter elements within the external world that act as symbols. These symbols, whether a landmark within the real landscape, a word or phrase, a line on a map or a building in a photograph, trigger our internal knowledge representation and generate appropriate responses. In other words, elements that we encounter within our environment act as external knowledge stores.
Each external symbol has meaning that is acquired through the sum of the individual perceiver's previous experience. That meaning is imparted by both the specific cultural context of that individual and by the specific meaning intended by the generator of that symbol. Of course, there are many elements within the natural environment not "generated" by anyone, but that nevertheless are imparted with very powerful meaning by cultures (e.g. the sun, moon and stars). Man-made elements within the environment, including elements such as buildings, are often specifically designed to act as symbols as at least part of their function. The sheer size of downtown office buildings, the pillars of a bank facade and church spires pointing skyward are designed to evoke an impression of power, stability or holiness, respectively.
These external symbols are themselves interrelated, and specific groupings of symbols may constitute self-contained external models of geographic space. Maps and landscape photographs are certainly clear examples of this. Elements of differing form (e.g., maps and text) can also be interrelated. These various external models of geographic space correspond to external memory. From the perspective just described, the total sum of any individual's knowledge is contained in a multiplicity of internal and external representations that function as a single, interactive whole. The representation as a whole can therefore be characterised as a synergistic, self-organising and highly dynamic network.
Experiments on imagery were carried out as early as 1910 by Perky, who tried to find out whether imagery and perception interact. Subjects were told to project a mental image of a common object, such as a ship, onto a wall. Without their knowledge, a picture was back-projected so that it subtly shone through the wall. They then had to describe their image, or were questioned about, for example, the orientation or the colour of the ship. In Perky's experiment, none of the 20 subjects noticed that their descriptions did not arise from their own mind but were completely influenced by the picture shown to them.
Another seminal line of research in this field was Kosslyn's image-scanning experiments in the 1970s. Returning to the example of the mental representation of a ship, he found a linear relationship when the mental focus moves from one part of the ship to another: the reaction time of the subjects increased with the distance between the two parts, which indicates that we actually create a mental picture of a scene while solving small cognitive tasks. Interestingly, a comparable ability can also be observed in congenitally blind people, as Marmor and Zaback (1976) found. Presuming that the underlying processes are the same as in sighted subjects, it could be concluded that there is a more deeply encoded system that has access to more than visual input.
Mental Rotation Task
Other advocates of the spatial representation theory, Shepard and Metzler, developed the mental rotation task in 1971. Two objects are presented to a participant at different angles, and the participant's task is to decide whether the objects are identical or not. The results show that reaction time increases linearly with the angle of rotation between the objects, which suggests that participants mentally rotate one object in order to match it to the other. Inferring the course of a mental process from reaction times in this way is called "mental chronometry".
Together with Paivio's memory research, this experiment was crucial in establishing the importance of imagery within cognitive psychology, because it showed the similarity of imagery to the processes of perception. For a mental rotation of 40° the subjects needed two seconds on average, whereas for a 140° rotation the reaction time increased to four seconds. The additional 100° of rotation thus took two additional seconds, so it can be concluded that people in general have a mental object rotation rate of 50° per second.
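The arithmetic behind the 50° per second figure can be checked with a short sketch. It assumes, as the text does, that reaction time grows linearly with the rotation angle; the function names and the idea of reading the remaining time as a constant offset are assumptions of this example, not results reported in the chapter.

```python
def rotation_rate(angle_a, time_a, angle_b, time_b):
    """Degrees rotated per second, from two (angle, reaction time) points."""
    return (angle_b - angle_a) / (time_b - time_a)


def constant_offset(angle, time, rate):
    """Time left over once the rotation itself is subtracted (extrapolated)."""
    return time - angle / rate


# Values from the text: 40 degrees in 2 s and 140 degrees in 4 s.
rate = rotation_rate(40, 2.0, 140, 4.0)
offset = constant_offset(40, 2.0, rate)
```

Note that the rate follows from the difference between the two measurements (100° in 2 s), not from dividing a single angle by its reaction time: 40° in 2 s alone would suggest only 20° per second. Under the linearity assumption, the remainder (about 1.2 s here) would be time spent on everything other than the rotation itself, such as encoding the objects and responding.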
Although most research on mental models has focussed on text comprehension, researchers generally believe that mental models are perceptually based. Indeed, people have been found to use spatial frameworks like those created for texts to retrieve spatial information about observed scenes (Bryant, 1991). Thus, people create the same sorts of spatial memory representations no matter if they read about an environment or see it themselves.
Size and the visual field
If an object is observed from different distances, it is harder to perceive details when the object is far away, because it fills only a small part of the visual field. In 1973 Kosslyn conducted an experiment to find out whether this is also true for mental images, in order to show the similarity between spatial representation and the perception of the real environment. He told participants to imagine objects which are far away and objects which are near. After asking the participants about details, he concluded that details can be observed better if the object is near and fills the visual field. He also told the participants to imagine animals of different sizes next to one another, for example an elephant and a rabbit. The elephant filled much more of the visual field than the rabbit, and it turned out that the participants were able to answer questions about the elephant more rapidly than questions about the rabbit. After that, the participants had to imagine the small animal beside an even smaller animal, like a fly. This time the rabbit filled the bigger part of the visual field and, again, questions about the bigger animal were answered faster. The result of Kosslyn's experiments is that people can observe more details of an object if it fills a bigger part of their mental visual field. This provides evidence that mental images are represented spatially.
Since the 1970s, many experiments have greatly enriched our knowledge about imagery and memory, driven by the two opposing points of view in the imagery debate. The see-saw of evidence for one side and then the other was marked by many clever ideas. The following section is an example of the potential of such controversies.
In 1978, Kosslyn extended his image-scanning experiment from objects to real distances represented on maps. In the picture you see our island with all the places you encountered in this chapter. Try to imagine how far away from each other they are. This is exactly the experiment performed by Kosslyn. Again, he successfully predicted a linear dependency between reaction time and spatial distance, in support of his model.
In the same year, Pylyshyn answered with what is called the "tacit-knowledge explanation", so named because he supposed that participants draw on their knowledge about the world without noticing it. On his account the map is decomposed into nodes with edges in between, and the increase in reaction time is caused by the larger number of nodes that must be visited before the goal node is reached.
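Why the scanning results alone cannot decide between the two accounts can be illustrated with a small sketch. The map below is entirely hypothetical (place names, coordinates and connections are invented for this example), and the functions `hops` and `distance` are illustrative names. The sketch compares the number of edges traversed in a node network with the straight-line distance on the map.

```python
import math
from collections import deque

# Hypothetical island map: coordinates for the spatial account ...
COORDS = {"beach": (0, 0), "bar": (3, 0), "hut": (3, 4), "tree": (6, 4)}
# ... and a node network with edges for the propositional account.
EDGES = {
    "beach": ["bar"],
    "bar": ["beach", "hut"],
    "hut": ["bar", "tree"],
    "tree": ["hut"],
}


def hops(start, goal):
    """Number of edges on the shortest path, found by breadth-first search."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        place, steps = queue.popleft()
        if place == goal:
            return steps
        for neighbour in EDGES[place]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, steps + 1))
    raise ValueError(f"no path from {start} to {goal}")


def distance(start, goal):
    """Straight-line distance between two places on the map."""
    return math.dist(COORDS[start], COORDS[goal])
```

Starting from the beach, both measures rank the destinations in the same order (bar, hut, tree), so on a map like this one an increase in reaction time with distance is predicted by the spatial account and by the node-counting account alike.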
Only four years later, Finke and Pinker published a counter-model. Picture (1) shows a surface with four dots, which was presented to the subjects. After two seconds, it was replaced by picture (2), with an arrow on it. The subjects had to decide whether the arrow pointed at the position of a former dot. The result was that they reacted more slowly the farther away the arrow was from a dot. Finke and Pinker concluded that, within two seconds, the distances can only have been stored in a spatial representation of the surface.
To sum up, it is commonly believed that imagery and perception share certain features but also differ in some respects. For example, perception is a bottom-up process that originates with an image on the retina, whereas imagery is a top-down mechanism which originates when activity is generated in higher visual centres without an actual stimulus. Another distinction is that perception occurs automatically and remains relatively stable, whereas imagery needs effort and is fragile. But as psychological discussion failed to single out one correct theory, the debate has now moved to neuroscience, whose methods have improved promisingly over the last three decades.
Investigating the brain - a way to resolve the imagery debate?
Visual imagery was investigated by psychological studies relying solely on behavioural experiments until the late 1980s. By that time, research on the brain by electrophysiological measurements such as the event-related potential (ERP) and brain-imaging techniques (fMRI, PET) became possible. It was therefore hoped that neurological evidence of how the brain responds to visual imagery would help to resolve the imagery debate.
We will see that many results from neuroscience support the theory that imagery and perception are closely connected and share the same physiological mechanisms. Nevertheless, the contradictory phenomenon of double dissociations between imagery and perception shows that the overlap is not perfect. A theory that tries to take into account all the neuropsychological results and gives an explanation for the dissociations will therefore be presented at the end of this section.
Brain imaging experiments in the 1990s confirmed the results which previous electrophysiological measurements had already yielded. In these experiments, the brain activity of participants was measured, using either PET or fMRI, both when they were creating visual images and when they were not. The experiments showed that imagery creates activity in the striate cortex, which, being the primary visual receiving area, is also active during visual perception. Figure 8 (not included yet due to copyright issues) shows how activity in the striate cortex increased both when a person perceived an object (“stimulus on”) and when the person created a visual image of it (“imagined stimulus”). Although the striate cortex was not activated by imagery in all brain-imaging studies, most results indicate that it is activated when participants are asked to create detailed images.
Another approach to understanding imagery is the study of people with brain damage, in order to determine whether imagery and perception are affected in the same way. Often, patients with perceptual problems also have problems in creating images, as in the case of people who have lost both the ability to see colour and the ability to create colours through imagery. Another example is that of a patient with unilateral neglect, which is due to damage to the parietal lobes and causes the patient to ignore objects in one half of his visual field. When the patient was asked to imagine himself standing at a place familiar to him and to describe what he saw, it was found that he neglected not only the left side of his perceptions but also the left side of his mental images, as he could only name objects that were on the right-hand side of his mental image.
The idea that mental imagery and perception share physiological mechanisms is thus supported both by brain imaging experiments with normal participants and by effects of brain damage, as in patients with unilateral neglect. However, contradictory results have also been observed, indicating that the underlying mechanisms of perception and imagery cannot be identical.
Double dissociation between imagery and perception
A double dissociation exists when a single dissociation (one function is present while another is absent) can be demonstrated in one person and the complementary type of single dissociation can be demonstrated in another person. For imagery and perception a double dissociation has been observed, as there are both patients with normal perception but impaired imagery and patients with impaired perception but normal imagery. For instance, one patient with damage to his occipital and parietal lobes was able to recognise objects and draw accurate pictures of objects placed before him, but was unable to draw pictures from memory, which requires imagery. Conversely, another patient suffering from visual agnosia was unable to identify pictures of objects even though he could recognise parts of them. For example, he did not recognise a picture of an asparagus but labelled it as “rose twig with thorns”. On the other hand, he was able to draw very detailed pictures from memory, which is a task depending on imagery.
As double dissociation usually suggests that two functions rely on different brain regions or physiological mechanisms, the described examples imply that imagery and perception do not share exactly the same physiological mechanisms. This of course conflicts with the evidence from brain imaging measurements and other cases of patients with brain damage mentioned above that showed a close connection between imagery and perception.
Interpretation of the neuropsychological results
A possible explanation for the paradox that there is strong evidence for parallels between perception and imagery, while the observed double dissociation conflicts with these results, goes as follows. Mechanisms of imagery and perception overlap only partially: the mechanisms responsible for imagery are mainly located in higher visual centres, whereas the mechanisms underlying perception are located at both lower and higher centres (Figure 9, not included yet due to copyright issues). Accordingly, perception is regarded as bottom-up processing that starts with an image on the retina and involves processing in the retina, the lateral geniculate nucleus, the striate cortex and higher cortical areas. In contrast, imagery is said to start as a top-down process, as its activity is generated in higher visual centres without any actual stimulus, that is, without an image on the retina. This theory provides explanations for both the patient with impaired perception but normal imagery and the patient with normal perception but impaired imagery. In the first case, the patient’s perceptual problems could be explained by damage to early processing in the cortex, and his ability to still create images by the intactness of higher areas of the brain. Similarly, in the latter case, the patient's impaired imagery could be caused by damage to higher-level areas, whereas the lower centres would still be intact. Even though this explanation fits several cases, it does not fit all of them. Consequently, further research has to accomplish the task of developing an explanation that accounts sufficiently for the relation between perception and imagery.
Imagery and memory
Besides the imagery debate, which is concerned with the question of how we imagine, for example, objects, persons and situations, and how we involve our senses in these mental pictures, questions concerning memory are still untreated. This part of the chapter deals with how images are encoded in the brain and how they are recalled from memory. In search of answers to these questions, three major theories evolved. All of them explain the encoding and recall processes differently and, as usual, validating experiments were carried out for each of them.
The common-code theory
This view of memory and recall holds that images and words access semantic information in a single conceptual system that is neither word-like nor spatial. The common-code model hypothesises that, for example, images and words both require analogous processing before accessing semantic information, so the semantic information of all sensory input is encoded in the same way. The consequence is that when you remember, for example, a situation where you watched an apple falling from a tree, the visual information about the falling apple and the information about the sound it made when it hit the ground are both constructed on the fly in the specific brain regions (e.g. visual images in the visual cortex) out of one code stored in the brain. Another feature of this model is its claim that images require less time than words to access the common conceptual system. Images need less time to be discriminated because they share a smaller set of possible alternatives, whereas words have to be picked out of a much greater set of ambiguous possibilities in the mental dictionary. The most serious criticism of this model is that it does not specify where this common code is ultimately stored.
The abstract-propositional theory
This theory rejects any distinction between verbal and non-verbal modes of representation, and instead describes representations of experience or knowledge in terms of an abstract set of relations and states, in other words propositions. It postulates that the recall of an image is better if the person recalling it has some connection to the meaning of the image. For example, if you are looking at an abstract picture showing a bunch of lines which you cannot combine with each other in a meaningful way, recalling this picture will be very hard (if not impossible). The assumed reason is that there is no connection to propositions which can describe some part of the picture, and no connection to a propositional network which reconstructs parts of the picture. In the other case, you look at a picture with some lines in it which you can combine with each other in a meaningful way. Recall should then be successful, because you can search for a proposition which has at least one attribute matching the meaning of the image you recognised. This proposition then returns the information necessary to recall the image.
The dual-code theory
Unlike the common-code and the abstract-propositional approaches, this model postulates that words and images are represented in functionally distinct verbal and non-verbal memory systems. To support this model, Roland and Friberg (1985) ran an experiment in which the subjects had either to imagine a mnemonic or to imagine walking home through their neighbourhood. While the subjects performed one of these tasks, their brains were scanned using positron emission tomography (PET). Figure 10 is a picture combining the brains of the subjects who performed the first and the second task.
As we can see in the picture, different brain areas are involved in the processing of verbal and spatial information. The brain areas which were active during the walking-home task are the same areas which are active during visual perception and information processing. And among the areas which showed activity while the mnemonic task was carried out is Broca's area, where language processing is normally located. This can be considered evidence that both representation types are somehow connected with the modalities, as Paivio’s dual-coding theory suggests (Anderson, 1996).
Can you imagine other examples which argue for the dual-code theory? For example, you walk along the beach in the evening and there are some beach bars ahead. You order a drink, and next to you, you see a person who seems familiar to you. While you drink your drink, you try to remember the name of this person, but you fail, even though you can remember where you last saw the person and perhaps what you talked about in that situation. Now imagine another situation. You walk through the city and pass some coffee bars, and out of one of them you hear a song. You are sure that you know that song, but you cannot remember the name of the performer, the name of the song, or where you have heard it. Both examples can be interpreted as indicators for the assumption that in these situations you can recall the information which you perceived in the past, but you fail to remember the propositions you connected to it.
In this area of research there are of course other unanswered questions, for example why we cannot imagine smell, how the recall processes are performed, or where the storage of images is located. The imagery debate is still going on, and conclusive evidence showing which of the models explains the connection between imagery and memory is missing. For now the dual-code theory seems to be the most promising model.
References
Anderson, J. R. (1996). Kognitive Psychologie: eine Einfuehrung. Heidelberg: Spektrum Akademischer Verlag.
Bryant, D. J., Tversky, B., et al. (1992). Internal and external spatial frameworks for representing described scenes. Journal of Memory and Language, 31, 74-98.
Couclelis, H., Golledge, R., and Tobler, W. (1987). Exploring the anchor-point hypothesis of spatial cognition. Journal of Environmental Psychology, 7, 99-122.
Goldstein, E. B. (2005). Cognitive Psychology: Connecting Mind, Research, and Everyday Experience. ISBN 0-534-57732-6.
Marmor, G. S. and Zaback, L. A. (1976). Mental rotation in the blind: Does mental rotation depend on visual imagery? Journal of Experimental Psychology: Human Perception and Performance, 2, 515-521.
Roland, P. E. and Friberg, L. (1985). Localization of cortical areas activated by thinking. Journal of Neurophysiology, 53, 1219-1243.
Paivio, A. (1986). Mental representation: A dual-coding approach. New York: Oxford University Press.
Links & Further Reading
Cherney, Leora (2001): Right Hemisphere Brain Damage
Grodzinsky, Yosef (2000): The neurology of syntax: Language use without Broca’s area.
Mueller, H. M., King, J. W. & Kutas, M. (1997). Event-related potentials elicited by spoken relative clauses; Cognitive Brain Research 4:193-203.
Mueller, H.M. & Kutas, M. (1996). What’s in a name? Electrophysiological differences between spoken nouns, proper names and one’s own name; NeuroReport 8:221-225.
Revised in July 2007 by: Alexander Blum (Spatial Representation, Discussion of the Imagery Debate, Images), Daniel Elport (Propositional Representation), Alexander Lelais (Imagery and Memory), Sarah Mueller (Neuropsychological approach), Michael Rausch (Introduction, Publishing)
Authors of the first version (2006): Wendy Wilutzky, Till Becker, Patrick Ehrenbrink (Propositional Representation), Mayumi Koguchi, Da Shengh Zhang (Spatial Representation, Intro, Debate).