Cognition and Instruction/Long-Term Memory

From Wikibooks, open books for an open world

When a student studies for tests and memorizes class material, where does the information go? Long-term memory is essential to learning, because everything a student learns is stored in either short- or long-term memory. Beyond storage, both memory systems influence how people learn, how they perceive things, and how they build meaning from what they perceive. Learning and memory constantly influence one another, as one's memories or prior knowledge of certain concepts, subjects, or items can enhance learning. In this chapter, we will describe the components, functions, and framework of long-term memory based largely on the widely accepted information processing model. We will also link this framework to cognition, exploring the many ways in which information reaches long-term memory and is stored and retrieved. Lastly, we will discuss newer and established models which describe other views on long-term memory.

Overall structure and functions of long-term memory

Long-term memory is thought to have a practically limitless and permanent capacity for all sorts of information that one experiences over a lifetime. Having long-term memory is necessary for all learners. Understanding how it works, what it is made up of, and the processes within it can help learners to better understand their own learning.

Our long-term memory contains vast amounts of information accumulated over long periods of time and, unlike short-term memory (discussed in a different chapter), does not require constant repetition to make it last. Information stored in long-term memory (LTM) is recalled or reconstructed, rather than rehearsed or repeated. Importantly, LTM is often broken down into categories of knowledge, which include declarative knowledge, procedural knowledge, and conditional knowledge.

Declarative knowledge or memory (also sometimes referred to as semantic knowledge) refers to knowledge that we can typically articulate explicitly, whereas procedural knowledge usually refers to implicit skills and processes that we have little or no trouble performing but find difficult to express explicitly (the later sections on production rules and the ACT-R theory explain these in more detail). Conditional knowledge means knowing in what kinds of conditions or situations to deploy declarative and procedural knowledge. The table below shows some concrete examples of each.

Declarative knowledge | Procedural knowledge | Conditional knowledge
A mobile phone is a portable telephone. | How to make phone calls. | When to pick up the phone and when to hang up.
Cars usually have four wheels. | How to drive a car. | When to fasten the seatbelt and when to unfasten it.

Table 1. Examples of declarative knowledge, procedural knowledge, and conditional knowledge

Building blocks of cognition

The “building blocks of cognition” are five mental constructs hypothesized by many theorists that work together to form the foundation of all of the mental frameworks and information that is stored in the long-term memory.[1] Essentially, they are the components of LTM. Although many of these components may share similar features, each is slightly different than the next. The first three concepts that we will examine are linked closely to declarative knowledge, and the last two are usually considered parts of procedural knowledge.[1]

Concepts

What are Concepts? Concepts are theorized to be ways in which we break down and categorize mental structures into relatively elemental chunks and groupings with meaning that then can be used to make sense of any new incoming information.[1] They are deemed to be “conceptually coherent chunks of knowledge” that can be triggered and called upon when one is prompted to retrieve information, and they are usually categorized as declarative knowledge.[2] For example, when talking about a concept of cats, you might refer to a category of animals that share similarities with one another: they are all small and furry; they use “meowing” to communicate. Cats may have different hair colors: white, black, brown, and so on; they may be domestic cats or feral cats. However, they all belong to the category of cats.

Concepts that are based on highly common or prominent events are called prototypes.[1] For instance, the prototype, or best representative, of a North American basketball league might be the National Basketball Association. It is believed that concepts, along with the other four components of the “building blocks of cognition”, work together to formulate the foundations of what we know to be long-term memory, supporting the acquisition and development of language functions, factual knowledge, and object recognition, which are among the core aspects of long-term memory.[3]

What are Concepts composed of? There are two main theories regarding how conceptual development occurs.[3] First, some theorists believe that concepts are abstract mental structures in the brain, which are formed separately from the sensory-motor systems from which the information in these structures is received.[3] In contrast, the other main theory, which has been supported by neuroimaging technologies (such as fMRI), is that concepts are formulated in accordance with the sensory-motor component and that they are stored within long-term memory as multi-modal structures.[3]

There are three widely agreed-upon categories into which we sort our conceptual information: matter, processes, and mental states.[4] Processes are series of interrelated events from which we would expect a particular result.[4] For example, when something is dropped from any height, gravity will not allow it to remain suspended in space and will pull it back down to the earth. Mental states are internal states and emotions, such as recognizing when you feel upset, happy, or unsure about something.

How do we formulate Concepts? There are three established strategies by which we develop and formulate our concepts. The first is the conservative focusing strategy proposed by Bruner, Goodnow, and Austin.[5] Individuals who use this strategy select stimuli according to the relevant attributes of the concept with which they are confronted. Others favour the focus gambling strategy, in which it is believed we gain all of the knowledge we need about a stimulus at once.[1] Individuals who follow this strategy take less time to attribute a stimulus than those who choose the conservative focusing strategy. However, they are more likely to make mistakes because they favour speed over thoroughness.[1] The final strategy is scanning, in which individuals attempt to test multiple hypotheses at the same time.[1] Although this is also a time-efficient strategy for attributing stimuli, testing multiple hypotheses at once places a greater cognitive demand on the individual than testing one at a time, and thus can impair the ability to process and remember information.[1]

Propositions

Propositions are the mental constructs in which, most theorists believe, we store linguistic information and the majority of our declarative knowledge.[1] A proposition is the shortest statement to which meaning can be attached, yet propositions are inherently more complex than concepts because they build upon preexisting concepts to form meaningful statements and assertions about how those concepts are related.[6] To be a proposition, a statement must be able to be judged either true or false (in other words, it is a declarative statement of knowledge).[7] Here is an example of a sentence that contains two propositions: “Luke bought the expired ticket.”

Figure 1. An example of a propositional network

1- Luke bought the ticket. (The event happened in the past.)

2- The ticket had expired.

It is believed that propositions sharing common characteristics or qualities are linked together within propositional networks, which can be activated through the encoding or retrieval of information related to a specific proposition.[1] If we apply the same propositional network in a new sentence: “Luke bought the ticket which had expired”, we will find that the two sentences have the same meaning. An image representing this idea can be found on the right.

Schemata

What are schemata? Schemata are believed to be mental representations of an individual’s general cause and effect knowledge.[8] Any and all knowledge that we gain is organized in the schema, which is responsible for the subsequent encoding, storage and retrieval of information.[1] Schemata are formed through the interaction of the external conditions and the individual’s own prior knowledge.[9] The image on the right can be a representation of schemata for knowledge about mammals.

Figure 2. Schemata: Knowledge about mammals

Schemata have been described as the mental equivalent of scaffolding. In other words, the schemata that we form support us when we find ourselves in novel situations or are learning new information.[10]

How are schemata formed? Possessing pre-existing schematic knowledge on a topic has been linked to better retention and recall of newly encoded information on that topic.[11] This is believed to occur because it allows new information to be assimilated more rapidly into the brain (and thus into the activated schema).[11] The information encoded in a schema is sorted into what are known as slots: specific mental “categories” of sorts, which shape how our knowledge is encoded, stored, retrieved, and ultimately perceived.[1] When a schema has developed and has proven to capture a commonly occurring pattern of events or concepts, it will likely become part of our long-term memory, where it will continue to serve as the foundation for our recollections and for any future schematic information that may be encoded.[1] This process is termed schematic instantiation.[12]

Productions

Productions are “if-then” statements that serve as a set of action rules, which govern all of our procedural knowledge.[13] Here is an example of an if-then production: “If the traffic light turns from green to yellow, then slow down”. Productions are instantaneous, automatic mental constructs that become second nature after repeated exposure to a common sequence of events.[1] They provide a set of production rules and expectations for these events and, like propositions, are organized in interactive groups known as production networks.[1] Often, activating one production triggers other productions, resulting in a series of cognitive processes and actions until the ultimate goal is accomplished.[1] A later section offers a more detailed discussion of productions and production rules as a theory of memory.

Scripts

Figure 3. A child's script for a hotel check-in

Scripts are the mental concepts that work as the underlying framework for all our procedural knowledge.[1] It is commonly agreed that scripts are vital to our social understanding of the world around us, and largely work to provide information governing social situations and events, specifically who does what, when they do it, to whom they do it, and why.[14] People use scripts in many kinds of events, such as checking into hotels. Scripts develop over time and with continuous exposure to recurring events that are all essentially similar in nature.[1] For instance, you might develop your own script of how to check into a hotel over time, and it can help you to organize and remember things, as well as react to possible upcoming events in the situation. The figure to the right is a child’s script for a hotel check-in.

Implications of these building blocks for instruction

It is important for all educators, current and future, to be knowledgeable about each component of the building blocks of cognition and about how these mental constructs work together to facilitate learning, the accumulation of knowledge, and development, as well as retrieval. With this knowledge, they can help their students make full use of these mental processes, for example by teaching “review lessons” before new curriculum in order to activate existing productions, schemata, and propositions, which facilitates the encoding of new information and prepares for easier retrieval later on. By understanding the inner workings of these mental processes, educators will better understand how learning occurs and how best to assist their students in encoding novel stimuli and information.

Encoding: How information reaches long term memory and how it is stored and retrieved

This section is a brief discussion of aspects of encoding that pertain to long-term memory. For a detailed discussion on encoding, please see the next chapter.

Information reaching long-term memory: The modal model

Figure 4. A depiction of the modal model

The modal model is one of the most widely accepted models describing how information is perceived from the environment and passes through a series of cognitive functions before it reaches the LTM. It is a general depiction, assembled from research, of the sequence in which information is transferred from our senses to short-term memory and finally to long-term memory. Based on this model, information is assumed to be processed through each of the three “lower” memory systems, each with its own separate function.[1] This model draws a clear distinction between the different memory functions and the processes between them (more details of this model are discussed in a later section outlining different theories of memory).

Storing information

Encoding is the process of transferring information from working memory into long-term memory, and it strongly affects how well something is remembered. Below are some well-known and widely used encoding and processing methods.

Rehearsal

Referring back to the modal model, rehearsal is the process by which information is kept in short-term memory, usually through constant repetition. Maintenance rehearsal employs constant repetition and recycling of information (also known as rote memorization), but it is considered a shallower method of encoding, as the information is usually kept active for only a short time and decays quite rapidly once repetition ceases. Elaborative rehearsal is a more meaningful mode of encoding, in which to-be-learned information is given meaning by being related to previously learned information. Though this form of rehearsal uses more cognitive resources, it is better for long-term retention and makes use of deeper encoding activities.[15]

Elaboration

Several elaborative encoding strategies exist, all of which make new information easier to process or remember. One of the best-known and most-used elaborative encoding strategies is the mnemonic, which engages more sophisticated coding by pairing new information with well-known information. Mnemonics typically make use of rhymes, hand gestures, acronyms, and many other devices.[1] For example, you could use the acronym “SEG” to remember your shopping list of steak, eggs and garlic. Other strategies include mediation, a simple strategy of connecting a new piece of information to something more meaningful, and imagery, which involves tying a corresponding image to something to be remembered.[1] You can also imagine the items placed around a familiar location such as your house. Imagine a strong smell of garlic when opening your living room door, a box of cracked eggs next to the door, and a piece of juicy steak on the dining table. Using this strategy, you can remember the items by taking an imaginary walk from your living room to your dining room.

Associated theories

Levels of processing theory

Influential constructivist views, especially those of Craik and Lockhart,[16] remain significant to this day. Their levels of processing theory is the best known. According to this theory, students benefit most from performing cognitive analyses on the to-be-learned information; memory of the information is retained naturally as a by-product of these processes. However, retention depends heavily on how the information was processed. According to the theory, the more deeply information is processed and the more meaning it is given, the better it is retained, while shallower processing of more superficial details tends to lead to much faster forgetting.[1] It is theorized, and widely supported by research, that participation in meaningful rather than mundane tasks helps students to better remember the information learned. Providing students with agency and choice is also beneficial for retention: studies by Jacoby and many others show that students who make decisions (especially difficult ones) during a task recall more of the task than those who make simpler decisions or none at all.[1]

Dual-Coding Theory

This theory, proposed by Allan Paivio,[17] argues that knowledge is held in long-term memory either visually or verbally, or both. It is supported by some scholars and psychologists, who agree that information is most easily remembered when it is processed and stored in both image and verbal forms.[18] As an educational implication of this theory, it may be helpful to teach students by offering, for instance, a graphic display of a human brain along with textual information when they are learning about the brain’s features and regions. This theory shares some foundation with Richard Mayer’s Cognitive Theory of Multimedia Learning,[19] which will be discussed next.

Cognitive theory of multimedia learning

Richard Mayer[19] has been exploring combinations of images and words, finding that appropriate ones can deliver the most effective instruction, especially for older students. This theory is based on three tenets: a) ideas from the Dual-Coding Theory;[17] b) the notion that the working memory has very limited capacity for storing imagery and verbal information, meaning instructions should be presented in a way that optimizes the amount of cognitive load put on students’ working memory systems, which is also referred to as the Cognitive Load Theory;[20] and c) the notion that learning entails organizing and integrating information.[21]

Information retrieval

Spread of activation

Because a vast amount of information can be stored in long-term memory, retrieving or recalling the right piece of information at the right moment may be difficult at times. Retrieval happens through a process known as spreading activation, which means that when one piece of knowledge is currently on our mind, other related pieces of information can be activated as well through the interconnected network of information in our long-term memory.[22] For example, if Ben is thinking, “how wonderful it would be if it stopped raining right now”, this might then trigger the thought of needing to check the weather forecast to see if rain would affect his field trip a week later, which could then remind him of contacting his travel mates to pick him up on that day.
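As a rough illustration only, Ben's chain of thoughts can be sketched as activation passing along links in a small network. The `spread_activation` function, the `decay` and `threshold` values, and the extra "umbrella" node are hypothetical choices made for this sketch, not part of any particular published model.

```python
def spread_activation(links, source, decay=0.5, threshold=0.1):
    """Pass activation outward from `source`, weakening it at every link.

    `links` maps each idea to the ideas it is connected to. A node stops
    passing activation on once the amount it would send drops below
    `threshold` (an illustrative assumption).
    """
    activation = {source: 1.0}
    frontier = [source]
    while frontier:
        node = frontier.pop(0)
        passed_on = activation[node] * decay
        if passed_on < threshold:
            continue  # too weak to activate anything further
        for neighbor in links.get(node, []):
            # Keep only the strongest activation a node has received.
            if passed_on > activation.get(neighbor, 0.0):
                activation[neighbor] = passed_on
                frontier.append(neighbor)
    return activation


# A toy network based on Ben's example ("umbrella" is a made-up extra link).
LINKS = {
    "rain": ["weather forecast", "umbrella"],
    "weather forecast": ["field trip"],
    "field trip": ["travel mates"],
    "travel mates": [],
}

activation = spread_activation(LINKS, "rain")
for idea, level in activation.items():
    print(f"{idea}: {level}")
```

In this sketch, ideas closer to the starting thought end up more strongly activated than distant ones, which mirrors the intuition that closely related information comes to mind more readily.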

Reconstruction

Because certain pieces of memory, such as events that happened a long time ago, may be difficult to recall, our cognitive system might use any relevant clues we can remember and reconstruct these pieces of memory through logic, which might produce memories that are not identical to the exact occurrences but are logical and reasonable.[23] For instance, if we went for a picnic near a lake with friends 10 years ago, we might be able to recall the trip but not its purpose, and we might say that it was a hiking trip around the lake instead, which shares some similarities with the original event but is not identical to it.

Forgetting

If information is not accessed for a long time, we may eventually no longer be able to retrieve it. This could happen through either decay, the weakening of the information signal, or interference, in which other conflicting information interferes with the piece of memory that we are trying to recall.[24] For example, we might no longer remember what T-shirt we wore at a concert because it has been a few years since then (decay), or we might be unsure because we think it was a blue one while a friend recently mentioned that it was in fact green (interference). One neurological explanation of this is that our brain cells and the connections between them can become weak and even die if we do not use them enough.[22]

Despite the processes of decay and interference, knowledge can be stored in the long-term memory for extremely long periods of time, especially with appropriate kinds of prompts and other ways of remembering information.[25] These include techniques mentioned earlier, such as using mnemonics and elaboration.

Expertise and automaticity of skills

Explicit or declarative knowledge can be acquired and built through many processes such as instruction, experience, and adopting cognitive strategies to remember information (mentioned earlier).

Some scholars have argued that declarative knowledge can be transformed into procedural knowledge as one becomes more skillful at a task with practice and experience, essentially meaning that the deployment of explicit knowledge becomes so automatic that it turns into an implicit skill.[22][13] For example, when we try to wrap a gift for the first time, we might try to articulate each step of the process, such as: find a piece of wrapping paper of the right size for the gift; wrap it around the gift; cut the excess paper; use tape to secure the wrapping. These steps become automatic as we perform the task over and over, essentially eliminating the need to give extensive consideration to each step individually. More detailed examples of this can be found in the later section on the ACT-R theory.

Long-term memory and learning: Fostering higher encoding processes

Higher encoding processes are typically activated when one encodes more complex information, and they usually contribute more to higher-level educational and learner goals.[1] Instructors should try to foster such processes. As shown earlier, students tend to perform much better the more elaborately they encode the to-be-learned information. Through methods such as activating prior knowledge and guided peer questioning, instructors can activate relevant schemata in students and provide opportunities for comprehension and for asking thought-provoking questions. Activating prior knowledge helps to prepare learners for new learning activities: a base of already-known information can help to guide the new to-be-learned information.[1] For retention, instructors can encourage students to practice certain tasks until they gain automaticity.[13][1] As much as possible, instructors should involve students in their own learning to encourage active, rather than passive, learning.

The functions of long-term memory: Assessment and research

Memories gathered over a longer period of time have a greater chance of being retained long-term, but the quality of the memory is just as important as quantity. Quality can refer to the sensory information gathered by the individual during the experience, like smelling popcorn at the movie theater. Such quality components can be linked bidirectionally: smelling popcorn may bring the movies to mind, and being at the movies may bring back the taste of popcorn.

The majority of research in this field relies on self-evaluation or individual memory testing, both of which have considerable margins of error, though functional magnetic resonance imaging (fMRI) has been used to view the activity of an individual’s brain noninvasively. Anderson, Fincham, Qin, & Stocco[26] used this technique in an experiment to find the link between the brain’s cortical regions and four functions: procedural execution, goal setting, controlled retrieval from declarative memory, and the construction of image representations. The findings showed that each of these four functions activated a different cortical region. This evidence suggests that different areas of the brain handle these different functions, but critiques of the technique highlight that we still do not know why this activity occurs or what connections are being formed to cause the pattern of activity. Despite these limitations, experiments of this kind give us greater insight into brain activity than we previously had, and show how different information can stimulate different areas of the brain, which are not all active all the time.

Other changing and growing theories of memory

Network models

Figure 5. An example of a network model

Network models could be compared to mind mapping or a brainstorming web, as information is represented by a web-like pattern, generally moving from the general to more specific information or categories. This is similar to the way in which a small child slowly develops the ability to differentiate between animals that have four legs and fur, learning that a dog and a cat belong to different classifications. Network models are one of the simpler ways to organize small units of information that are related to other pieces within a topic. This model has been used directly in teaching: “Mind mapping directed the students’ attention to plan, monitor, and evaluate their learning processes, which helped them to obtain metacognitive knowledge and transfer their understanding to solve novel problems and situations.”[27]

The Connectionist Model

Figure 6. A general model of what a Connectionist Model might look like

The Connectionist Model is a ‘brain metaphor’ offered as an alternative to the traditional computer metaphor of information processing, storage, and retrieval;[1] it is also referred to as the parallel distributed processing model.[1] This model includes the concept of understanding based on context. An example would be a shape made of a straight line on the left with a ‘3’ shape on the right. In the series ‘12 |3 14’ this would be seen as the number thirteen, but in the sequence ‘A |3 C’ it can be read as the letter ‘B’. The connectionist model was developed to better encompass this adaptability to context and the ability to combine cognitive tasks with physical attributes. The theory views human thought as a multitude of parallel processes, as the human brain is able to consider multiple lines of thought at the same time and in ways that a computer would not think to compare or connect. As mentioned previously, other models take a store-and-retrieve view of recovering information, in which the pattern of information connections is stored and recovered when needed. In contrast, the Connectionist Model theorizes that the elements of the pattern are stored as the strengths of their connections, from which the pattern is retrieved and reconstructed.[28] On this topic, Vickers and Lee make an important point: “connectionist accounts of semantic or meaningful information are based on conceiving of meaning as activation of a limited number of features, at least at the input layer.”[29] This means that the theory works best when the information carries meaningful depth rather than consisting of memorized facts alone.

Production-rule-related theories of memory

In the study of the human cognitive system, productions (sometimes referred to as production rules) are rules for reaching a particular goal or solving a problem. They are commonly considered components of our long-term memory (see the Productions in cognitive psychology section below). Essentially, each production can be considered one single guiding step in the thinking process. It is commonly represented as a prescription of what actions to take under what conditions: a “condition-action” or “if-then” sequence.[13][30] For instance, a production within the overarching goal of frying an egg could be depicted as:

IF the goal is to fry an egg,

and the raw egg has been removed from its shell,

and the pan has been heated to reach the right temperature,

THEN place the raw egg in the pan.

In this situation, the production guides the course of action depending upon the condition(s). Once the conditions have been met (the egg has been removed from its shell; the pan has reached the right temperature, etc.), the rule becomes applicable and the action (putting the egg in the pan) is performed.
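As a rough illustration only (not Anderson's own formalism), the condition-action structure of the egg production can be sketched in Python. The `Production` class, the `fire` helper, and the fact strings are hypothetical names chosen for this sketch, and the current situation is assumed to be a simple set of facts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Production:
    """One if-then rule: a set of conditions and the action they trigger."""
    name: str
    conditions: frozenset  # every one of these facts must currently hold
    action: str

    def matches(self, situation: set) -> bool:
        # The rule applies only when all of its conditions are present.
        return self.conditions <= situation


def fire(production: Production, situation: set):
    """Return the action if the rule is applicable, otherwise None."""
    return production.action if production.matches(situation) else None


fry_egg = Production(
    name="fry-egg",
    conditions=frozenset({
        "goal: fry an egg",
        "egg removed from shell",
        "pan at right temperature",
    }),
    action="place the raw egg in the pan",
)

# The pan is still cold, so one condition is missing.
cold_pan = {"goal: fry an egg", "egg removed from shell"}
# Once the pan heats up, every condition is met.
hot_pan = cold_pan | {"pan at right temperature"}

print(fire(fry_egg, cold_pan))
print(fire(fry_egg, hot_pan))
```

The sketch treats a production as inert until every condition is satisfied, which is the sense in which the rule "becomes applicable" in the explanation above.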

Key features

Important features of productions include that, as mentioned previously, each production can be thought of as one rule or step, which can be learned separately from other productions.[13] Because of this, when an elaborate and complex skill or cognitive process is acquired, it likely means that the entire series of productions that constitutes the skill has been learned: connected subgoals are strung together to achieve an overarching goal.[13] For instance, in the egg frying example, the cooking process could be preceded by another task such as locating the nearest grocery store and going there to buy eggs, which is a subgoal within the overall goal of cooking the egg. Of course, the number of productions in a process depends on its complexity.
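To make the idea of subgoals strung together more concrete, here is a minimal sketch in which several independent productions fire one after another until the overall goal is reached. The `run` function, the rule names, and the fact strings are all hypothetical, and the simple "first applicable rule fires" policy is an assumption of this sketch rather than a claim about how any cognitive architecture selects rules.

```python
def run(productions, facts, goal):
    """Fire applicable productions one at a time until `goal` is a known fact.

    Each production is a (name, conditions, result) tuple: when all
    `conditions` are known facts, firing the rule adds `result` as a new fact.
    """
    facts = set(facts)
    fired = []
    while goal not in facts:
        for name, conditions, result in productions:
            if conditions <= facts and result not in facts:
                facts.add(result)
                fired.append(name)
                break  # re-scan the rules with the updated facts
        else:
            break  # no rule applies, so the goal cannot be reached
    return fired, facts


# Each rule is listed on its own and in no particular order; the sequence
# emerges from which conditions happen to be satisfied at each step.
PRODUCTIONS = [
    ("fry egg", {"egg cracked", "pan hot"}, "egg fried"),
    ("heat pan", {"have eggs"}, "pan hot"),
    ("crack egg", {"have eggs"}, "egg cracked"),
    ("buy eggs", {"at grocery store"}, "have eggs"),
    ("go to store", {"want fried egg"}, "at grocery store"),
]

fired, facts = run(PRODUCTIONS, {"want fried egg"}, "egg fried")
print(fired)
```

Because every rule stands alone, adding or removing one production changes the skill without rewriting the others, which loosely parallels the point that productions can be learned separately.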

Another important feature is that production rules are abstract in nature and can apply across different task situations of similar nature.[13] For example, the aforementioned productions for frying the egg could also be applied to frying vegetables, which would involve the same contingency on the condition of the pan being hot enough and then the procedure of putting the vegetables in the pan.

In addition, productions can be specific to a domain of practice, such as algebra within mathematics, or relatively general, such as those for using a vacuum cleaner.[31]

Productions in cognitive psychology

Typically, in cognitive psychology, a dichotomy of declarative knowledge (or declarative memory) versus procedural knowledge is used to distinguish between the types of knowledge, experience, or skill that we all possess in long-term memory. Declarative knowledge refers to ideas or propositions that can be explicitly stated or articulated whereas procedural knowledge simply refers to skills or actions that can be performed to achieve a goal. Procedural knowledge is often difficult to express in words. In this sense, this dichotomy of declarative versus procedural can also be referred to as explicit versus implicit knowledge or memory.

With this context in mind, productions usually fall under the implicit, procedural knowledge category. In fact, production rules are often described as the contents of procedural knowledge or as the “embodiment of the skill”,[13] because they are individual steps for guiding a course of action or cognition. Essentially, in simpler terms, productions are about “how to do things”,[24] which is what procedural knowledge is about.

In general, with practice and more experience, a skill becomes more automatized, meaning that the productions that constitute the skill fire faster and more consistently. As this happens, the performer becomes less conscious of each individual production and gradually comes to perceive the sequence of firing productions as a single fluid action.[32]

Evidence for production rules

In making the argument that production rules are psychologically real, Anderson asserts that the first piece of evidence is that production rules are well suited to describing multiple aspects of skills and cognitive tasks in progress.[13] That is, they provide a logical and plausible explanation of how tasks are performed. Another significant piece of evidence Anderson cites is that, using production rules, we are able to predict aspects of a person’s behavior as a skill or task is being performed.[13] For example, when we observe that a cook’s pan has become hot (the condition), we can expect to see the cook put food into it (the action).

The ACT-R Model: A model of cognition and long-term memory based on production rules

A highly prominent theory which reflects the application of production rules is the Adaptive Character of Thought-Rational (ACT-R) theory, John Anderson's theory of human cognition that uses production rules as the building blocks of cognitive processes. The central argument posited by the ACT-R theory is that a complex cognitive skill comprises a large number of individual “units of goal-related knowledge”.[33]

History of ACT-R

The theory originally stemmed from the Human Associative Memory (HAM) theory, which John Anderson co-created and which explained certain aspects of human memory and knowledge. HAM involved the notion of declarative knowledge but did not deal with procedural knowledge.[34]

Using that as a foundation, John Anderson then proposed that procedural knowledge consists of production rules. After fine-tuning a few variants of his theory, he established the original ACT theory in 1983, which was aimed at explaining a wide range of cognitive processes.[13]

Subsequently, after taking into account more evidence and emerging data on cognitive skills, Anderson believed that an element of rational analysis in the cognitive process should be integrated with the somewhat “mechanistic” nature of the original ACT theory,[13] and therefore he created the ACT-R (R for rational) theory,[13] which he felt was an improvement over the original due to its greater adaptive and selective ability towards the environment.[30][13]

The theory’s initial focus was on human memory and cognition, but much of its most prominent application and development has been in computer model tutors (intelligent tutors). Briefly, these are computer programs that guide learners in problem solving by referring to production-rule-based models that generate solutions to such problems.[33] These computer tutors are mainly developed and used in domains such as mathematics and science. More details of ACT-R’s applications as tutoring systems will be discussed in a different chapter.

Key tenets of ACT-R

There are three fundamental ideas that frame the theory: a) the representation of these knowledge units; b) their acquisition; c) their deployment in cognitive processes.[34] These are discussed below.

Representation of knowledge

One central tenet of the theory is that cognition involves both the element of declarative knowledge (which is propositional, semantic knowledge, as mentioned earlier) and procedural knowledge (which is represented as production rules), and that the two work closely together in cognitive processing.[34]

Declarative knowledge is encoded, stored, and represented in chunks, or individual units of human memory that resemble schemas (knowledge structures). Each chunk contains propositions or descriptive features about the subject item, stored in slots,[30] including what larger category it belongs to. Chunks can be represented in either a textual or a graphical format.[34] For example, a chunk about frying an egg might be textually represented as: frying an egg is a type of cooking skill; requires the egg to be removed from its shell prior to cooking; requires heat. Figure 7 shows a possible graphical representation of a chunk.

Figure 7. Graphical representation of a chunk
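
As a rough illustration of the textual form, a chunk can be written as a small record whose slots hold its descriptive features. The sketch below is an assumption-laden example rather than ACT-R's actual notation: the `Chunk` class and the slot names `preparation` and `requires` are hypothetical labels chosen for the egg example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Chunk:
    """A declarative knowledge unit: a name, its larger category, and feature slots."""
    name: str
    isa: str  # the larger category the item belongs to
    slots: Dict[str, str] = field(default_factory=dict)

    def describe(self) -> str:
        # Render the chunk in a textual format, one clause per slot.
        parts = [f"{self.name} is a type of {self.isa}"]
        parts += [f"{slot}: {value}" for slot, value in self.slots.items()]
        return "; ".join(parts)


# Slot names below are hypothetical labels for the features given in the text.
frying_an_egg = Chunk(
    name="frying an egg",
    isa="cooking skill",
    slots={
        "preparation": "egg removed from its shell prior to cooking",
        "requires": "heat",
    },
)

print(frying_an_egg.describe())
```

Each slot holds one proposition about the item, and the `isa` field records the larger category to which the chunk belongs.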

On the other hand, procedural knowledge is represented as a set of productions.[34] As previously discussed, the productions can take the form of an interconnected series of subgoals that are aimed at reaching an overarching goal.

The relationship between the two types of knowledge is that the chunks of declarative knowledge provide the conditions and courses of action necessary for productions to happen. For instance, in order to cook eggs, one must possess the knowledge chunks for buying them, removing them from their shells, preparing the pan, and so on. Without one of these, there will be a gap in the procedural knowledge. Therefore, having more chunks of knowledge implies more available production rules and better procedural knowledge. Declarative knowledge can also be considered to be transformed into procedural knowledge, as will be discussed in the next section.

Acquisition of knowledge

Declarative knowledge is acquired in a fairly straightforward way, either from the perception of information or ideas from the environment[34] or directly from instruction (being given information).[33] Since declarative chunks of knowledge are required in order to inform productions, this tenet of the ACT-R theory implies that having the knowledge to perform a cognitive or procedural task basically entails gathering all individual chunks of information that the task needs – the task is a “sum of its parts”.[34] Therefore, complex tasks require the collection of many chunks.

The acquisition of production rules in procedural knowledge, on the other hand, is slightly more difficult and less straightforward, since they cannot simply be told or articulated. Essentially, they are learned only as declarative knowledge is deployed. This means learners acquire production rules when they do tasks, not simply when they are given declarative information. It is key to note that this deployment can only occur in the appropriate contexts and conditions for the productions to take place. When the conditions for performing a task are appropriate, goal-oriented cognitive activities can take place, in which declarative chunks are put into action (or “executed”) in succession. In this way, it can be considered that they are essentially converted into production rules to guide the person’s actions towards the goal.[33] With practice, this process of conversion can be improved or strengthened in terms of speed and accuracy.[33] Thus, providing opportunities for practice and feedback is one highly conducive way to foster the acquisition of production rules.[33]

Deployment context of knowledge

This aspect concerns ACT-R’s explanation of how our cognitive structure is able to summon the right type of knowledge for a certain context of task or problem-solving. This is the function of rational analysis – the “R” part of the theory’s name. The process of rational analysis identifies two elements in order to determine the right chunks and production rules to be activated in the mind: a) the chances that such knowledge has worked well in such a situation in the past; b) the chances that such knowledge is likely to work well in the situation at hand. Combining these two factors, this selective process recognizes the likelihood of a piece of knowledge being appropriate and applicable in a given task context.[34]

In essence, this also implies that the human cognitive system maintains a record of what kinds of knowledge have been appropriate in what kinds of tasks, although this is likely to be a subconscious process in the mind. Thus, the theory’s explanation of this aspect basically describes a statistical process.[34]
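
One simplified way to picture this statistical process is as a Bayesian-style combination, in which odds based on past usefulness are multiplied by a factor reflecting the fit to the present situation. The sketch below is an assumption for illustration only, not the theory's actual equations: the function names, the two candidate pieces of knowledge, and all numbers are hypothetical.

```python
def posterior_odds(prior_odds: float, likelihood_ratio: float) -> float:
    """Combine history (prior odds) with the current context (likelihood ratio)."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds: float) -> float:
    """Convert odds to a probability between 0 and 1."""
    return odds / (1 + odds)


# Hypothetical candidates: (odds it was useful in the past,
#                           how strongly the current context favors it).
candidates = {
    "take-direct-route": (3.0, 0.5),
    "head-toward-closer-point": (1.0, 4.0),
}

scores = {
    name: odds_to_probability(posterior_odds(prior, ratio))
    for name, (prior, ratio) in candidates.items()
}

# The piece of knowledge judged most likely to be useful is the one deployed.
best = max(scores, key=scores.get)
print(scores)
print(best)
```

In this toy example the second candidate wins even though it was less useful in the past, because the current context favors it strongly.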

Summary of ACT-R’s theoretical aspects

In short, according to the ACT-R theory, declarative knowledge is encoded by perception of information in one’s surrounding environment (including the instructions that a student receives from teachers, parents, peers, and others); procedural knowledge is developed as a result of learning to deploy such declarative knowledge (often many units of it in succession) in the context of performing a task or solving a problem; and the selection of the right type of knowledge to deploy in a given situation happens according to the cognitive system’s estimate of how likely a piece of knowledge is to be useful and appropriate.

Applications and empirical evidence for the ACT-R theory

Among a number of ACT-R’s applications thus far, a pair of experiments conducted by Anderson and his colleagues yields significant empirical evidence that supports the theory. These experiments studied how university undergraduates worked out most efficient routes from starting points through various mid-way points to final destinations on a map (of the city of Pittsburgh, Pennsylvania), taking into account factors such as cost and time.[13] The experiments were done by monitoring students as they looked at the map on a computer screen and clicked on the locations/mid-way points they wanted to move to or pass through in order to reach the final destinations.

The findings from the subjects were then compared to the “thinking processes” and solutions produced by computer models based on the ACT-R theory (using sets of production rules) to solve the same navigation problems. The table below[13] shows some examples of the model’s production rules used to determine routes in this navigation task:

IF the goal is to find a route from location1 to location2,
and there is a route to location3, and location3 is closer to location2,
THEN take the route to location3,
and plan further from there.

IF the goal is to find a route from location1 to location2,
and there is a route from location1 to location2,
THEN take that route.

Table 2. Examples of the production rules to determine routes in the navigation task
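
To make the two rules in Table 2 concrete, the following Python sketch applies them to a tiny invented map. It is a minimal illustration under stated assumptions, not the model used in the experiments: the locations, coordinates, and road links are hypothetical, and choosing the closest candidate when several locations qualify is an added assumption.

```python
import math

# Hypothetical map: coordinates and road links are invented for illustration.
coords = {"A": (0, 0), "B": (2, 1), "C": (4, 0), "D": (6, 1)}
roads = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}


def distance(a: str, b: str) -> float:
    """Straight-line distance between two locations."""
    return math.dist(coords[a], coords[b])


def plan_route(start: str, goal: str):
    """Repeatedly fire whichever production matches until the goal is reached."""
    route = [start]
    current = start
    while current != goal:
        if goal in roads[current]:
            # Second rule: IF there is a route to the goal THEN take that route.
            current = goal
        else:
            # First rule: IF there is a route to a location closer to the goal
            # THEN take it and plan further from there.
            closer = [
                n for n in roads[current]
                if distance(n, goal) < distance(current, goal)
            ]
            if not closer:
                return None  # no production applies
            # Assumption: among several candidates, pick the closest one.
            current = min(closer, key=lambda n: distance(n, goal))
        route.append(current)
    return route


print(plan_route("A", "D"))
```

Each pass through the loop corresponds to one production firing, which is the unit at which a model's choices can be compared with a person's choices.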

The ACT-R model’s way of thinking was compared to that of the undergraduate students one step at a time: each choice of route (each single production) made by the model was put alongside each choice made by each student (the mid-way points they clicked on). The results showed that the ACT-R model’s route decisions matched those of the students “67% of the time”,[13] and even when they did not, they matched students’ second or third top choices, closely paralleling the way human subjects behaved cognitively.[13]

Another important finding was that the computer model’s latency (number of seconds taken) in making route choices was very similar to the decision paces of the subjects. Even though this finding was a relatively general correlation, it likely supports the ACT-R theory’s ideas regarding the time required by the human cognitive system to make judgments based on production rules in performing cognitive tasks (that is, to consider and evaluate different route choices before making decisions).[13]

The final result which is consistent with the ACT-R theory was that, with practice over the span of the experiments (about one week), the human subjects improved in their speed of optimizing route planning, likely supporting the principle of strengthened production rules in improved task performance.[13]

Anderson notes the importance of this map navigation activity and its evidence in supporting the ACT-R theory,[13] arguing that it involves a real-life task in which people need to consider real factors and consequences such as cost and time, as opposed to basing the experiments on abstract, academic problems such as mathematical ones, where there is little implication related to true situations. In addition, such a task of finding different routes to reach destinations involves more than one solution, meaning that solutions are of different degrees of success, which makes this a more realistic test of whether the ACT-R model is true to the human cognitive system.

Aside from these experiments, other highly significant empirical support for the ACT-R theory includes the work that has been invested in Intelligent Tutoring Systems (ITS), which is explored in detail by scholars such as Ritter and his colleagues.[35] Although not within the scope of this chapter, further discussions regarding computer tutoring systems will be carried out at length in a different chapter.

Instructional implications of the ACT-R theory

Based on the aforementioned tenets and features of the theory, Anderson et al.[33] provide a list of principles for designing tutoring systems, some of which may also be applicable to instructional design in general, including ideas such as: a) representing a skill as a set of productions; b) clarifying subgoals (productions) in solving a problem; c) providing instruction specific to certain problem contexts while also promoting transferable production rules; d) focusing only on necessary production rules to reduce memory load; e) providing instruction of appropriate granularity depending on how fine-grained production rules need to be in a task; f) retracting instructional assistance appropriately as learners gain competence.

For more detailed information on the Cognitive Tutor and its effectiveness, see the chapters Problem Solving, Critical Thinking, and Argumentation (2.5 Cognitive Tutor for problem solving) and Learning Mathematics (4.5 Cognitive Tutor for teaching algebra).

Criticisms of the ACT-R theory and responses

Since the ACT-R theory maintains that acquiring or understanding a skill (or cognitive task) simply entails learning the individual productions that constitute it, it has faced the criticism from a constructivist learning point of view that the understanding of knowledge or skills is constructed by the learner him/herself, rather than achieved in a pre-specified manner.[13]

This, coupled with the theory’s idea that learners’ answers or solutions to a problem should conform to certain sets of production rules, has led critics to view it as taking a somewhat behaviorist approach to cognition,[13] one that stifles elements such as metacognition. Anderson et al.[33] respond that ACT-R’s approach resembles a behaviorist one in that instruction should focus on breaking down a skill or task into components, but they argue that ACT-R represents the task in a more abstract way (likely more transferable between contexts) than typical behavioral methods.

In addition, John Anderson has acknowledged that, since a key educational or instructional design implication of the theory is to foster the acquisition of individual production rules in order to accomplish a task, the primary emphasis of the theory could be considered to be efficiency in learning.[13] Unsurprisingly, this priority might seem questionable to those who value learning depth or richness rather than efficiency or speed.[13] Anderson’s response is that depth in learning can simply be interpreted as enriching declarative and production (procedural) knowledge, which entails practice and feedback.[13]   

Glossary

Assignment of meaning
When meaning is assigned to a perceived stimulus
The development of a skill to an automatic level where it becomes an implicit process that does not require much thought
Cognitive Theory of Multimedia Learning
Mayer’s theory based on the Dual-coding Theory, the notion that cognitive load must be managed in learning, and the notion that learning entails organizing and integrating information
A way of sorting mental information into meaningful categories and structures; A “building block of cognition”
Conditional knowledge
Knowledge of different strategies and when and why to use them; The knowledge of “knowing why”
Declarative knowledge
Factual knowledge such as knowing capital cities and algebra formulas; The knowledge of “knowing what”
Dual-Coding Theory
Paivio’s theory that providing information in both visual and textual format may benefit learning
Episodic memory
Memory that is specific to each individual’s personal experiences
Essential learning
Cognitive demands that are necessary for understanding the to-be-processed information
Extraneous cognitive load
Anything that causes cognitive load outside of the original cognitive task
fMRI
Functional magnetic resonance imaging. A neuroimaging technology that is able to monitor brain activity by detecting changes in blood flow to activated areas of the brain
memories cannot be recalled
Incidental processing
Cognitive demands that are useful for understanding the to-be-processed information, but not entirely necessary
Intrinsic cognitive load
The cognitive load required of any given task
Long-term memory
Memory that is developed over days, months, years and/or decades of time. The permanent accumulation of memory developed over a lifetime
Procedural knowledge
Knowledge of how to complete daily tasks, such as driving a car, skiing, or making coffee; The knowledge of “knowing how”
An extremely common/prominent concept
When information previously stored in short- or long-term memory is remembered
When information previously stored in short- or long-term memory is reconstructed at recall, but not remembered exactly
Referential holding
When one holds information temporarily within working memory while other information is simultaneously being processed
Rehearsal
Cognitive repetition which allows information to remain active in short- or long-term memory
Retrieval
The act of transferring information out of long-term memory and into working memory
Scaffolding
A temporary framework of supports while an object (or schemata) is “under construction” that is taken away when completed and the support is no longer needed
Schema or Schemata
Cognitive structure(s) that help organize knowledge and guide thinking, perceptions and attention
Semantic memory
Nonspecific memory of general concepts and procedures; Not related to specific individual events or experiences
Sensory register
A cognitive function within the working memory in which perceived input is stored to receive meaning
Spreading activation
The recall of an idea triggered by the recall of another associated idea

References

  1. Bruning, R., & Schraw, G. (2011). Cognitive psychology and instruction (5th ed.). Pearson Education
  2. Khajah, M. M., Lindsey, R. V., & Mozer, M. C. (2014). Maximizing students' retention via spaced review: Practical guidance from computational models of memory. Topics in Cognitive Science, 6(1), 157-169. doi:10.1111/tops.12077
  3. Bonner, M. F., & Grossman, M. (2012). Gray matter density of auditory association cortex relates to knowledge of sound concepts in primary progressive aphasia. The Journal of Neuroscience, 32(23), 7986-7991.
  4. Chi, M.T.H., de Leeuw, N., Chiu, M., & La Vancher, C. (1994). Eliciting self-explanations improves understanding. Cognitive Science, 18, 439-477.
  5. Bruner, J.S., Goodnow, J.J., & Austin, G.A. (1956). A study of thinking. New York, NY: Wiley
  6. Remue, J., De Houwer, J., Barnes-Holmes, D., Vanderhasselt, M., & De Raedt, R. (2013). Self-esteem revisited: Performance on the implicit relational assessment procedure as a measure of self- versus ideal self-related cognitions in dysphoria. Cognition and Emotion, 27(8), 1441-1449. doi:10.1080/02699931.2013.786681
  7. Anderson, J.R. (2005). Cognitive psychology and its implications (6th ed.). New York: Worth
  8. Jui-Pi Chien. (2014). Schemata as the primary modelling system of culture: Prospects for the study of nonverbal communication. Sign Systems Studies, 42(1), 31-41. doi:10.12697/SSS.2014.42.1.02
  9. Le Grande, M. R., Elliott, P. C., Worcester, M. U. c., Murphy, B. M., Goble, A. J., Kugathasan, V., et al. (2012). Identifying illness perception schemata and their association with depression and quality of life in cardiac patients. Psychology, Health & Medicine, 17(6), 709-722. doi:10.1080/13548506.2012.661865
  10. Sternberg, R. J., & Sternberg, K. (2012). Cognitive psychology (6th ed.). Belmont, CA: Wadsworth.
  11. van Kesteren, Marlieke T. R., Rijpkema, M., Ruiter, D. J., Morris, R. G. M., & Fernàndez, G. (2014). Building on prior knowledge: Schema-dependent encoding processes relate to academic performance. Journal of Cognitive Neuroscience, 26(10), 2250-2261. doi:10.1162/jocn_a_00630
  12. Rumelhart, D.E. (1981). The building blocks of cognition. In J.T Guthrie (Ed.), Comprehension and teaching: Research reviews (pp. 3-26). Newark, DE: International Reading Association.
  13. Anderson, J. R. (1993). Rules of the mind. Hillsdale, NJ: Lawrence Erlbaum.
  14. Trillingsgaard, A. (1999). The script model in relation to autism. European Child & Adolescent Psychiatry, 8(1), 45. Retrieved from
  15. Craik, F.I.M. (1979). Human memory. Annual Review of Psychology, 30, 63-102.
  16. Craik, F.I.M., & Lockhard, R.S. (1986). CHARM is not enough: Comments on Eich's model of cued recall. Psychological Review, 93, 360-364.
  17. Paivio, A. (1986). Mental representations: A dual-coding approach. New York, NY: Oxford University Press.
  18. Butcher, K.R. (2006). Learning from text with diagrams: Promoting mental model development and inference generation. Journal of Educational Psychology, 98, 182-197.
  19. Mayer, R.E. (2001). Multimedia learning. New York, NY: Cambridge University Press.
  20. van Merrienboer, J.J.G., & Sweller, J. (2005). Cognitive load and complex learning: Recent developments and future directions. Educational Psychology Review, 17, 147-177.
  21. Mayer, R. E. (2008). Applying the science of learning: Evidence-based principles for the design of multimedia instruction. Cognition and Instruction, 19, 177–213.
  22. Anderson, J.R. (2010). Cognitive Psychology and its implications. (7th ed.). New York, NY: Worth.
  23. Koriat, A., Goldsmith, M. & Pansky, A. (2000). Toward a psychology of memory accuracy, In S.Fiske (Ed.), Annual review of psychology, (pp. 481-537). Palo Alto, CA: Annual Reviews.
  24. Woolfolk, A., Winne, P. H., & Perry, N. E. (2016). Educational Psychology (Custom Edition). Toronto, ON: Pearson Education.
  25. Erdelyi, M.H. (2010). The ups and downs of memory. American Psychologist, 65, 623-633.
  26. Anderson, J.R., Fincham, J.M., Qin, T., & Stocco, A. (2008). A central circuit of the mind. Trends in Cognitive Psychology, 12, 136-143.
  27. Ismail, M. N., Ngah, N. A., & Umar, I. N. (2010). The effects of mind mapping with cooperative learning on programming performance, problem solving skill and metacognitive knowledge among computer science students. Journal Of Educational Computing Research, 42(1), 35-61. doi:10.2190/EC.42.1.b
  28. McClelland, J. L. (1988). Connectionist models and psychological evidence. Journal of Memory and Language, 27, 107-123.
  29. Vickers, Douglas, & Lee, Michael D. (1997). Towards a dynamic connectionist model of memory. Behavioral and Brain Sciences, 20, 40-41. doi:10.1017/S0140525X97460016
  30. Anderson, J. R., & Matessa, M. (1997). A production system theory of serial memory. Psychological Review, 104(4), 728-748. Retrieved from
  31. Anderson, J. R. (1990). Cognitive psychology and its implications. New York, NY: Freeman.
  32. Schraw, G. (2006). In P. A. Alexander & P. H. Winne (Eds.), Handbook of educational psychology (pp. 825-847). Mahwah, NJ: Erlbaum.
  33. Anderson, J. R., Corbett, A. T., Koedinger, K. R., & Pelletier, R. (1995). Cognitive tutors: Lessons learned. The Journal of the Learning Sciences, 4(2), 167-207. Retrieved from:
  34. Anderson, J. R. (1996). ACT: A simple theory of complex cognition. American Psychologist, 51(4), 355-365. Retrieved from
  35. Ritter, S., Anderson, J. R., Koedinger, K. R., & Pelletier, R. (2007). Cognitive tutor: Applied research in mathematics education. Psychonomic Bulletin & Review, 14(2), 249-255. Retrieved from http://