Chapter 2

Consolidation

Introduction: Defining consolidation

One of the difficulties in studying consolidation is that it is not well defined. Consolidation, as it is generally referred to, is what happens between the time that an event occurs and when the memory for the event becomes permanent. In this chapter the focus will be on the time course of consolidation and the experimental factors which impact it; throughout the rest of the dissertation the focus will tend to be on the process involved and the physical factors which account for it. Most consolidation studies focus on two major issues: 1) What is the time course of consolidation? and 2) What is the process involved? That there is a time course at all might be somewhat surprising in the light of many connectionist models in which learning comes in the form of changes of connection strength between neurons; such changes could easily occur in a fraction of a second. Part of the difficulty of determining the time course of consolidation appears to be that there are at least three different events that impact the permanence of memory. Because all of these events impact memory it is easy to confuse their effects, and the consolidation literature is sometimes seen as being full of contradictory results. With care, however, a separation between the three events can be made and a clean study of each is possible. These three events can be broken into temporal categories ranging from the very short to the extremely long term.

The shortest duration event is the one that will be referred to as consolidation in this dissertation. The evidence for this type of consolidation comes from a large body of experiments showing that factors ranging from psychological intervention to electroconvulsive shock can interfere with the memory for an event, and that the degree of interference depends on the amount of time between the event and the interfering factor: the shorter the interval, the greater the interference. The time course of this type of consolidation is on the order of a few seconds. The physical changes associated with this type of consolidation form the basis of the learning system discussed in subsequent chapters.

The second event which impacts the permanence of learning involves the chemical process required to make the physical changes, whether they come in the form of changes in synaptic efficiency or the coding of RNA sequences. For the remainder of the dissertation this will be called secondary consolidation. There is evidence to show that this process takes substantially longer than what I am calling consolidation, perhaps on the order of half an hour or longer (for reviews discussing the differences between the two types of consolidation see Landauer, 1964; Miller and Marlin, 1984). One of the ways of distinguishing this event from consolidation involves the types of interference possible. Whereas simple interference, such as that resulting from psychological manipulation, is possible with consolidation, the longer chemical process is not susceptible to such interference; more substantial interventions are necessary, generally involving blocking the chemical process of the synaptic change.

A third type of event has been labelled as consolidation and involves very long periods of time, on the order of years. The evidence for this event comes from patients with damage to the hippocampus. It has been found that damage to the hippocampus selectively impairs memories; the more recent the memory, the less likely it is to be retrievable once the hippocampus has been damaged. Very old memories, on the other hand, can still be retrieved even after such damage. The most popular current theory posits that memories consolidate in the hippocampus and that the process takes extremely long periods of time. Other explanations are plausible, however. It is possible, for example, that the hippocampus is a kind of interface between an organism's perception of the world and its memories. In such a scheme damage to the hippocampus would impair the ability to retrieve memories, but not the memories themselves. The availability of old memories may be due to a kind of bypass mechanism; when a memory has been reactivated enough times, it may no longer require the hippocampus. Such a theory is in accordance with research done on cognitive maps where it has been shown that the hippocampus is central to the processing of spatial memories (O'Keefe, 1989; Squire, 1992). Regardless, damage to the hippocampus is quite specific and not strictly interesting from a credit assignment point of view because there is no evidence that the "consolidation" process in the hippocampus has any effect on the strength of learning. On the contrary, it is well known that the strength of learning, as measured by the ability to recall, diminishes with time; therefore, while the hippocampus may well play a role in how a memory is retrieved, there is no reason to believe that it is a factor in the organism's ability to retrieve a memory beyond providing the retrieval mechanism.

The two longer term events are called consolidation because the right kind of interventions during the time course of these events can affect an organism's ability to recall a memory. This is also the case for short term consolidation, but the important difference lies in the types of interventions possible, which, in this case, include psychological interventions. This susceptibility to psychological intervention means that the cognitive system itself can impact the strength of learning. The implications of this fact are enormous; it means that the cognitive system does not have to treat events neutrally. In terms of the prior discussion of credit assignment, this susceptibility provides one method for the cognitive system to translate importance into effects on learning.

Such a theory of consolidation is not traditional. Many researchers, including most consolidation theorists, view consolidation as a postprocessing event. Because consolidation is generally viewed as fundamentally separate from information processing, and because it is considered a passive, "unconscious" operation, many researchers in learning and memory have not found it to be a useful construct (Weingartner and Parker review this position while arguing against it (1984)). This dissertation will take exactly the opposite position, presenting a model of consolidation based on a radically different perspective. The claim is that consolidation is not separate from processing, but rather that consolidation stems directly from processing; as such, consolidation is an active operation which is fundamentally tied to all of the factors which affect processing. Further, it will be shown that consolidation is central to the human learning biases towards contiguity, repetition and importance.

The rest of this chapter will be divided into several sections. In the first section the evidence for consolidation will be reviewed. This evidence will show the basic time course of consolidation and some of the interventions possible. The second section reviews a related paradigm called reminiscence. Reminiscence, like consolidation, is not well understood, even to the point where its existence is sometimes questioned. The reminiscence data is critical because it establishes the linkage between consolidation and importance. One of the ways in which this is done is by presenting a model of consolidation which can account for the reminiscence data, data which has not been satisfactorily accounted for by any other model. The last section is an overview of the implications of the model with respect to credit assignment. Later chapters will examine the model and its implications for credit assignment in substantially greater detail.

Evidence for consolidation

The goals of this section are to provide an overview of the consolidation literature and to build a case as to what the theoretical constraints are that a consolidation model must meet. There are several important constraints that will be developed. The first is the length of the consolidation process. The second is that consolidation is not merely a ballistic process, but an active one that can be affected in positive and negative ways according to the cognitive state of the organism; further, these effects translate directly into changes in the strength of learning. Finally, there are the factors that impact consolidation.

Retrograde Amnesia

Most of the consolidation literature concerns the phenomenon of retrograde amnesia. Retrograde amnesia refers to the loss of a memory due to some post-learning event, usually some type of trauma or shock. The fact that at one time most of the retrograde amnesia data involved trauma or shock is actually responsible for a great deal of the confusion involved in defining consolidation, since some types of trauma might lead to any of the three types of interruptions which have been used to define consolidation. The basic premise of the consolidation literature is that if learning were instantaneous, then retrograde amnesia could not easily be explained; after all, there would be no way that the trauma could select the recent memory from any other memory. If we grant that the performance impairment caused by retrograde amnesia is in fact a reflection of a disruption of the learning process, then retrograde amnesia holds the potential to reveal a great deal about the time course of consolidation. Performance, or recall, is disrupted for events that occurred in the recent past. By controlling the time between the events to be learned and the disrupting events it should be possible to determine the nature of the learning curve over time. There exists a large body of studies dealing with retrograde amnesia in its various forms, whether it is induced through electroconvulsive shock (the most common method), drugs, or head injury. Some of the studies use animal subjects; others, particularly those examining the effects of head injuries, use human subjects. Due to the nature of the experiments, however, many of them involving physical damage to the brain, most of the studies use animals. The most often used paradigm has been to put a subject, usually a rat, in a one-trial learning experiment and then at a controlled time interval later use electroconvulsive shock (ECS) to induce memory disruption.

Electroconvulsive Shock

Chorover and Schiller (1964) presented a typical ECS experiment. In this experiment, rats were tested for their average time taken to step down from a platform onto a grid floor. Animals were then matched on the basis of their step-down latencies (SDL) and assigned to three experimental groups and two control groups. In the one-trial learning phase of the experiment, when the rats stepped down onto the grid they were given foot shocks. The first group of rats was simply given the foot shocks. The second group had subgroups that also received ECSs at intervals of 0.5, 2, 5, 10, or 30 seconds after the foot shock, depending on the subgroup. The third group only received the ECS. This procedure was repeated over three consecutive days and the SDLs for each group were recorded. Virtually any learning theory would predict that the rats would learn a relationship between stepping on the grid and getting a foot shock. In fact this is exactly what the results showed in the absence of an ECS. Animals that only had foot shocks had an average latency time of about 30 seconds. Also of note was that with a 30 second delay between the foot shock and the ECS, performance was indistinguishable from the group of rats which did not receive an ECS. At a delay of 30 seconds, therefore, it does not appear that the ECS interfered with learning. However, as the foot shock to ECS interval decreased, the latency times fell, and were especially short in the 0.5 and 2 second groups. Based upon the latency times, the conclusions that can be drawn from this study are that learning is essentially complete after about 10 seconds, and that the bulk of learning takes place within 2 to 5 seconds.

The premise that ECS work is built upon appears to be sound, but doubts about the validity of this line of research were raised from its early stages. The most common explanation for retrograde amnesia due to ECS is that ECS disrupts consolidation. Spevack and Suboski surveyed a number of alternative explanations to consolidation in 1969. All of these were variations on a hypothesis that no process was disrupted, but that the shock itself became a part of the learning. It was somehow related to the training, perhaps working as a highly aversive stimulus or working to increase the avoidance response. For example, a rat might learn to run to the right in a Y-maze to gain some reward, but with a huge electric shock soon following it, the shock could potentially be seen as part of the outcome of running to the right. If the pain outweighed the reward, running to the right would be something to be avoided. A number of studies were done that argue against these possibilities (Spevack and Suboski, 1969; King, 1967). While these explanations fell short of providing a more compelling explanation, they raised the possibility that there may be more to retrograde amnesia with respect to ECS than can be determined by a simple analysis. Other problems existed with the literature on ECS and retrograde amnesia. The time course of consolidation reported varied from under ten seconds (Chorover and Schiller, 1964) to as long as a day or more (Misanin, Miller and Lewis, 1968). Chorover and Schiller did point out that different levels of foot shock can lead to different lengths of consolidation, but the variations they describe are on the order of seconds, not hours.

More recently, the doubts about retrograde amnesia have been given new credence. Among the new issues raised are that the overall effects of ECS on the nervous system simply are not understood well enough to make any strong claims. ECS could, for example, somehow damage the retrieval mechanism needed for certain memories. Miller and Marlin point out that more recent studies have found that in many cases retrograde amnesia is not permanent (Miller and Marlin, 1984). This would indicate that what may be damaged is in fact the ability to retrieve the new memories, not the memories themselves. Miller and Marlin's studies indicate that consolidation is a relatively quick process, taking less than five seconds. They go as far as saying that "we believe that this extreme rapidity of consolidation is one of the few established facts concerning the nature of consolidation." It is worth pointing out that the five second interval is exactly the gradient proposed by Hull in reviewing classical conditioning experiments (reviewed in Hilgard and Bower, 1948) as being the maximum time which can elapse between a response and reinforcement for the reinforcement to have an effect without depending upon other mechanisms.

Retrograde Facilitation with Drugs

Retrograde studies using drugs fall into two types: those that study amnesia caused by the drugs (Parker and Weingartner refer to this as facilitating deficits in memory), and those that study facilitation of memory caused by the drugs. Parker and Weingartner (1984) present a review of a number of these studies and the effects of the various drugs studied. Experiments using drugs follow the same general paradigm as the ECS experiments, but often use humans as subjects. Subjects are divided into test groups and control groups and then are given some test material to be learned. Immediately after this the test groups are administered the drug being tested. Later both groups are tested for recall. Some drugs produce enhanced recall and some depress recall. Additionally, some tests have been done with the same drugs where the drugs are administered before the test material is given.

One of the most revealing insights from these experiments is that some drugs produce the opposite effect when given before training than when given after training. Alcohol, diazepam (Valium), and nitrous oxide, for example, are known to produce amnesia when given before training, but can actually work to facilitate memory when given after training. It has been hypothesized that this is because consolidation and encoding involve different processes. This explanation appears to be lacking, however, since the effects of a drug administered before training would be unlikely to have worn off before the memory is encoded. Other explanations seem reasonable, however. The next section concerns the fact that psychological interference can impact the consolidation process. A drug like alcohol could serve to generally dampen neural activity. In the case where it is administered after the training, there will then be less inhibition, and therefore less interference, from other active processes. If it is administered before the training, the activity related to training will itself be dampened, leading to lessened consolidation. Such a theory is in partial accord with Parker and Weingartner's own conjecture, which also hypothesizes that drugs may stimulate the reward system as well.

As with ECS, there are a number of important questions that have not been answered in regard to the overall effect of drugs on the nervous system. What is fascinating about these studies, however, is that they raise the possibility that consolidation is not simply a process that is started and then either runs its course or is interrupted. On the contrary, the drug studies would appear to indicate that consolidation is a malleable process. Such evidence would appear to contradict theories that hypothesize that consolidation is an analytical process which aims to determine whether or not a memory is worth storing. It is difficult to see how drugs would affect such a process, and even harder to account for asymmetric effects such as are produced by alcohol.

Further evidence for the theory that consolidation is an active, malleable process can be found in another class of amnesia studies which indicate that consolidation can be affected by cognition itself.

Retrograde Amnesia Due to Interference

At least one study does exist that is not subject to the kinds of problems inherent in the retrograde amnesia work done with ECS and drugs (Tulving, 1969). In this study, amnesia was not induced physically, but through interference from another task. Tulving presented subjects with lists of words to be remembered. Subjects were instructed that whenever the name of a famous person (such as Christopher Columbus) appeared in the list, they were to make sure that they remembered that name. During the recall test they were instructed to recall the name first, before going on to the other items from the list. The idea was that the task of recalling the famous name would interfere with the task of recalling the other words. The questions to be answered were whether there would be interference at all and, if so, what the temporal nature of the interference would be.

The lists were 15 words long. The high priority names appeared in positions 2, 8, and 14 in the test lists and not at all in the control lists. When the presentation rate was 0.5 seconds or 1 second per word, recall of words from input positions 7 and 13 (immediately preceding the positions in question) was approximately twice as high in the control lists as in the high priority lists. In other words, when subjects tried to remember the high priority word it somehow interfered with their ability to remember the word immediately preceding it. At a presentation rate of 2 seconds per word there was almost no difference. Also, the presence of the high priority words did not seem to affect the recall of words immediately following them. Tulving interpreted these results as evidence of consolidation, the high priority words interfering with the ongoing traces of the preceding words. In this case, only the immediately preceding word was greatly interfered with, meaning that the length of the trace would be between 0.5 and 2 seconds. An alternative explanation could involve rehearsal. If subjects were rehearsing as the test went along, they would switch to rehearsing the high priority word upon its presentation. However, Tulving points out that this is unlikely due to the asymmetry of the effects: subjects would have been just as likely to forgo rehearsing the words following the high priority word, yet recall of those words was unaffected.

The Tulving study indicates a kind of negative access to the consolidation process. In this case it appears that because attention was shifted away from the consolidating memory, learning was adversely affected. In and of itself this appears to be a useful learning effect; if attention can be equated with interestingness or usefulness, then a shift in attention would seem to indicate that the previous focus of attention was unworthy of being learned. This recalls the earlier discussion of complexity which equated processing with worthiness of learning. Other results, however, will show that this is misleading. There is a difference between a cognitive shift in attention of the type found in the Tulving study, and an environmentally driven shift in attention as might occur when passing through a doorway or mountain pass. Although working out the basis for this distinction will require further discussion (the mechanisms are discussed in Chapter 5 and the process in the conclusion), it does appear to give credence to the case that the cognitive system has some type of access to the consolidation process. This notion of cognitive access will be further developed later in this chapter.

Distributed vs. Massed Practice

Another body of work that provides insight into consolidation concerns recall performance in distributed versus massed practice trials. There are two related paradigms involved. Typically both test recall of items in list learning experiments. In one paradigm the variable to be studied is the length of time between words in the presentation of the list. In the second paradigm a target word is picked and the test condition is the frequency of occurrences of the word. In one group, the massed practice group, the word will appear a set number of times consecutively within the list. In the other group, the distributed practice group, the target word will appear the same number of times but in different locations throughout the list. Essentially there are "rest" periods for the target trace between its presentations in the distributed test, while in the massed practice test there is no rest. The intent is to test the temporal interactions going on in learning. The second test paradigm is especially interesting because it appears to get around the interference problem found in the Tulving study. It is doubtful that a word would interfere with itself, so a strict interference model would predict that performance would be better in the massed practice case than in the distributed practice case. However, given that the time course of learning may be longer than the presentation rate, each word may still be in the learning process when it is presented again, in which case the new presentation may have little effect. In the distributed practice case, on the other hand, there will be "rest" periods between the presentations of any given word. This should allow the consolidation process to be relatively complete by the next presentation of the word, and hence the prediction would be that performance would be enhanced, as the sketch below illustrates.
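To make this prediction concrete, the following toy sketch (my own illustration; the constants are arbitrary assumptions, not values taken from the studies reviewed here) treats each presentation of an item as starting a roughly five-second consolidation process and assumes that a presentation arriving while the previous trace is still consolidating contributes only a fraction of the learning a "rested" presentation would.

    # Toy model of massed vs. distributed practice under a consolidation account.
    # Assumption (illustrative only): a presentation arriving while the item's
    # previous trace is still consolidating adds only a fraction of the learning
    # that a "rested" presentation would add.

    CONSOLIDATION_TIME = 5.0  # seconds; the time course suggested by the ECS data
    OVERLAP_DISCOUNT = 0.2    # learning credited to an overlapping presentation

    def learning_strength(presentation_times):
        """Total learning accrued across presentations of a single item."""
        strength = 0.0
        last = None
        for t in presentation_times:
            if last is not None and (t - last) < CONSOLIDATION_TIME:
                strength += OVERLAP_DISCOUNT  # trace still consolidating: little gain
            else:
                strength += 1.0               # trace "rested": full gain
            last = t
        return strength

    # Massed: three consecutive presentations at a one-second rate.
    print(f"massed: {learning_strength([0.0, 1.0, 2.0]):.1f}")        # 1.4
    # Distributed: the same three presentations separated by intervening items.
    print(f"distributed: {learning_strength([0.0, 10.0, 20.0]):.1f}") # 3.0

Under these assumptions the distributed schedule accrues roughly twice the learning of the massed schedule, which is the direction of the results described next.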

Hintzman (1969) reported results on experiments on the apparent frequency of words in massed versus distributed practice. Three experiments were performed along the lines described above. Two types of test were used: a paired-comparison test, in which the decision was to choose the more frequent of two alternatives, and a second test, in which a judgement of the number of appearances was asked for. In the first experiment, the test words appeared consecutively 0, 1, 2, 4, 6, and 10 times. In the second and third experiments, the spacing or number of items intervening between two repetitions was varied. Apparent frequency, as measured by reported responses, increased with spacing. This would appear to back up Hintzman's hypothesis that in the trials where the words are presented consecutively, the presentations tend to meld together instead of being treated as separate cases. However, it is important to note that the analysis is not simple. As the Tulving study indicates, there are a number of factors that need to be carefully considered. For example, in the distributed case the intervening words could actually interfere with the traces of the target word, as was noted before. The interference model could even be stretched such that words would interfere with themselves. Such an interpretation raises other problems, however, such as why rehearsal is effective.

Reminiscence

So far the evidence reviewed for consolidation leaves a number of possible explanations available. There is compelling evidence that there is a process lasting five seconds or less which can be affected in both positive and negative ways. This description is similar to the learning theory developed by Hull, which hypothesized that learning was due to a stimulus trace that lasted approximately five seconds and rose and fell in strength according to factors such as reward (Hilgard and Bower, 1966). Negative effects can come from factors such as electric shocks, drugs, or interference from other cognitive events. Thus far the only evidence for positive effects has come in the form of certain drugs, which leaves open the question of whether cognitive events can have a positive influence on consolidation. There is, of course, a large literature on the effects of rewards on learning, but such rewards are physically manifested, such as food. This is not to say that such instances are not important to the study of consolidation (indeed they are, and they will be addressed later in the dissertation), but they do not address the issue of positive cognitive effects. Fortunately, there is a related body of evidence which not only affirms that cognitive events can have a positive impact upon consolidation, but which also provides insight into the issue of the process underlying consolidation. This evidence, called reminiscence, also happens to be the source of a great deal of controversy for learning theorists.

Reminiscence refers to an improvement in recall for an event over time. More properly, reminiscence denotes improvement in performance of a partially learned act that occurs while the subject is resting. In other words, performance for a trained item might be poor right after the training trial, but it can actually improve after a period of rest. A typical study showing results of this type was done by Bregman (1967).

Another large body of evidence for reminiscence concerns motor learning tasks such as pursuit rotor tests (Eysenck and Frith, 1977). While it is not clear that there is a direct correlation between motor tasks and cognitive tasks, some of the variations done have suggested that there are cognitive components that could account for the reminiscence effect.

The two explanations for reminiscence most often advanced are consolidation and inhibition. Consolidation theorists generally explain reminiscence by the fact that in immediate performance tests the consolidation of the training trace is not yet complete and therefore performance will be poor. After a period of rest, consolidation finishes and solidifies the memory trace, leading to better performance. Later degradation of performance comes after consolidation ends and normal forgetting sets in. The difficulty with this theory is the evidence that consolidation is complete in approximately five seconds. The longer time course associated with secondary consolidation is not helpful either, because it does not explain why there is no performance increase associated with the low arousal case. Inhibition theory, on the other hand, proposes that during practice there is a buildup of reactive inhibition which negatively affects performance. During rest periods this inhibition has a chance to dissipate, leading to improved performance. However, in a pair of studies concerning the relationship of arousal and learning, Kleinsmith and Kaplan (1963; 1964) obtained results which challenged both of these explanations.

Kleinsmith and Kaplan found that the reminiscence effect was very pronounced at high levels of arousal, and nonexistent at lower levels. It is worth noting that arousal was measured by item rather than by subject; so, for example, one subject would have high and low levels of arousal within one trial. At high levels of arousal the recall curve is essentially a U rather than the traditional inverted U associated with learning. Immediate recall is high, but quickly drops off to very low levels. However, after a period of several minutes recall actually begins to improve, by as much as 400% in the original experiments, before falling off gradually in the long term due to natural forgetting. The Kleinsmith and Kaplan results are difficult to interpret from a consolidation perspective because of the long time lag involved. If consolidation only takes five seconds then it is completed long before reminiscence effects begin to appear. The consolidation model on its own also affords no explanation of the low arousal case when recall is high initially. Interference models have similar difficulty explaining these effects. The difficulty is in determining what is interfered with and why such interference only happens when arousal is high. For these reasons numerous attempts have been made to replicate these experiments, with most reproducing the original results (Eysenck, 1977; Weingartner and Parker, 1984; Revelle and Loftus, 1990).

Theoretical Background

A number of theories have been proposed to deal with the evidence for reminiscence. Most use consolidation; others propose alternative mechanisms such as interference; still others claim that the evidence for consolidation is too weak to base a theory on.

Impediments to a Far-Reaching Theory

One such argument against consolidation is made by Keppel (1984). Keppel claims that the evidence cited in support of consolidation is "not very strong." In particular he reviews some of the literature on reminiscence and arousal and claims that while most of the evidence is supportive, it is not compelling. Keppel claims that results such as those obtained by Kleinsmith and Kaplan are so dramatic that they should be easily replicated. However, Keppel himself points to studies that strongly support the Kleinsmith and Kaplan data. The studies which he claims only marginally support the data, or do not support it at all, use differing test paradigms, such as inducing arousal with white noise. Arousal induced through noise will have additional and different effects on a subject than arousal stemming from some part of the test itself, such as a particular test word. The noise itself might serve as a distractor, shifting part of the subject's attention away from the test. In general, comparisons made between naturally occurring and induced arousal seem tenuous at best.

Keppel concludes that cognitive theorists will "avoid explanations that view the human as a passive organism completely at the mercy of involuntary physiological processes." On the contrary, it is the very fact that consolidation and arousal are involuntary that makes them powerful. As was argued in the first two chapters, infant learning and learning in new domains must be automatic and not deliberate. The power of consolidation, as it will be developed through the model presented in this dissertation, is that, in conjunction with arousal and other factors, it affords a different kind of evaluation, one which is flexible and sensitive to a range of architectural considerations.

Keppel does raise some salient issues. The literature does not appear to support any one theory. A number of reasonable mechanisms have been proposed, but when taken on their own, each falls short of providing a complete explanation of the data. When taken together, the task of analyzing the end product becomes difficult due to the interactions of the pieces.

Theoretical Factors

Interference Theory

The leading argument used to explain reminiscence without the use of consolidation is interference (or inhibition) theory. Interference theory posits that the poor early recall is due to interference from the original presentation of the word. As the subject rests, the inhibition dissipates and performance increases. Peterson (1966), and Parker and Weingartner (1984), review some of the problems that inhibition theory cannot explain on its own. However, both propose models that use consolidation in conjunction with a form of interference.

Inhibition is a well established neurological fact (Milner, 1957). The question with regard to consolidation and reminiscence is whether inhibition at the molecular level, the neuronal level, shows up at the molar level, the behavioral level. A number of relevant memory priming studies have been done to show just such effects. Neely presents one such study which shows an inhibitory effect at the molar level (Neely, 1977). The task involved was a word-nonword classification task. Prior to each visually presented target string a priming string was presented. Subjects could expect certain relations between the priming word and the target if the target were a word. For example, if the priming word were BIRD they could expect the target to be the name of a type of bird; this was called a no shift trial, because the prime was expected to be related and attention would not need to be shifted. In the shift case, if the priming word were BUILDING the subject could expect a part of the body to be the target word. The control condition was a priming string XXXX, in which case the subject could expect a bird, a body part, or a building part equally often. The subjects were then tested on their reaction times for different combinations of expectations and shifts. The results showed both priming and inhibitory effects when compared against the control conditions. Inhibition was most pronounced in trials where there was a shift to a totally unexpected word that was unrelated to the priming word.

The evidence for inhibition, and studies such as Neely's and the Tulving research discussed earlier show that any complete model of learning must account for interference effects. It is still far from clear, however, how interference alone can be used to account for the reminiscence data.

Two Stages of Memory

Many information processing models include more than one stage of memory. A typical model would include some short term memory store as well as long term memory. The evidence for at least two stages of memory is compelling (Miller and Marlin, 1984) and a number of other models (Peterson, 1966; Neely, 1977) have included two distinct mechanisms. Miller and Marlin call their two memory systems passive and active storage. These would correspond roughly to long term and short term memory. They argue that the establishment of passive storage is a consequence of its representation in active storage. What most of the models with two stages have in common is a relatively transient, yet powerful system for short term learning, and a weaker, yet more permanent system for long term learning.

The two stage model is appealing because it can easily account for half of the Kleinsmith and Kaplan paradigm. In the low arousal case short term performance is high because the items are in the first stage of memory when retrieval is easy. However, such a model cannot by itself account for reminiscence; the problem is how such a model can account for the fact that in the high arousal case there appears to be no short term memory. Indeed one of the reasons that the reminiscence data has come under fire is exactly because it appears to directly contradict the standard memory model which has long and short term memory.

Fatigue

A neurological concept that is less readily accepted is fatigue. The reluctance to accept the fatigue construct comes as the result of experiments that involved constantly stimulating neurons. Since such neurons can keep firing indefinitely it has generally been presumed that they do not fatigue. Such data are not pertinent to the fatigue hypothesis, however, because these experiments have not typically measured the output of the stimulated neurons. Even a heavily fatigued muscle, for example, will still have the ability to contract; where fatigue shows up is in the muscle's diminished capacity to bear weight. Atwood, experimenting on crustaceans, has found in numerous cases (reviewed in (Atwood and Nguyen, 1990)) that responses of a neuron are "markedly reduced" after chronic stimulation. While these data cannot be directly extrapolated to humans, they support the general theory.

At the neuronal level the firing of neurons is a physical activity. Like any other physical activity this one requires at least one energy source. In the case of neurons there are actually a number of materials necessary for producing firing, most important of these being transmitter substances. Fatigue would presumably come when the consumption of some or all of those materials as a result of the neuron's firing exceeds their replacement rate. As in the case of a muscle, such neurons might continue to fire when stimulated, but would have a diminished capacity to stimulate other neurons.
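This resource view of fatigue can be expressed as a simple depletion model: firing consumes the transmitter store faster than it is replenished, and the neuron's effective output on other neurons scales with what remains. The sketch below is purely illustrative; the rates are arbitrary assumptions of my own.

    # Illustrative depletion model of neural fatigue. The neuron keeps firing
    # when stimulated, but its capacity to drive other neurons falls as its
    # transmitter store is consumed faster than it is replenished.
    # All rates are arbitrary assumptions.

    def simulate(firing, steps=100, consume=0.05, replenish=0.01):
        """Return the effective output per step of a stimulated neuron."""
        store = 1.0                               # transmitter substance available
        outputs = []
        for step in range(steps):
            if firing(step):
                outputs.append(store)             # output scales with the remaining store
                store -= consume * store          # firing consumes the store
            else:
                outputs.append(0.0)
            store = min(1.0, store + replenish)   # slow, constant replenishment
        return outputs

    # Chronic stimulation: the neuron "fires" on every step, yet its effect on
    # downstream neurons falls to roughly a fifth of its initial value.
    out = simulate(lambda step: True)
    print(f"initial output {out[0]:.2f}, final output {out[-1]:.2f}")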

In terms of the Kleinsmith paradigm, some of the best neurophysiological evidence comes from Artola and Singer (1993), who point to experiments which result in a "depression of synaptic transmission" which has a time course of 5-20 minutes, exactly the interval that would be expected given the Kleinsmith paradigm data. Ito (1992) also reviews evidence of posttetanic depression, in which "repeated activation of a synapse leads to an enduring decrease of strength somewhat like fatigue."

Much of the early research on fatigue was done at the turn of the century by Kraepelin and his students. There are also modern studies that definitely show fatigue-like effects. Pomerantz, Kaplan and Kaplan (1969) found that in presenting subjects with repeated flashes of a single letter each presentation had a positive effect on the subject's ability to recognize the letter up to a certain point. At that point, performance did not level off as might be predicted, but instead started to decline. This was interpreted as being a result of satiation and fatigue at the neural level.

Similar effects of fatigue can be found in perceptual data. One such phenomenon is known as the "tilt aftereffect." When stimulated intensely for a period of time, cells in the visual cortex suffer a temporary reduction in responsiveness. For example, if you were to stare at a pattern of bars tilted slightly counter-clockwise for a period of time and then looked at a pattern of bars that were vertical, the vertical bars would appear to be tilted slightly clockwise. This effect has been tied to fatigue in the cortical cells (Sekuler and Blake, 1985). These results are sometimes received skeptically because they can also be explained by receptor fatigue, which is different from neural fatigue. The Necker cube, an illusion that, while visual, is a 3-dimensional effect and therefore probably does not directly involve receptors, can be similarly explained using fatigue. The basis of this explanation is the hypothesis that the two possible interpretations of the Necker cube inhibit each other; at any given time one interpretation dominates the other and inhibits it, but eventually the dominant interpretation fatigues and the other begins to dominate.

The analysis of fatigue provides an excellent example of the difficulty of studying cognition purely at the neural level. The amount of provable knowledge at this level is extremely limited. Models built out of only what is known would necessarily be incomplete. On the other hand simply speculating that something like fatigue exists is also dangerous. Hebb faced this very dilemma when he first presented his cell assembly theory. Hebb knew that his model needed certain factors, in particular inhibition, to make it work, but because there was no provable evidence for these factors he decided to leave them out of his model. The problem with such a decision is that an incomplete model will have glaring weaknesses. In the case of the cell assembly model this was shown dramatically in simulations done by Rochester, et al. (1956). These simulations are still cited to this day as evidence for why cell assemblies are not plausible. This is doubly unfortunate because the same paper shows that the cell assembly construct is plausible with the addition of an inhibitory factor. Had Hebb included inhibition in his original model the cell assembly concept might be much more prominent in the literature and it certainly would be given closer scrutiny. The issue for cognitive theorists is how to decide when such factors can be plausibly hypothesized. One of the major thrusts of this dissertation will be the necessity of the fatigue construct. Because neural fatigue is still not generally accepted by neuroscientists the basis for including fatigue will necessarily be theoretical. Throughout this dissertation I will return to the fatigue concept to show how it is useful in providing clean explanations for data that is otherwise difficult to explain. This is a prime example of why a model that bridges the neural and behavioral levels is potentially so useful, because in the absence of good evidence at one level, the other level may provide the additional constraints necessary to complete a model.

The first piece of evidence that makes the neuronal fatigue construct attractive is the way it can be used to explain the reminiscence data. The difficulty of the reminiscence data lies in the asymmetry of the effects; high arousal appears to strengthen learning, but only in the long term. If high arousal always led to improved recall performance versus low arousal, then it would be simple to create a model that directly links learning and arousal. The low arousal effects are easily explained because they fit the predictions that most models would make. The central piece of a reminiscence theory, therefore, must afford an explanation of why short-term recall is poor when arousal is high. Further, such a theory must do so in such a way that short-term recall will still be high when arousal is low.

To build such a theory a reasonable starting point is to examine the effects of arousal on the cognitive system. The major effect appears to be that neural activity becomes more intense and concentrated (Oades, 1985). One conclusion that could be drawn, therefore, is that learning strength is related to the intensity of activity. Further, one might speculate that something about the intense activity leads to a situation where short-term recall is poor. Fatigue provides a simple explanation as to why this might be the case. The neurons in the areas of intense activity will naturally expend large amounts of resources and therefore become abnormally fatigued. Since these cells are fatigued they will be less sensitive to reactivation until the fatigue dissipates, and therefore the information that they code will be temporarily inaccessible. When arousal is low, on the other hand, activity will be less intense and not as much fatigue will build up. In such a case the information coded will be easily retrieved because of short-term memory considerations.
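A toy simulation shows that these two factors are sufficient to produce the observed pattern. The sketch below is entirely my own construction; the functional forms and constants are illustrative assumptions, and ordinary long-term forgetting (which eventually lowers both curves) is omitted.

    import math

    # Illustrative recall model combining three hypothesized factors:
    #   - permanent learning strength, higher under high arousal (intense activity)
    #   - a short-term memory component that decays over minutes
    #   - fatigue that builds with the intensity of activity, temporarily gating
    #     access to the trace, and dissipates over minutes
    # Functional forms and constants are assumptions for illustration only.

    def recall(minutes, arousal):
        permanent = 0.3 + 0.5 * arousal                # intense activity: stronger learning
        short_term = 0.5 * math.exp(-minutes / 3.0)    # short-term store decays over minutes
        fatigue = arousal * math.exp(-minutes / 8.0)   # fatigue scales with arousal, dissipates
        return (permanent + short_term) * (1.0 - 0.95 * fatigue)

    for m in [2, 10, 45]:
        print(f"t={m:2d} min   low arousal: {recall(m, 0.1):.2f}   "
              f"high arousal: {recall(m, 0.9):.2f}")
    # Low arousal: good early recall, declining thereafter.
    # High arousal: poor early recall (the fatigued trace resists reactivation),
    # improving over tens of minutes as the fatigue dissipates: the reminiscence
    # pattern.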

Putting the Data Together

The number of factors involved in reminiscence seems rather imposing. Designing experiments that completely isolate the various components is difficult at best. Unfortunately, in the bulk of the literature little heed is paid to the interactions of these components. Interference theorists, for example, may not acknowledge a fatigue component and therefore would design their experiments concentrating purely on interference. (Actually, fatigue can be viewed as a form of self inhibition, or reactive inhibition as it is usually called.) Even aside from the three mechanisms already mentioned there are still others. As discussed earlier, the reminiscence data points to an arousal factor. Most models take at least one of these components as a major tenet and combine them with some notion of consolidation. The results have been mixed.

The contribution of the TRACE model

Consolidation and Reminiscence

One model does exist which purports to account for all of the consolidation and reminiscence data: the TRACE (Tracing Recurrent Activity in Cognitive Elements) model (Kaplan, et al., 1991). The TRACE model is actually quite similar to the explanation originally put forth by Kleinsmith and Kaplan for the reminiscence data. Their explanation relied on the fact that events are represented in the brain by neural circuits. Under conditions of high arousal such circuits would become highly fatigued and therefore would be difficult to reactivate, thereby causing poor short term performance. Conversely, under conditions of low arousal there would be little fatigue and, due to short term memory effects, performance would be good. The neural circuits referred to by Kleinsmith and Kaplan are called cell assemblies in the TRACE model, as TRACE is an updated version of Hebb's cell assembly theory.

The assumption that cell assembly theory is built upon is that a given thought corresponds to a particular firing pattern in a collection of neurons. These neurons, because they are strongly connected to each other, tend to form a unit; when some portion of them become active they all become active. The activation of such a unit corresponds to whatever that unit, or cell assembly, represents being perceived. TRACE models the dynamics of an active cell assembly.

In TRACE, learning is based upon a variation of the rule that Hebb proposed for his own cell assembly model (1949).

When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased. (p. 62)

The result of this learning rule is a direct correlation between the activity of a cell assembly and learning. As long as the cell assembly is active there will be As firing Bs, and therefore learning will be taking place. Therefore in the TRACE model the time course of consolidation is exactly the same as the time course of activity of the cell assembly. This time course can be interrupted, thereby interfering with learning, and it can be intensified, thereby enhancing learning.
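A minimal sketch of this relationship (my own discrete-time reduction; the constants and the simple multiplicative form are illustrative assumptions, not the published model) makes the point that anything which lengthens or shortens the assembly's activity directly changes the amount learned.

    # Sketch of a TRACE-style learning rule: connection strength grows on every
    # time step the assembly remains active, so total learning tracks the
    # duration and intensity of assembly activity. A reduction for illustration;
    # the constants are arbitrary.

    LEARNING_RATE = 0.01

    def consolidate(activity_trace, weight=0.0):
        """activity_trace: assembly activity (0..1) per time step (say, 100 ms)."""
        for a in activity_trace:
            weight += LEARNING_RATE * a * a   # pre- and postsynaptic cells co-active
        return weight

    print(round(consolidate([1.0] * 50), 2))  # 0.5: assembly reverberates ~5 seconds
    print(round(consolidate([1.0] * 10), 2))  # 0.1: activity cut off after ~1 second
                                              #      (ECS, interference, attention shift)
    print(round(consolidate([1.0] * 70), 2))  # 0.7: activity prolonged (e.g. arousal)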

The TRACE model also incorporates all of the theoretical factors that appear to be important in modelling consolidation and reminiscence effects. TRACE (or more properly SESAME, the cognitive architecture of which TRACE is a part) includes several forms of inhibition, a mechanism for short term memory, and fatigue. All of these mechanisms are theoretically derived in TRACE and each has predictable effects on the time course of activity of the cell assembly modelled in TRACE. Fatigue, for example, is required in a cell assembly model in order to ensure that a cell assembly does not remain active indefinitely. Short-term memory, on the other hand, is one side-effect of short-term connection strength, a factor in the TRACE model which functions both to provide short-term memory and also to provide cell assemblies with a temporary boost that allows activity to become strong enough that the cell assembly can sustain itself through reverberation. Inhibition comes in several forms and can serve to shut off cell assemblies under conditions such as perceptual competition and attentional shifts. While it would be wrong to completely separate any of these factors from learning, the derivation of how they impact activity in TRACE does not rely upon any of the evidence reviewed in this dissertation (though some of this evidence is used as support). Fatigue, for example, is not an ad hoc mechanism needed because of the reminiscence data, but rather has a theoretically meaningful role in the time course of activity; that it provides an explanation for reminiscence merely lends further credence to its existence.

On the other hand, the original TRACE model did not include arousal. The reason for this is that arousal does not have a general, predictable role in the time course of activity of a cell assembly. A cell assembly will behave in different ways based upon whether arousal is high or low. TRACE models the generalized (ideal) case. Further, arousal, unlike fatigue, is a well documented physiological construct, so although it may not be necessary to include arousal when modeling the general case of the time course of activity, it is necessary to include it when modeling particular domains, such as the reminiscence data.

Chapter 3

Credit assignment

Given the hypothesized relationship between consolidation and learning, a basic model of human credit assignment can be abstracted. In its simplest form such a model would posit that the representation for a given event consolidates for up to five seconds. During this time associative linkages can be made to the representations of other consolidating events. The strengths of these linkages are determined by the temporal overlap of the consolidation periods of the representations. For example, an event would be more strongly linked to another event which followed one second later than to one which followed three seconds later, because the consolidation periods have more overlap with a one second interval. Further, this five second interval can be impacted by factors such as attention, arousal and context. In this way the five second interval acts as a general heuristic which can be flexibly adjusted according to the situation.
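Under these assumptions the abstract model can be stated in a few lines. The sketch below takes the five-second window and a strictly linear relationship between overlap and linkage strength as simplifying assumptions of my own.

    # Sketch of the overlap rule: two events are linked in proportion to the
    # overlap of their (nominally five-second) consolidation windows. The
    # linear proportionality is a simplifying assumption.

    WINDOW = 5.0  # seconds of consolidation per event

    def link_strength(t_a, t_b, window=WINDOW):
        """Associative strength between events beginning at times t_a and t_b."""
        overlap = min(t_a + window, t_b + window) - max(t_a, t_b)
        return max(0.0, overlap) / window

    print(link_strength(0.0, 1.0))   # 0.8: one-second lag, strong linkage
    print(link_strength(0.0, 3.0))   # 0.4: three-second lag, weaker linkage
    print(link_strength(0.0, 6.0))   # 0.0: windows never overlap, no linkage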

The question remains, however, as to whether this five second period is long enough to account for the power and diversity characteristic of human learning. Because the emphasis in consolidation research has been on determining the nature of what consolidation is, little attention has been given to how consolidation fits into the larger framework of learning. Fortunately there has been work in the machine learning community which indicates that a consolidation-style model is potentially quite powerful in its ability to learn. In particular, Sutton (1988) has studied similar issues. He calls models which address these issues temporal difference models, and the relative power of a temporal difference model is directly related to the length of the temporal interval it uses. To understand this work it is helpful to examine the role of time in credit assignment.

Credit assignment and time

Time adds a great deal of uncertainty to the credit assignment problem. There are a number of reasons for this. First, since the effect of an action is not necessarily immediate, the length of time between the action and when its effects are finished cannot be known without a model of the action. These delays make it impossible to know which action is responsible for which outcome when there are intermediate actions before the ultimate consequence of an action. The intermediate actions add further uncertainty because they too may impact the consequence of a previous action. Without a domain theory to sort such issues out there is no way to equate actions and outcomes with certainty. Since a domain theory presupposes knowledge, in the general case when a domain theory is not available, time would appear to make the credit assignment problem intractable.

Fortunately, however, the real world is not quite the general case and certain heuristics, while not providing a "solution" to the credit assignment problem, can afford effective methods for building domain knowledge. These heuristics could be considered as a kind of domain theory of the world as a whole.

The starting point for such heuristics is determining what a reasonable temporal interval might be between actions and consequences. A simple heuristic would be to assume that actions do have immediate consequences. Such a heuristic could be called a contiguity rule, because it assumes that things that are next to each other (in time) are related. Indeed it is the case that some type of contiguity rule is a part of nearly every learning theory. At the other end of the temporal spectrum would be a heuristic which uses extremely long time intervals. In a domain such as a chess game, for example, a learning system using this type of heuristic would wait until the end of the game before assigning credit to individual moves; by contrast a contiguity-based system would assign credit from move to move. The advantage of using long intervals is that it virtually assures that the consequences of an action will be finished before credit is assigned. On the other hand, the problem of intervening actions becomes very large. In chess an individual move may be perfect for the situation in which it occurred, but the game may be lost anyway. With a long interval scheme, that move would be assigned blame equally with all of the other moves even though it might have been the best move in the game. On the other hand, over the course of time good moves should participate in wins more often than poor moves and vice versa. Short intervals, by contrast, avoid the intervening action problem, but cannot easily handle actions with delayed consequences. Short intervals also require far less storage since only a few things at a time are linked. Of course intermediate length intervals are also possible.

Temporal difference analysis

In Sutton's analysis (1988), a generic temporal difference system makes a prediction and then, at a specified time interval later, updates that prediction based upon the new state of the world. It then updates the mechanism used to make the original prediction based upon the new information. In chess, for example, the prediction might be whether or not the current position will lead to a win, or what the best move in a situation is. Several moves later the system makes a new estimate and updates the evaluation function responsible for the original prediction, the idea being that the later evaluation should be more accurate. Examples of temporal difference systems include Samuel's checkers playing system, Holland's Bucket Brigade algorithm, and Sutton's own Adaptive Heuristic Critic. Sutton's work examines the issue of the temporal interval size in some detail. Sutton calls systems that use a maximum interval size supervised, implying that by waiting for final outcomes these systems are effectively getting perfect information as if from a teacher. This may be possible in games, but it is not necessarily possible in real world situations. What is surprising, especially since Sutton only used situations with well defined outcomes, is that he found that temporal difference systems, particularly ones using differences of only a few time steps, clearly outperformed supervised systems on learning tasks. Performance, in this case, was measured by how long such a system took to converge to an optimal solution. Another researcher, Tesauro, interested in computer backgammon and skeptical of Sutton's results, applied a simple temporal difference model to backgammon. Tesauro's system, which used very simple board representations and started out with no knowledge whatsoever, was able to learn to the point where Tesauro judged it superior to every other machine backgammon system, including systems that had been trained on massive human expert data sets (Tesauro, 1992).
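The core of a temporal difference update is that a state's prediction is moved toward the prediction made a step or two later, rather than toward the final outcome. The following minimal tabular sketch is in the spirit of Sutton (1988), though the particular form and constants are my own illustrative choices rather than a reproduction of his algorithms.

    # Minimal tabular temporal difference sketch: each state's value prediction
    # moves toward the prediction one step later, rather than waiting for the
    # final outcome. State labels and constants are illustrative.

    ALPHA = 0.5  # learning rate

    def td_update(values, episode, outcome):
        """episode: sequence of states visited; outcome: final result (e.g. -1 = loss)."""
        for state, next_state in zip(episode, episode[1:]):
            # move this state's prediction toward the next state's prediction
            values[state] += ALPHA * (values[next_state] - values[state])
        last = episode[-1]
        values[last] += ALPHA * (outcome - values[last])  # terminal step sees the outcome
        return values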

The advantage found in temporal difference systems using shorter time intervals appears to stem from the intervening action problem. Essentially what such systems do is build causal sequences. When a chess game is lost, for example, the loss will only affect the most recent few predictions. However, the next time the states where these predictions were made are reached, the negative effects of the loss will propagate further backward to other states (Figure 3.1). Rather than predicting what the end result of a game will be, such systems start out by predicting the state of the game a move or two later. As learning progresses and the predictions become more accurate, the predictions will begin not only to predict the state of the game a few moves later, but also the final state as well. In such a system the individual sequences (of only a few steps) become akin to building blocks. When an individual sequence is learned well enough, it can begin to function as a sort of intermediate goal. For example a particular board position in chess may be as good as a win because it will always lead to wins. Further, though it might take longer than with supervision, short intervals can also effectively capture relationships between actions and outcomes which do take long periods of time, because of this building block function (Figure 3.2).

Figure 3.1: In a) the sequence A - B - C - D leads to a loss. In a simple temporal difference scheme, at the next occurrence of D the loss will be predicted, as in b). Further, this prediction will then be propagated back to C such that C predicts a loss.
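Running an update of this kind over repeated occurrences of the sequence in Figure 3.1 shows the backward propagation directly (again a toy illustration with arbitrary constants):

    # Toy run of the TD-style update over the A - B - C - D sequence of
    # Figure 3.1; a self-contained version of the sketch above.
    ALPHA = 0.5
    values = {s: 0.0 for s in "ABCD"}
    for game in range(4):
        states = ["A", "B", "C", "D"]
        for s, s_next in zip(states, states[1:]):
            values[s] += ALPHA * (values[s_next] - values[s])
        values["D"] += ALPHA * (-1.0 - values["D"])   # the game is lost
        print(game + 1, {s: round(v, 2) for s, v in values.items()})
    # After game 1 only D predicts the loss; each repetition backs the
    # prediction up one more state, until A itself comes to predict a loss.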

Pure temporal difference analysis does not necessarily apply to the real world, but the principles remain the same. The first problem with applying temporal difference models to the real world is evaluation. Whereas domains such as chess have clear evaluations, most real situations are not so clear cut. Second, temporal difference models assume that every prediction along the way is for the same event and therefore do not directly apply to situations where multiple goals are being pursued. Third, some of them require associating knowledge with specific states, thereby necessitating that the system can track all of the states of the world. Nevertheless, theories of human behavior such as drive-reduction or pleasure maximization can be cast in temporal difference terms. In such theories the evaluation functions are pleasure and pain and every prediction is made in order to maximize pleasure and reduce pain. Further, temporal difference analysis shows the potential advantages of short time interval heuristics.

While the temporal difference construct is framed purely in terms of machine learning, it is quite similar to the analysis of animal learning done by Hull (1943). Hull argued for exactly the kind of sequence chaining described here. Further, in reviewing the evidence for such a theory, Hull concluded that the temporal intervals involved in linking the elements of the chain were extremely short, on the order of a few seconds. Hull's theory also extended to the kind of backward chaining shown in Figure 3.1, for which he used the more formal psychological term secondary reinforcement. As an example of secondary reinforcement, Hull used the original experiments which were carried out in Pavlov's lab. In this experiment dogs were presented with the ticking of a metronome for approximately a minute and then a few seconds later the dog would be given meat powder. Eventually the dogs would learn to salivate at the sound of the metronome, a reaction Hull calls an ordinary "first-order" conditioned reflex. Next the dog would be presented with a black square and then shortly thereafter the sound of the metronome again. After a number of presentations the dog will begin to salivate at the sight of the black square. For Hull this is an example of a "higher-order" conditioned reflex.

Figure 3.2: In this series of figures state A leads to a loss, though there must be several intervening moves, as shown in a). In b) and c) several occurrences of the sequence lead to a configuration where A predicts a loss. In d), when B, C, and D are preceded by a different state they lead to a win. In such a case the predictions of C and D (for example) would be updated to reflect that they no longer necessarily lead to a loss. This is shown in e), which just shows the predictions made at each state. Note, however, that X will predict a loss because it leads to states which had previously predicted a loss.

In terms of timing from a credit assignment perspective, therefore, it would appear that relatively short time intervals are superior for learning. Further, in reviewing evidence on conditioning, Hull (1943) concluded that animals also build sequences using short time intervals, later determined to be on the order of five seconds (Hilgard and Bower, 1966). All of these factors lend further credence to the supposition that consolidation can provide a powerful basis for learning.

Human credit assignment

Factoring the needs of a biological organism into the credit assignment question adds a number of constraints. These needs appear to have biased humans away from a rigid approach and toward a more flexible learning system sensitive to factors such as importance and recency. These biases in learning are often interpreted as being flaws (Nisbett and Ross, 1980; Allman, 1985) because they are departures from pure rationality, but when viewed from an adaptive perspective they appear to be reasonable adaptations to the difficulties of survival. Combining them, it is possible to draw a general portrait of human credit assignment.

The basis for human credit assignment appears to lie in the learning of sequences. A learning rule for sequence learning is based upon contiguity: linkages are made between things that are experienced close together in time. Aside from the apparent theoretical advantages of such a system, there is psychological evidence that humans learn in such a fashion (Hull, 1943; Hilgard and Bower, 1966), as well as the evidence provided by the consolidation data. The learned sequential relationships are not simple, however, but are weighted by a number of factors relating to adaptive issues such as safety. Among these are repetition, a factor which emphasizes the familiar over the novel, and importance, a factor which presupposes that the organism is capable of differentiating the relative degree of importance in certain key situations and which appears to be related to arousal.

Contiguity, repetition and importance will be emphasized throughout the rest of this dissertation, both at the molecular level, the level of neurons, and at the molar level, the level of behavior. The next chapter presents a model which not only accounts for consolidation, but which also specifies the role of contiguity and repetition at the molecular level; in this case consolidation is the molecular repetition of a contiguity learning rule. A significant portion of the dissertation will be devoted to showing that the learning system gains a great deal of its power from its ability to vary the amount of repetition of neural firing, which automatically impacts the strength of learning.

Aside from specifying the model in detail, later chapters will deal with how the human cognitive architecture as a whole can automatically detect importance. The result is a learning system which implements a domain independent credit assignment system thus affording the cognitive system enormous flexibility in learning.