Human Brain Imaging During Musical Activity

by John T. Tennison, MD, Principal Investigator and Postdoctoral Fellow in Human Brain Mapping

Research Imaging Center, University of Texas Health Science Center, San Antonio

General Considerations When Designing Experiments to Image the Brain During Musical Activities

There are several ways that we could think about patterns of brain activity associated with musical tasks. For example, if I were going to design an experiment to image the brain during a task involving rhythm, I could frame the task in any one of the following six ways:

1. The activity pattern associated with a discrete change, involving a comparison of two features we are attending to. An example would be comparing one discrete rhythm to another discrete rhythm, with a possible pause between the two rhythms. This might be an extreme case of number two below, and might yield similar activations.

2. The activity pattern associated with a gradual change in a feature we are attending to. An example would be an ongoing rhythmic motif that gradually changes over time, rather than an abrupt stopping of a rhythm, followed by another comparison rhythm, as in case number one.

3. The activity pattern associated with the complexity of a feature we are attending to. An example would be having a rhythm become successively more complex in terms of how many sub-beats can occur in a given period of time. That is, if I use divisions as small as 16th notes and 16th rests I can construct a more complex pattern as compared to only using 8th notes and 8th rests.

4. The activity pattern associated with a defiance of expectations about how a feature will change. An example would be hearing a different rhythm than we expected to hear given the syntax of the established context or given a mismatch between a written score and deviant performance of that score.

5. The activity pattern associated with the analytical discrimination of a feature. Such discrimination tends to occur more often when a feature first comes into awareness, but can occur even if a feature has been repetitively present for some time. An example would be ascertaining analytically or reductionistically the nature of a rhythm. This process is often more intellectual, logical, linguistic, quantitative, or visual than number six below. It might involve counting, visualizing musical notes or geometric representations of musical notes or events. After making initial discriminations, many listeners move toward listening more holistically as in number six below.

6. The activity pattern associated with holistic experience of the feature (known by some expert listeners as a "flow state" or being "in the zone"). An example would be allowing yourself to experience the gestalt or overall feel of a rhythm, or go into a trance state as you let yourself metaphorically "become one" with the heard rhythm. Some people go into this state of perception first or as a default state, or sometimes such a state comes after an analytical discrimination phase, especially for those who are in the habit from training (or otherwise) to analyze what they hear.

In all six of these examples, a researcher might claim to be studying what the brain does during the "perception of musical rhythm." However, in each of the six cases above, the subject is engaged in a different activity, even though all of them relate to "rhythm." The possibility that musical rhythm alone could have six or more different patterns of brain activity, depending on which of the six tasks is involved, points to the complexity of studying the brain during musical activities. When we take into account other musical qualities (melody, harmony, tonality, timbre) that could be measured in any one of the above six ways, the possibility for differing brain activations increases dramatically.
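The contrast in case three between an 8th-note grid and a 16th-note grid can be made concrete with a rough count. The sketch below assumes a single measure of 4/4 and treats every grid position as either a note onset or a rest, ignoring ties, dynamics, and note lengths; the function names are illustrative only.

```python
from itertools import product

def onset_patterns(slots_per_measure):
    """Return an iterator over every note/rest pattern on a fixed grid.

    Each pattern is a tuple of 1s (note onset) and 0s (rest).
    Assumption: a slot holds either an onset or a rest, nothing else.
    """
    return product((0, 1), repeat=slots_per_measure)

def pattern_count(slots_per_measure):
    """Number of distinct note/rest patterns on the grid."""
    return 2 ** slots_per_measure

# One measure of 4/4 (an assumption): 8 eighth-note slots vs. 16 sixteenth-note slots.
eighth_grid_patterns = pattern_count(8)
sixteenth_grid_patterns = pattern_count(16)
print(eighth_grid_patterns, sixteenth_grid_patterns)
```

Under these simplifying assumptions, halving the smallest subdivision squares the number of available patterns, which is one way to see why rhythmic complexity could be varied parametrically.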

Dealing with the Problem of Categorizing a Subject's Extent of Musical Ability

Many brain imaging studies divide subjects into subgroups according to each subject's known musical experience or formal training. Such studies run the risk of labeling someone a "non-musician" simply because of a lack of formal training, even though that person might have musical ability superior to that of someone labeled a "musician." The ONLY foolproof way to deal with this problem is to use some form of standardized testing to objectively assess each subject's musical ability across several modalities: rhythm, harmony, melody, etc. This approach is more time-consuming, but if it is not done, we run the risk of measuring brain activations in highly musical brains, yet mistakenly reporting the activations as being from "non-musicians," or measuring activations in relatively unmusical brains and reporting such activations as being from "musicians."

Comparisons of Language and Music

Semantics

There is much variability in opinion as to how or whether ideas and terminology from linguistics apply to music. For example, to the average linguist, "semantics" might refer exclusively to meaning as conveyed by a word or by a group of words. However, to a philosopher or cognitive psychologist, semantics might refer to the study of meaning in general, and of the ability of ANY object, stimulus, or subjective experience to "mean" something, regardless of whether or not this chain of meaningfulness contains linguistic content. I personally prefer this second, more general definition for the word "semantics." Moreover, if we adopt this second definition for semantics, we can surely say that music possesses semantic content.

For example, every time I hear the theme from James Bond, I think of images of James Bond films. It is true that such images might not be specific from one person to another, but this is only because we have heard the theme accompanying many different images. The theme just as easily could have been associated with a single distinct image that would be as specific and precise as the definition of a word, with the result that the theme could potentially conjure up a very specific image across many subjects. Moreover, the James Bond theme elicits recall of James Bond images without any thought or experience of words, language, or linguistic processes. That is, semantic recall can occur with no involvement of language or words. Thus, if semantics is defined as the study of meaning in general, then meaning appears to refer simply to the associations between entities, be they linguistic or not.

The word, "meaning," always implies the involvement of two or more entities, concepts, or objects. That is, one entity, concept, or object cannot have meaning unless it is associated or related to at least one other entity, concept, or object.

Semantics also relates to the concepts of "familiarity" and "recognition", in that the more familiar we are with an entity, the stronger will be its associations, and thus its meaning. However, this is not to say that semantics is synonymous with "familiarity" or "recognition," in that we can recognize an object as familiar (A.K.A. the deja vu experience) without necessarily relating or associating it with another entity. For the sake of this paper, I will restrict my definition of semantics to involve an association between one recognized entity and one or more other entities. Thus, for a semantic process to occur, recognition and familiarity must first be present, followed by an association, at which point "meaningfulness" has occurred.

Moreover, the magnitude of "meaning" is proportional to our ability to recognize current stimuli or experiences and is proportional to the subsequent relationships to past stimuli or experiences that result. Therefore, another very general way of defining semantics would be to say that semantics deals with the association of one experience with one or more other experiences.

Perhaps recognition is nothing more than having sufficiently strong brain activations in certain areas that are already active even with novel stimuli. That is, perhaps there are no additional neural analyses or circuits involved in "recognition", but simply a greater magnitude of activation in the areas that are already active when we are presented with a novel, unrecognized stimulus. However, this still leaves open the question of how certain entities are remembered as being related only to specific other entities.

Brain areas implicated in semantics, and with the experience of "familiarity" and "recognition" include: Left and Right BA 45 and 46; Left BA 47; Left BA 19; Left BA 24; and Left and Right BA 22. Future imaging studies might be able to test the idea of whether the experience of "recognition" or "familiarity" is a distinct process from the association or relating of the recognized entity to other entities.

The word "content" often implies semantics or meaning. Ultimately, what words "mean" can only come from associating them with experiences. Associating words with each other is only an abstraction that does NOT progress across space or time to the original experience or group of experiences that have given a word its meaning, which is only to say its associations.

Grammar

Other linguistic terms that might be applied to music include "grammar" and "syntax." To determine the extent to which these terms apply to music, we might start with a layman's definition found in the American Heritage Dictionary.

The American Heritage Dictionary has several definitions for grammar, each of which can be related to music. The first definition of grammar is "The study of how words and their component parts combine to form sentences." Certainly, such structural relationships exist in music. For example, starting at the scale of individual sine-wave frequencies, we could say that such frequencies combine to form the timbre of notes; these notes combine to create larger structures of phrases over time and harmony over space. Phrases combine to create melodies with their respective harmonic accompaniment; and a given melody (with its associated harmony) can be a part of a potentially infinite work of music.

Just as in language, there are expectations and conventions as to how one might combine musical elements to arrive at more complex structures. These expectations and conventions are determined in large part by environmental factors, including cultural norms. However, many have theorized that biological influences probably have a strong effect on how we decide to combine musical elements. For example, it is clear that the first 16 partials in the natural overtone series (integer multiples of a given fundamental frequency) can combine to create harmonies that are conventionally considered desirable or pleasing in Western music. It is possible that we have evolved as a species with a built-in preference for combinations of these frequencies, especially if the recognition of these combined frequencies resulted in increased survival and reproductive potential. Moreover, there are also conventions and expectations as to desirable intervals in melodies and desirable rhythms for melodies. Thus, just as in language, music is filled with particular structural expectations.
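As a small illustration of the overtone series mentioned above, the following sketch lists the first 16 partials of a fundamental and how far each lies above it in equal-tempered semitones. The 110 Hz fundamental and the function names are assumptions chosen only for the example.

```python
import math

def partials(fundamental_hz, count=16):
    """Frequencies of the first `count` partials: integer multiples of the fundamental."""
    return [n * fundamental_hz for n in range(1, count + 1)]

def semitones_above_fundamental(partial_number):
    """Distance of the n-th partial above the fundamental, in equal-tempered semitones."""
    return 12 * math.log2(partial_number)

# Assumed fundamental of 110 Hz, for illustration only.
for n, freq in enumerate(partials(110.0), start=1):
    print(n, freq, round(semitones_above_fundamental(n), 2))
```

Partials 2, 4, 8, and 16 land exactly on octaves, while partials such as 3 and 5 fall close to, but not exactly on, equal-tempered pitches.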

The second definition used by the American Heritage Dictionary for grammar is "The study of structural relationships in language or in a language, sometimes including pronunciation, meaning, and linguistic history." This more general definition can also be related to music. For example, there are expectations as to the "right" or conventional way to execute a played note on a given musical instrument. Moreover, I have already discussed the semantic content possible in music. Lastly, music theory is really the study of the history of the structural relationships of previously written music.

The third definition used by the American Heritage Dictionary for grammar is "The system of inflections, syntax, and word formation of a language." Once again, we can see how music has its share of expectations regarding how notes or other musical elements are inflected. And once again, word formation can be thought of as analogous to elements in music combining to form more complex structures. With regard to "syntax," see the section on syntax below.

The fourth and last definition used by the American Heritage Dictionary for grammar is "The system of rules implicit in a language, viewed as a mechanism for generating all sentences possible in that language." Once again, if we restrict ourselves to a given set of note frequencies, number of notes, and possible note lengths, we can generate every possible melodic permutation.
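The generative idea in the fourth definition can be sketched in a few lines. The example below assumes a tiny, made-up vocabulary of three pitches and two note lengths, allows notes to repeat (so strictly these are sequences rather than permutations), and enumerates every melody of a fixed number of notes.

```python
from itertools import product

def all_melodies(pitches, durations, length):
    """Return an iterator over every melody of `length` notes.

    A note is a (pitch, duration) pair; a melody is a tuple of notes.
    Repeated notes are allowed.
    """
    note_vocabulary = list(product(pitches, durations))
    return product(note_vocabulary, repeat=length)

# Hypothetical vocabulary, chosen only to keep the output small.
pitches = ["C4", "D4", "E4"]
durations = [0.5, 1.0]  # in beats
melodies = list(all_melodies(pitches, durations, 3))
print(len(melodies))  # (3 pitches * 2 durations) ** 3 notes
```

The count grows as (pitches x durations) raised to the number of notes, so exhaustive generation is only practical for very restricted vocabularies.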

Thus, it seems clear that music has its analogues for every aspect of grammar in language. Whether the same patterns of brain activity occur during the analogous musical process remains to be seen.

Syntax

If we turn to layman's definitions for syntax, we find that the first definition used by the American Heritage Dictionary for syntax is "The study of the rules whereby words or other elements of sentence structure are combined to form grammatical sentences." As previously shown in the grammar section, music has many examples that reveal how simpler entities are combined to form more complex structures.

The second definition used by the American Heritage Dictionary for syntax is "A publication, such as a book, that presents such rules." Obviously, this definition presents no challenge to being applied to music.

The third and fourth definitions used by the American Heritage Dictionary for syntax are "The pattern of formation of sentences or phrases in a language" and "Such a pattern in a particular sentence or discourse." These definitions continue to point to the idea of structure or pattern. Just as in a work of language, a work of music has an overall structure or pattern that can be analyzed in numerous ways.

The fifth definition used by the American Heritage Dictionary for syntax comes from Computer Science. It is "the rules governing construction of a machine language." Once again, a given style of music has a set of implicit rules which are consciously or unconsciously utilized in order to create a new piece of music which still sounds like it has been written in the style intended by the composer. In fact, if the rules were not followed to a high enough degree, a given piece of music would not be identifiable as belonging to a certain style or genre of music.

The sixth and final definition used by the American Heritage Dictionary for syntax is "A systematic, orderly arrangement." Obviously, music can be and usually is arranged in a systematic and orderly way.

The Intersection of "Grammar" and "Syntax": Structure and Expectations

As can be seen from the layman's definitions of grammar and syntax, both are often associated with the idea of an overall pattern which is built from simpler elements. Moreover, both syntax and grammar are often associated with the idea that there are conventional expectations as to what the simpler elements will be or where these simpler elements will occur within the overall structure of a given musical style or cultural context. Since music possesses all of these structural qualities and expectations, I cannot think of any reason to say that music does not have syntax and grammar. Even when a single musical element of rhythm, melody, harmony, tonality, or timbre is isolated, it can still maintain syntax/grammar and semantic qualities. Moreover, the grammar/syntax and semantics of a given musical element can be manipulated independently of other musical elements.

For example, rhythm by itself has a grammar/syntax. That is, we have expectations about what the appropriate structure of rhythm should be, given a particular context. For example, if I collect together a set of musical notes and have them occur at certain places in time, this pre-existing structure implies appropriate places for the occurrence of additional notes that I, as a composer, might choose to add; the locations of the pre-existing notes also suggest how their positions might be varied. Furthermore, the structural context of a given tempo and time signature (meter) suggests places where notes could occur without sounding out of place. We have all had the unpleasant experience of listening to an ensemble where the placement of one player's notes was not "tight" with the rest of the ensemble, or, in the case of a complete beginner, where the notes played were drastically off from where it seemed they should have occurred. The fact that hearing such performances can be so dis-concerting (no pun intended) is good evidence for how strong our expectations of rhythmic structure are.

Moreover, if we were to consider harmony alone, the existence of one pitch implies other pitches which would be expected to harmonize well with this pre-existing pitch or "note." With regard to tonality, Carol Krumhansl has quantified the fact that tonal contexts establish certain expectations, which result in one note of a diatonic set (scale) being perceived as more or less stable when compared to other members of the scale. Lastly, a given timbral context can certainly establish an expectation of what other timbres might be appropriate to add to the first timbre, or establish expectations as to how the first timbre could be varied. Obviously, any of these musical elements could come to be involved in a semantic process in the way previously discussed for the theme from James Bond. That is, through the proper environmental training, musical sounds could come to have very specific meanings, so that hearing a particular snippet of music would have a very precise semantic content. In this case, it seems we would have a stimulus that would be functioning simultaneously as both language and music. Moreover, any language can be encoded by any combination of rhythm, harmony, and/or melody. For example, Morse code is a purely rhythmic encoding of language. That is, "meaning" or "semantics" is conveyed entirely by rhythm.
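To make the Morse code example concrete, here is a minimal sketch that turns text into a pure on/off rhythm. It uses the standard International Morse timing (a dot lasts one unit, a dash three units, with gaps of one unit within a letter, three between letters, and seven between words); the function name and the choice to represent the rhythm as a string of 1s and 0s are assumptions made for illustration, and only the letters A to Z are handled.

```python
MORSE = {
    "A": ".-",   "B": "-...", "C": "-.-.", "D": "-..",  "E": ".",
    "F": "..-.", "G": "--.",  "H": "....", "I": "..",   "J": ".---",
    "K": "-.-",  "L": ".-..", "M": "--",   "N": "-.",   "O": "---",
    "P": ".--.", "Q": "--.-", "R": ".-.",  "S": "...",  "T": "-",
    "U": "..-",  "V": "...-", "W": ".--",  "X": "-..-", "Y": "-.--",
    "Z": "--..",
}

def to_rhythm(text):
    """Encode text as a string with one character per time unit.

    '1' means sound and '0' means silence. Characters outside A-Z
    raise KeyError, since this sketch covers letters only.
    """
    encoded_words = []
    for word in text.upper().split():
        encoded_letters = []
        for letter in word:
            # Dot = 1 unit of sound, dash = 3 units, 1 unit of silence between them.
            symbols = ["1" if mark == "." else "111" for mark in MORSE[letter]]
            encoded_letters.append("0".join(symbols))
        # 3 units of silence between letters.
        encoded_words.append("000".join(encoded_letters))
    # 7 units of silence between words.
    return "0000000".join(encoded_words)

print(to_rhythm("SOS"))
```

Nothing in the output depends on pitch, timbre, or harmony: the entire message is carried by the durations of sound and silence.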

Thus the three terms, semantics, syntax, and grammar, can not only be applied to a music piece as a whole, but also to the individual elements of music, such as rhythm, melody, harmony, tonality, and timbre.

Structure in General

As previously stated, syntax and grammar relate to how well a set of objects go together or integrate to form a whole. In fact, "structure" only comes about because one object is relating to another object or objects. It makes no sense to speak of the structure of a single entity, because structure always implies reduction into parts, OR implies how that object relates to other objects.

The Idea of "Rule-Governed" Systems

Another idea associated with grammar/syntax and expectation is the idea of being governed by "rules." For example, it is often said that language is "rule-governed" and that music has its analogous moments when it is governed by rules. However, when we say that something is "rule-governed" we suggest that there is a discrete set of rules that exists unto itself. In many instances, however, the rules are unknown, or might not exist at all. The speakers of a language or performers of music are not aware of following rules. In fact, if someone is improvising music, there might not be any rules derivable at all. As a performer falls into a set of habits or trends or repetitive elements, rules become derivable. Thus, rules are derived from past repetitions or periodicities. Without prior repetition, no rules could be derived. For this reason, the term "implicit rules" is sometimes used. Still, it might be better to say that language and music have "conventions" which result in expectations, rather than "rules." Saying that something has implicit rules is really another way of saying that the system has certain habits or trends in its present behavior. For example, music theorists often derive "rules" after having analyzed the trends in a style or collection of music that was composed or performed in the past, sometimes when there was no intention to consciously follow a set of "rules." "Rules" are often generalized statements derived from what has already occurred. Thus, if a new style of music is occurring in the present, the "rules" often cannot be known until a sufficiently large and consistent sample of that style of music has been produced.

Music can be thought up or improvised on the spot and still be considered "music." This is especially true if we adopt the liberal definition of "music" associated with John Cage: that music is simply the collection of sounds that occur when we choose to listen to our environment's auditory stimuli as "music," as opposed to random noises. This is one way in which music can divorce itself from the properties of semantics, syntax, and grammar. If someone were to try making up language on the spot, it would cease to have its predictable semantics, syntax, and grammar, which are required to satisfy the purpose of communicating specific entities or relationships to specific entities.

Generalizing Linguistic Terminology

Interestingly, the terms "semantics," "syntax," and "grammar" appear to be applicable to numerous other entities besides language and music. For example, although music and language occur in the auditory domain, there is no reason at all why semantic, syntactic, and grammatical qualities cannot be described for visually perceived entities, such as the geometry of a pleasing style of architecture. In fact, we can generalize semantics, syntax, and grammar to ANY sensory modality, any emotion, any experience, or any non-linguistic cognitive process. For example, there is a syntax/grammar of social relations, such as the etiquette popularized by Emily Post. That is, certain behaviors are considered to be appropriate or to meet expectations in the context of a given social situation.

Other examples of visual syntax/grammar could include gestalt expectations of completion; what clothes match well; what style of building might go well with other buildings. The list is clearly endless. Obviously, we can think of tactile grammar/syntax, such as what are considered appropriate ways of being touched by others. We can also imagine a grammar/syntax for smell. Certain odors are to be expected in the bathroom. These same odors would not be expected or welcome in our refrigerator.

Review of Neuroscience Literature Pertaining to Harmony and Tonality with Suggestions for Future Studies

Patel's 1998 Study

Patel (1998) used musical stimuli involving 3 measures of chords in a given key. The chords in the progression are all within the beginning key, with the occasional exception of the first chord of the second measure, which can be in one of the following three keys:

1. The same key as the rest of the chords, providing the greatest degree of satisfaction of listener expectations.

2. A nearby key, which only partially satisfies listener expectations.

3. A distant key, which is the least expected, and thus satisfies listener expectations to the least degree.

Thus, Patel has established three gradations of satisfaction of tonal expectations.
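One common way to put a number on "nearby" versus "distant" keys is the count of steps separating two major keys on the circle of fifths. The sketch below illustrates that measure only; the text above does not state that this is how the keys in these stimuli were chosen, and the function and table names are hypothetical.

```python
# Pitch classes for major-key tonics (sharps only, for simplicity).
PITCH_CLASS = {
    "C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
    "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11,
}

def fifths_distance(key_a, key_b):
    """Steps between two major keys on the circle of fifths (0 to 6)."""
    # Multiplying a pitch class by 7 (mod 12) gives its position on the circle of fifths.
    position_a = (PITCH_CLASS[key_a] * 7) % 12
    position_b = (PITCH_CLASS[key_b] * 7) % 12
    clockwise = (position_b - position_a) % 12
    return min(clockwise, 12 - clockwise)

for target in ["C", "G", "D", "F#"]:
    print("C ->", target, fifths_distance("C", target))
```

A measure like this would let the three conditions be treated as points on a scale rather than as unrelated categories.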

Patel's Studies (1998?)

Patel: "The principal finding was that the late positivities elicited by syntactically incongruous words in language and harmonically incongruous chords in music were statistically indistinguishable in amplitude and scalp distribution in the P600 latency range (i.e. in a time window centered about 600 msec posttarget onset). This was true at both moderate and high degrees of structural anomaly, which differed in the amplitude of elicited positivity. This strongly suggests that whatever process gives rise to the P600s is unlikely to be language-specific."

It also is possible that the P600 is not specific to either language or music. One possibility is that it is a general indicator of non-congruity.

In his language experiment, Patel uses three conditions of increasing integration difficulty. In each of the conditions, words seem increasingly out of place, but the overall semantic content appears to remain intact.

Patel: "There is a P300 phenomenon where physically odd elements (such as a rare high tone in a series of low tones) [elicit] a positive-going waveform of shorter latency (but similar scalp distribution) to the P600. The P600 could be a type of P300 whose latency is increased by the time needed to access stored structural knowledge."

Patel: "Interesting and unexpected subsidiary finding of this study was that harmonically unexpected chords in music elicited a right antero-temporal negativity, or RATN, between 300 and 400 msec posttarget onset. Although different from the better known negativities elicited by language stimuli ("The semantic N400"), the RATN is somewhat reminiscent of the left anterior negativity, or LAN, a hemispherically asymmetric ERP component associated with linguistic grammatical processing."

Patel: "As neural studies of syntactic processing proceed, the question of specificity to language should be kept in mind because language is not the only domain with syntax."

This last statement by Patel is consistent with my previous examples, in which I have shown that numerous other entities besides language and music can have syntactic and semantic content.

Mireille Besson's Studies

Besson: "emitted potential is elicited by the omission of both an expected word within a sentence context and an expected note within a musical phrase." Is it a P600, or something distinctive?

Mireille Besson (CNRS, Marseille) (who will present at the Satellite Symposium of the Annual Conference of the Organization for Human Brain Mapping) states:

"Taken together, these results suggest that while processing the semantic aspects of language requires computations that are specific to the language domain, other aspects of language processing, such as syntax and prosody, depend upon some general principles of human cognition." Like Patel, Besson notes that the P600 ERP component is elicited by both syntactic and harmonic incongruities. Has anyone done an experiment to check for the possibility that the P600 component is perhaps involved with perceived incongruities in general, and not just those confined to language and music?

I disagree with Besson that music does not contain semantic content. Besson states that the N400 is elicited by semantically incongruous words. Has anyone checked to see if an N400 can be elicited with semantic incongruities in music? For example, if I played the James Bond theme while showing a subject pictures from Star Wars, might I get something similar to an N400?

Besson states that the P600 appears with the presentation of wrong notes and wrong chords in a musical context.

Herve Platel's Study

Platel measured areas of activation for 4 different conditions. Each task had 30 sequences: in 15 of them a given element changed, and in the other 15 that same element did not change.

Platel's "Familiarity" Condition

    Stimuli: 15 sequences with semantic anchorage or clue in the form of a melodic contour, a rhythm, or both (most information):

    Activations: Familiarity (minus? vs.) pitch and rhythm:

    Left inferior frontal gyrus: BA 47
    Left and right superior temporal: BA 22
    Left middle occipital gyrus: BA 19
    Left anterior cingulate: BA 24
    Right internal pallidum: BA

Platel's "Change in Timbre" Condition

    Stimuli: Fifteen sequences: some containing a change from one brighter timbre to another in alternating notes and other sequences which involved all the same timbre.

    Activations: Timbre vs. (pitch and rhythm):

    Right superior and middle frontal gyrus: BA 32/8
    Right middle frontal and precentral gyri: BA 4/6
    Left precuneus: BA 7/19
    Left middle occipital gyrus: BA 19
    Had most significant activations in right hemisphere.

Platel's "Change in Pitch" Condition

    Stimuli: Fifteen sequences involving a change in one pitch to another pitch

    Activations: Pitch vs. (timbre and rhythm)

    Left precuneus/cuneus: BA 18/19
    Left superior frontal gyrus: BA 9/8

Platel's "Change in Rhythm" Condition

    Stimuli: Fifteen sequences containing a rhythmic irregularity

    Activations: Rhythm vs. (pitch and timbre)

    Left insula: BA 13-16
    Left inferior Broca: BA 44

Among other things, Platel studied the pattern of activation associated with a change in timbre. His experiment was very simple, in that it used only two timbres and alternated between them.

However, the activations he saw could just as easily have been those associated with differences in the degree of timbral complexity present. That is, the "change-in-timbre" activations might not be identical with those associated with the degree of timbral complexity, i.e. the number and amplitude of overtones in a given stimulus. The perception of the degree of timbral complexity is not the same as the perception of a change in timbre. Thus, I propose a parametric study in which the stimuli are varied from simple sine waves (the simplest timbre possible) through progressively richer timbres built by harmonic synthesis, adding one overtone at a time at standard integer multiples of the fundamental. It is probably a good idea NOT to diminish the amplitude of the added overtones relative to the fundamental, as this will result in the greatest perceived difference between timbres with many harmonics and those with few harmonics or just the fundamental. It might be wise to use only the first 16 harmonics, as they are easily generated and approximate our own chromatic divisions of frequency, and since they are louder in natural sounds on average than the harmonics that come after them.
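A stimulus set of the kind proposed above could be generated along the following lines. This is only a sketch under stated assumptions: the 220 Hz fundamental, 50 ms duration, 44.1 kHz sample rate, and function name are all illustrative, and each waveform is divided by its number of harmonics to keep samples within the range -1 to 1. That scaling applies equally to every partial, so the overtones remain at the same amplitude as the fundamental, though it does not equalize perceived loudness across stimuli.

```python
import math

def additive_tone(fundamental_hz, n_harmonics, duration_s=0.05, sample_rate=44100):
    """Sum of equal-amplitude sine partials at integer multiples of the fundamental.

    Returns a list of samples in the range [-1, 1].
    """
    n_samples = round(duration_s * sample_rate)
    samples = []
    for i in range(n_samples):
        t = i / sample_rate
        total = sum(
            math.sin(2 * math.pi * k * fundamental_hz * t)
            for k in range(1, n_harmonics + 1)
        )
        # Divide by the number of partials so the sum cannot exceed full scale.
        samples.append(total / n_harmonics)
    return samples

# One stimulus per complexity level: 1 harmonic (pure sine) up to 16 harmonics.
stimuli = {n: additive_tone(220.0, n) for n in range(1, 17)}
print(len(stimuli), len(stimuli[1]))
```

Each entry in `stimuli` differs from the previous one by exactly one added overtone, which is what would allow timbral complexity to be treated as a parametric variable.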

It is important to remember that timbre is related to harmony, as timbre usually involves a series of overtone frequencies which fuse together to form the perception of a single pitch. Harmony, by contrast, results from the way different series of overtones combine to form harmonic moods and, for some (usually untrained) ears, a single pitch, most often the pitch of the fundamental of the highest note. The degree of fusion for overtones or for member notes of a chord varies from individual to individual and, through training, individuals can come to perceive individual pitches where only one pitch might have previously been perceived. The best-trained ears can consciously choose to attend either to the fused composite of a set of stimuli (i.e. listening for timbre), or to the individual frequencies present in a stimulus (i.e. listening for harmony).

Brain, 1998 article: by Catherine Liegeois-Chauvel et al.

Findings:

Unilateral temporal cortectomies for intractable epilepsy:

Right temporal cortectomy: impairs contour and interval information in perceiving melody.

Left temporal cortectomy: only interval information impaired.

Thus, the underlying significance of superior temporal gyrus for melody on BOTH sides.

In temporal dimension, we found dissociation between metre and rhythm: critical involvement of anterior part of STG in metric processing.

Peretz's (1990) stroke patient study showed a bias towards pitch processing in the right hemisphere; however, a substantial contribution from the left hemisphere was documented as well.

Definition of Contour: the succession of pitch directions. It is a "digital" description involving only "Up" or "Down" relative to the note that occurred before the present note under consideration.
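The distinction between contour and interval can be shown with a short sketch. It assumes melodies are written as MIDI note numbers, and it marks a repeated note with "=", a case the up/down definition above does not cover; the function names and the example melody are illustrative only.

```python
def intervals(pitches):
    """Signed size, in semitones, of each step from one note to the next."""
    return [later - earlier for earlier, later in zip(pitches, pitches[1:])]

def contour(pitches):
    """Direction of each step: 'U' (up), 'D' (down), or '=' (repeated note)."""
    directions = []
    for step in intervals(pitches):
        if step > 0:
            directions.append("U")
        elif step < 0:
            directions.append("D")
        else:
            directions.append("=")  # assumption: not part of the up/down definition
    return "".join(directions)

original = [60, 62, 64, 62, 67]           # hypothetical melody as MIDI note numbers
interval_changed = [60, 62, 65, 62, 67]   # one note altered, directions preserved
contour_changed = [60, 62, 60, 62, 67]    # one note altered, a direction reversed

for melody in (original, interval_changed, contour_changed):
    print(contour(melody), intervals(melody))
```

The second melody differs from the first in interval size but not in contour, while the third differs in both, which mirrors why a contour change cannot be made without also changing an interval.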

The Catherine Liegeois-Chauvel group made several manipulations:

1. Contour Violation (melody violation) detection: affected by right-sided cortectomy, but not left

2. Key Violation

3. Interval Violation (with contour preserved) (melody violation) - affected by both right and left-sided cortectomy

4. Rhythm Violation

5. Meter Condition

Since interval violation detection can be interfered with by right or left cortectomies, it is clear that THIS aspect of melody processing uses both sides of the brain, and is not a predictably unilateral function.

There was no cortectomy that affected only contour while not affecting interval.
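The difference between a contour violation and an interval violation with contour preserved can be sketched as follows. This is a hypothetical illustration, not the stimulus-generation procedure of the study: pitches are assumed to be MIDI note numbers, the function names are mine, and the key, rhythm, and meter conditions are not modeled.

```python
def directions(pitches):
    """Direction of each step: +1 up, -1 down, 0 repeated."""
    return [(b > a) - (b < a) for a, b in zip(pitches, pitches[1:])]


def intervals(pitches):
    """Signed size of each step in semitones."""
    return [b - a for a, b in zip(pitches, pitches[1:])]


def classify_change(original, altered):
    """Classify how `altered` differs melodically from `original`."""
    if len(original) != len(altered):
        raise ValueError("melodies must have the same number of notes")
    if intervals(original) == intervals(altered):
        # Identical or merely transposed: same intervals, same contour.
        return "no melodic violation"
    if directions(original) != directions(altered):
        return "contour violation"
    return "interval violation (contour preserved)"


if __name__ == "__main__":
    melody = [60, 64, 67, 65, 62]
    print(classify_change(melody, [60, 64, 69, 65, 62]))  # third note raised
    print(classify_change(melody, [60, 64, 62, 65, 62]))  # third note now below the second
```

Because any change of direction also changes an interval, a contour violation is always an interval violation as well, whereas the reverse does not hold.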

Peretz (1990)

Peretz put forth a model of hierarchical processing in which she suggests that contour processing occurs first on the right side, and that this information is then passed to the left for interval processing. Thus, in this model, the right side does not process the magnitude of intervals per se, but interval processing can be interfered with by preventing the contour processing of the right side from occurring.

Mazziotta, 1982:

Glucose metabolism goes up in right posterior-superior temporal region for discrimination of melodic sequences varying by a single note.

Zatorre, 1994:

Increased activation in right frontal and right superior temporal gyrus when exposed to pitch changes at beginning and end of melody sequences.

It is not the whole right temporal lobe, but the right superior temporal gyrus which is critical in melody processing, in particular the contour of melody. Moreover, posterior portion of T1 seems to be more important.

Parsons, Fox, and Hodges' Melody, Harmony, and Rhythm Study

Cerebellum: active during melody, harmony, and rhythm. The fact that it was active for melody and harmony is relatively new.

Subjects detected errors in the performance of the musical score that they were reading. What was heard defied what was expected. Did they know the kind of error that they were going to hear?

Score-reading and listening for melody, rhythm, and harmony were supported by processes in both cerebral hemispheres.

Melody activates each hemisphere equally. Perhaps the right hemisphere was processing contour, while the left side was processing interval magnitude.

Harmony and rhythm activate left more than right.

Melody, harmony, rhythm: all three strongly activated the cerebellum. 80% in lateral hemispheres; 20% in anterior hemispheres

How do we know there was not implicit movement? If not, then yes, cerebellum seems to be more involved in sensory/cognitive processing.

Rhythm condition: activated cerebellum at twice the level of harmony and melody. Once again, how do we know there was not implicit movement?

Cerebellar lesions impair perception of temporal order (rhythm?), i.e. does perception of rhythm always involve temporal order?

Melody, Harmony, rhythm: all activated a single area in right fusiform gyrus: visual processing of musical notes. (Left fusiform does visual word processing.)

All tasks activated BA 6, but at different Z heights in the brain: Rhythm was on top, harmony in the middle, and melody was inferior.

All three tasks activated inferior lateral frontal cortex (BA 44/45), with activations being stronger on the left. Activation here was strongest for melody, rather weak for rhythm, and least strong for harmony.

Inferior parietal (BA 40/39) activated bilaterally in each task. Right hemisphere: rhythm and melody conditions activated more strongly. Left hemisphere: harmony condition activated more strongly.

Melody condition: strong activation in bilateral superior temporal (BA 22) and bilateral, (strongly right), middle temporal areas (BA 21). Both are secondary auditory association areas.

No activation in temporal areas for Harmony.

Rhythm condition produced only minor left inferior (BA 20) and bilateral middle (BA 21) temporal activation.

Neuropsychology data show superior temporal lesions can impair melodic perception, but spare rhythmic perception. Parsons' data is consistent with this finding.

Melody condition: major activation in bilateral subgyral medial frontal area (BA 24/6), perhaps related to attentional processing. (But why would melody have attention processing here and not rhythm or harmony? Actually rhythm did have some activation in 24, but of lesser magnitude.)

Rhythm: only modestly-sized activation appeared outside the cerebellum. Rhythm areas outside the cerebellum: bilateral anterior cingulate (BA 24) - mostly left - likely related to attention.

Rhythm: also activated bilateral thalamus - possibly relaying sensory info between cerebellar and cortical regions.

Possible Future Studies on Harmony and Tonality

Unfortunately, the word "harmony" can be used in two senses: The first is when it is used to mean tonality, whereas the second is when it is used to mean the degree of harmony or dissonance among 2 or more simultaneously-sounded notes. It is this second definition which I favor for the word "harmony." When it comes to the expectations for the next chord in a harmonic progression, I believe the word tonality is less ambiguous.

For example, I could play a C# triad in the context of a C major piece. The C# triad would defy tonal expectations, i.e. it would defy expectations of tonality, but it would not be "dissonant." That is, in itself, it would be just as harmonious as a C-major chord, even though it defied our tonal expectations.

Thus, it makes sense to regard harmonic and tonal expectations as different from one another. Harmonic expectations deal with how well two or more fundamental frequencies blend with each other (i.e. the degree of dissonance or consonance of two or more notes sounded together), whereas tonality deals with how stable or unstable one or more notes is in a given diatonic context.

Thus, Patel was measuring tonality because the notes of distant-key chords were probably not dissonant with each other. That is, the degree of consonance probably did not differ between the distant-key and nearby-key chords. However, the distant-key chords were atonal, given the context in which they occurred. Thus, the changes in brain activity were probably related to tonality, and not harmony per se. However, to be certain, we should test single notes at a time, rather than chords, for their pattern of activity. If we got specific activities for these, we could then say with more certainty that it is tonality and NOT harmony that is being imaged.

Moreover, we could use these single-note activations as high-level controls to be subtracted from the activations of dissonant chords. For example, if we sound the two notes C and C# together, which is a dissonant harmony, we might be able to subtract out the activations for the tonality of either C or C# alone, so that the remaining activation is related to the presence of dissonance only, and NOT tonality. One question that arises here is: Would we have to subtract out both tonal activations, or could we simply subtract out the highest-magnitude tonal activation so that the remaining tonal activation is so low in amplitude that it gets buried in the noise floor? Whatever the case, the general idea here is that tonality and harmony activations are distinct and probably dissociable through the right experimental design, which might possibly involve subtraction.
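The two subtraction options raised in the question above can be written down as a toy sketch. Everything here is an assumption for illustration: activations are reduced to one number per named region, the region names and values are placeholders rather than measurements, and "highest-magnitude" is taken to mean the largest summed absolute activation. Real subtraction analyses operate on whole images with statistical thresholds.

```python
def subtract(task, control):
    """Region-wise difference task - control (regions absent from control count as 0)."""
    return {region: value - control.get(region, 0.0)
            for region, value in task.items()}


def subtract_all(task, controls):
    """Option 1: subtract every single-note (tonality) control from the chord activation."""
    result = dict(task)
    for control in controls:
        result = subtract(result, control)
    return result


def subtract_strongest(task, controls):
    """Option 2: subtract only the control with the largest summed absolute activation."""
    strongest = max(controls, key=lambda c: sum(abs(v) for v in c.values()))
    return subtract(task, strongest)


if __name__ == "__main__":
    # Placeholder numbers only; not data from any study.
    chord = {"region_a": 5.0, "region_b": 3.0}
    note_1 = {"region_a": 2.0}
    note_2 = {"region_a": 1.0, "region_b": 0.5}
    print(subtract_all(chord, [note_1, note_2]))
    print(subtract_strongest(chord, [note_1, note_2]))
```

Comparing the two outputs shows how much tonal activation is left behind when only the strongest control is removed, which is the quantity that would have to fall below the noise floor for the second option to be acceptable.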

With regard to possible studies on tonality, the work of Krumhansl is informative. Since Krumhansl established and quantified ALL 12 chromatic members of the Western tonal hierarchy in 1979, I suggest that we conduct a correlational study to determine whether brain activations correlate with these known quantities. If such a correlation is detectable, I would hypothesize that the pattern of brain activity would be proportional to the degree of resolution possessed by a heard tone within a given tonal context. For example, in the key of C, C has the most resolution, whereas C# and F# have the least. Thus, I would expect brain activations to be proportional to these differences. One possible title for such a study could be: "A PET Imaging Study of Neural Activity When Perceiving Tonal Hierarchies (in Musicians and Non-musicians)"
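As a minimal sketch of the correlational analysis proposed above, the following computes a Pearson correlation between two equal-length lists, for instance 12 stability ratings (one per chromatic tone) and 12 activation magnitudes from some region of interest. The choice of Pearson's coefficient is my assumption, and no rating or activation values are supplied here; both would have to come from published ratings and from the imaging data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / (spread_x * spread_y)
```

A value near +1 would support the hypothesis that activation rises with tonal stability, a value near -1 would indicate the opposite relationship, and a value near 0 would indicate no linear relationship.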

As I end my commentary, I would like to return to the six considerations that I mentioned at the beginning of my paper. These principles apply not only to music, but to brain imaging in general. If "good science" is to take place, it is important that we are imaging what we claim to be imaging. Thus, any experimental design should always take into account which of these 6 types of brain activity is actually being measured. Once again, the six "task types" or "task classes" are: discrete change, gradual change, complexity, defiance of expectations, analytical discrimination, and holistic experience.

Happy imaging! -- John Tennison, MD