Cyclic groups in music

One of the key concepts in music is that of pitch of a note: how shrill a note sounds. Pitch is based on human sensation, but it relates to a physical concept -- the concept of frequency. The frequency of a note is the number of times the corresponding wave repeats every second. Shriller notes, like a high-pitched opera song, correspond to higher frequencies: the sound repeats more often per second.

At its crudest, a musical tune is a sequence of notes (of varying pitch) played for given lengths. The power of a musical tune, in this simple sense, derives from the way the pitch, or frequency, varies. The patterns of this variation, and why certain manners of variation appeal more to the human ear, are somewhat mysterious. We'll see here how the theory of cyclic groups helps explain some of the mysteries.

Doubling the frequency
If the frequency of one note is double the frequency of another note, the two notes sound similar in a strong sense, except that one is shriller. Musicians say that two such notes are separated by one octave, with the shriller (higher-frequency) note being one octave higher. If we're singing a tune and reach a very shrill note, it's customary for us to move an octave down to continue the song. Similarly, if we reach a note that's too low, we customarily move one octave higher to continue the song.

In Western music notation, this is, for instance, the difference between a $$C3$$ and a $$C4$$: the $$C4$$ is one octave higher than the $$C3$$.

Why should doubling the frequency give a similar sound sensation? This is best viewed by thinking of frequency in terms of the number of repetitions per second. Suppose one note corresponds to a frequency of 760 repetitions per second, and another corresponds to 1520 repetitions per second. Then, if both notes are sounded simultaneously, every repetition of the lower note coincides with every second repetition of the higher note. In this sense the higher note contains the lower note, and so the two sounds match up.

This frequency-doubling effect can also be seen in light. Our visible spectrum of light ranges from violet (highest frequency) to red (lowest frequency), and the frequency of red is approximately half the frequency of violet. This may be why there is an uncanny similarity between violet and red.

Absolute and relative pitch
There is a crucial difference between the way we sense light and sound. With light, we are sensitive to absolute frequency: we can use the absolute frequency to judge the color of a beam of light. With sound, however, we're generally not good at sensing absolute pitch. It's hard to judge, by sounding a note in isolation, where on the scale it is.

The power of music comes more through its use of relative pitch: if two notes are played one after another, we can judge what the ratio of their pitches is. Thus, if we take a musical tune and multiply the frequencies of all notes by some constant factor, the tune still sounds like the same tune. (Of course, we do have some ability to judge absolute frequency as well: something very shrill sounds very shrill and something very low sounds very low.)
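As a minimal sketch of this invariance, the snippet below computes the ratios between consecutive notes of a tune before and after every frequency is multiplied by a constant. The three-note tune, the factor 9/8, and the helper name `interval_ratios` are hypothetical choices made purely for illustration.

```python
from fractions import Fraction

def interval_ratios(frequencies):
    """Return the ratio of each note's frequency to that of the note before it."""
    return [Fraction(b) / Fraction(a)
            for a, b in zip(frequencies, frequencies[1:])]

# Hypothetical three-note tune, in repetitions per second.
tune = [Fraction(440), Fraction(660), Fraction(550)]

# The same tune with every frequency multiplied by a constant factor.
transposed = [f * Fraction(9, 8) for f in tune]

print(interval_ratios(tune))        # [Fraction(3, 2), Fraction(5, 6)]
print(interval_ratios(transposed))  # [Fraction(3, 2), Fraction(5, 6)]
```

Exact fractions are used instead of floating-point numbers so that the two lists of ratios can be compared for equality without rounding noise.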

So a natural question is: what relative frequencies sound good in combination, and why? That's what we'll try to study using cyclic groups.

First step: viewing as logarithms
The group we are interested in is essentially the group of possible ratios of frequencies, under multiplication. This is just the group of positive reals under multiplication. However, we observed that any frequency sounds a lot like its double, so we'd like to quotient the group of positive reals by the subgroup consisting of the integral powers of $$2$$ (that is, the subgroup generated by $$2$$).

It may be more convenient to view this logarithmically. By taking logarithms to base 2, we can identify the group of positive reals under multiplication with the group of all reals under addition. Under this identification, the powers of $$2$$ become the integers, so what we now want to do is identify any two numbers that differ by an integer. In effect, we're quotienting the additive group of reals, $$\mathbb{R}$$, by the additive group of integers, $$\mathbb{Z}$$, to get the group of reals mod 1. This group represents the interval of an octave: from a given note to the note one octave higher.
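The passage from frequency ratios to reals mod 1 can be sketched in a few lines. The function name `pitch_class` is an assumption introduced here for illustration; it simply takes the base-2 logarithm and discards the integer part.

```python
import math

def pitch_class(ratio):
    """Map a positive frequency ratio to its position in the octave, in [0, 1)."""
    if ratio <= 0:
        raise ValueError("frequency ratios must be positive")
    return math.log2(ratio) % 1.0

# Ratios that differ by a power of 2 land on the same point of the octave.
print(pitch_class(1), pitch_class(2), pitch_class(8))

# The ratios 3/2, 3 and 6 differ only by octaves, so they agree up to rounding.
print(pitch_class(3 / 2), pitch_class(3), pitch_class(6))
```

Because the results are floating-point numbers, positions that are mathematically equal should be compared with a tolerance (for example `math.isclose`) rather than with `==`.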

The big question
What isn't clear once we take logarithms is: why does the interval of an octave get divided into 12 parts? In other words, why do we have 12 notes to an octave? And why, further, are there seven natural notes, some sharp notes, and some flat notes? The answer lies in a somewhat mathematically imperfect numerical coincidence. Fortunately, the mathematical imperfection of this coincidence is too small for our ears to notice, and so we can enjoy music. To understand the answer, we need to step back from the logarithms and go back to looking at frequency ratios.

We discussed a little while ago why, if one frequency is twice the other, the two notes sound similar. Essentially, it is because each time the lower-frequency note repeats, so does the higher-frequency note (though the higher-frequency note also repeats an extra time in between). What happens if one frequency is 3/2 times another? Then every third repetition of the higher-frequency note coincides with every second repetition of the lower-frequency note. Thus, even though neither note contains the other, there is a harmony between them that makes them particularly pleasing to the ear when combined. This combination is termed the perfect fifth in music.
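To see which repetitions of two notes coincide, one can list the exact instants at which each wave begins a new repetition and look for common instants. The sketch below does this for the 760 and 1520 pair used earlier, and for 760 and 1140, where 1140 is an assumed value chosen here because it is 3/2 times 760. The helper names are hypothetical.

```python
from fractions import Fraction

def repetition_times(frequency, duration):
    """Exact instants (in seconds) at which a wave of this frequency begins a repetition."""
    period = Fraction(1, frequency)
    count = int(duration / period)  # whole repetitions that fit in the window
    return [k * period for k in range(count + 1)]

def coinciding_repetitions(low, high, duration):
    """Pairs (i, j): repetition i of the lower note starts together with repetition j of the higher."""
    high_index = {t: j for j, t in enumerate(repetition_times(high, duration))}
    return [(i, high_index[t])
            for i, t in enumerate(repetition_times(low, duration))
            if t in high_index]

window = Fraction(1, 190)  # four repetitions of a 760-per-second note

# Octave (ratio 2): 760 and 1520 repetitions per second.
print(coinciding_repetitions(760, 1520, window))
# [(0, 0), (1, 2), (2, 4), (3, 6), (4, 8)]

# Perfect fifth (ratio 3/2): 760 and 1140 repetitions per second.
print(coinciding_repetitions(760, 1140, window))
# [(0, 0), (2, 3), (4, 6)]
```

Each pair gives the repetition number of the lower note followed by the repetition number of the higher note at a shared instant.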

A similarly nice effect is encountered if we play the perfect fourth: notes with a ratio of 4:3. In fact, the perfect fourth and perfect fifth are multiplicative inverses of each other once we identify notes that differ by octaves, because (3/2)(4/3) = 2, and doubling the frequency results in an equivalent note.

Now, if we start with a frequency and keep multiplying it by 3/2, we're never going to get back exactly to an equivalent frequency. That's because $$(3/2)^n = 2^m$$ has no solutions in positive integers: it would force $$3^n = 2^{m+n}$$, where the left side is odd and the right side is even. However, it turns out that $$(3/2)^{12} \approx 2^7$$ (about 129.7 versus 128), and the two are so close as to be practically indiscernible to the human ear. This means that going up by a perfect fifth twelve times lands one back at almost the same note, seven octaves higher.
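The size of this near-coincidence can be checked with exact rational arithmetic. The following is a small sketch; the variable names are illustrative, and the loop over the first 49 exponents is only a finite sanity check, not a proof.

```python
from fractions import Fraction

twelve_fifths = Fraction(3, 2) ** 12  # 531441/4096, about 129.75
seven_octaves = Fraction(2) ** 7      # 128

# Ratio between twelve stacked fifths and seven octaves.
mismatch = twelve_fifths / seven_octaves
print(mismatch)         # 531441/524288
print(float(mismatch))  # roughly 1.0136, i.e. about 1.4 percent too high

# (3/2)^n always has denominator 2^n in lowest terms, so it is never an
# integer, and in particular never a power of 2.
print(all((Fraction(3, 2) ** n).denominator == 2 ** n for n in range(1, 50)))
```

The last line prints `True`, matching the parity argument: the numerator of $$(3/2)^n$$ is odd, so no factor of 2 ever cancels.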

Back in the language of logarithms, this translates to $$\log_2(3/2) \approx 7/12$$. Since 7 and 12 are relatively prime, the multiples of 7/12, viewed mod 1, cover all multiples of 1/12. Thus, when we look at multiples of $$\log_2(3/2)$$ mod 1, we approximately cover all multiples of 1/12. The perfect fifth is thus, up to a reasonable approximation, a generator of a cyclic group of order 12 that divides the octave into twelve equal parts.
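Both halves of this statement can be illustrated numerically: the exact fact that 7 generates the integers mod 12, and the approximate fact that stacked pure fifths stay close to multiples of 1/12. This is a sketch under the assumption that "close" means "rounds to the expected twelfth"; the variable names are hypothetical.

```python
import math

FIFTH = math.log2(3 / 2)  # about 0.585, slightly more than 7/12

# Exact arithmetic in Z/12: multiples of 7 reach every residue because gcd(7, 12) = 1.
exact_steps = [(7 * k) % 12 for k in range(12)]
print(exact_steps)  # [0, 7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5]

# Stacked pure fifths, reduced mod 1 and snapped to the nearest multiple of 1/12.
snapped_steps = []
worst_error = 0.0
for k in range(12):
    position = (k * FIFTH) % 1.0
    twelfths = position * 12
    nearest = round(twelfths)
    snapped_steps.append(nearest % 12)
    worst_error = max(worst_error, abs(twelfths - nearest) / 12)

print(snapped_steps == exact_steps)  # True
print(worst_error)                   # largest drift, as a fraction of an octave
```

The drift grows with each stacked fifth, so the largest error occurs at the eleventh fifth and stays below 0.02 of an octave.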

Harmonic versus equal temperament
Since there's a slight mismatch between $$\log_2 (3/2)$$ and 7/12, we have two choices when calibrating a musical instrument: either make adjacent notes have a frequency ratio of precisely $$2^{1/12}$$, or ensure that, as far as possible, we get precise perfect fifths. Just intonation, which was followed in ancient times, chose the latter, making a few minor adjustments. This is the harmonic scale approach: there is a harmonic relation between the frequencies of the notes (for a perfect fifth, half of one frequency is one-third of the other). Most modern instruments, however, are tuned to equal temperament: the frequency ratio between adjacent notes is always chosen as $$2^{1/12}$$. This is the geometric scale approach: the frequencies of the notes increase in geometric progression.
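The gap between the two choices can be measured for a single fifth. The sketch below uses cents, a standard logarithmic unit in which an octave is 1200 cents and an equal-tempered step is 100 cents; the unit and the helper name `cents` are introduced here for illustration and do not appear in the discussion above.

```python
import math

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

semitone = 2 ** (1 / 12)        # ratio between adjacent equal-tempered notes
tempered_fifth = semitone ** 7  # seven steps up, i.e. 2^(7/12)
pure_fifth = 3 / 2

print(tempered_fifth)                             # roughly 1.4983, just under 3/2
print(cents(pure_fifth) - cents(tempered_fifth))  # roughly 1.955 cents
```

The equal-tempered fifth is flat by about two hundredths of a step, which is the per-fifth share of the mismatch between twelve fifths and seven octaves.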
