Cover Story

Between Speech and Song

APS Fellow Diana Deutsch

When APS Fellow Diana Deutsch was alone in a dark recording studio, fine-tuning some spoken commentary she recorded for a CD, she was stunned to hear the voice of a strange woman singing. After looking around to make sure she was still alone, she realized that the singing voice was her own. She had been looping the phrase “sometimes behave so strangely” over and over to find sounds that needed to be tweaked. Now the words themselves were behaving strangely:

They had morphed into song.

After getting over her shock, Deutsch was thrilled: “I realized that I had stumbled on a strange perceptual phenomenon, [one] which must reflect a close relationship between speech and song.” She began playing the looped phrase to anyone who cared to listen. She even tried it out on a room filled with 250 of her undergraduate students at the University of California, San Diego. After a few repetitions of the phrase, she raised her hands as if she were a conductor directing a choir. “On the downbeat, the class immediately sang along in chorus—and remarkably in tune!” says Deutsch.
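Deutsch's discovery hinged on simple mechanical repetition: looping the same snippet of recorded speech many times in a row. For readers who want to try a similar demonstration, here is a minimal sketch in Python that loops an audio clip by concatenating its samples into one WAV file. A synthetic decaying tone stands in for a real recording, and the file name `looped_phrase.wav` is purely illustrative; to reproduce the illusion you would load an actual spoken phrase instead.

```python
import math
import struct
import wave

RATE = 44100  # samples per second (CD-quality mono)

def make_placeholder_phrase(duration_s=1.0, freq=220.0):
    # Stand-in for a recorded spoken phrase: a short decaying tone.
    # (The real illusion requires actual speech; this just exercises the loop.)
    n = int(RATE * duration_s)
    samples = []
    for i in range(n):
        t = i / RATE
        amp = math.exp(-3.0 * t)  # decay so each repeat is audibly separate
        samples.append(amp * math.sin(2.0 * math.pi * freq * t))
    return samples

def write_looped_wav(path, samples, repeats=10):
    # Pack float samples (range -1..1) as 16-bit PCM, then write
    # `repeats` back-to-back copies of the clip to a mono WAV file.
    one_pass = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
        for s in samples
    )
    with wave.open(path, "wb") as w:
        w.setnchannels(1)   # mono
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(RATE)
        w.writeframes(one_pass * repeats)

write_looped_wav("looped_phrase.wav", make_placeholder_phrase(), repeats=10)
```

Playing the resulting file in any audio player approximates the back-to-back repetition Deutsch used, though the perceptual shift toward song only emerges with genuine speech.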

This auditory “illusion” was potentially important. It contradicted an old idea that the brain analyzes words and music in completely separate areas: one in the left hemisphere and the other in the right hemisphere. In general, the idea that we have a rational left brain and a creative right brain is a myth, but how the brain circuitry for processing music and language works is still not clear. Neuroimaging now suggests that our systems for these functions overlap. “Some aspects of speech and music are processed by the same circuitry, up into and including the cortex,” says Deutsch.

Throughout history, human beings have been drawn to sounds “near the boundary between speech and song,” Deutsch says — including devotional chants, speeches, and even rap music. Poets and lyricists are our practical experts on the boundary, she adds, because they arrange words to enhance our perception of musical qualities.

The Evolutionary Path of Music and Language

One reason the boundary may be so fuzzy is that long ago, humans may not have interpreted sounds the way we do now. According to a long-standing theory, early humans produced and heard sounds that were neither language nor music: a "protolanguage." One observation consistent with this theory is that much of the world's population speaks tonal languages (for example, Mandarin), in which, unlike in English, pitch helps determine a word's meaning. Because everyday conversation requires them to detect more pitches, as well as finer changes in pitch, tonal-language speakers shift their focus between language and music more easily. Some violinists will even entertain Mandarin-speaking audiences by playing phrases that carry a "spoken" meaning.

Observing tonal-language speakers has led some scientists, including Deutsch, to adopt the protolanguage theory. "My very strong bet is that Western tonal music and English language both evolved from a protolanguage that had components of both music and speech. Then it divided. Music became more for emotion, rousing people to battle, and speech to convey information," she says.

Yet there are other arguments besides the protolanguage theory. Daniel Levitin, a psychological scientist at McGill University, argued in his popular 2008 book, The World in Six Songs, that music was the source of language and other complex behaviors. But three years later, he’s become an agnostic on that question. Today, he says, “I could imagine a scenario in which less intellectually capable hominids had some form of language that was entirely literal, nonmetaphorical, that would require a number of cognitive operations to make the leap from language to music.”

One empirical approach for addressing the evolution question is to examine how babies develop in their first months of life. Babies begin to hear while they are still in the womb, and after they are born they can recognize music they heard there. Infants can also perceive rhythm and pitch in words. The cooing baby talk parents use could actually be a form of "musical speech," says Laurel Trainor, a psychological scientist at McMaster University. Overall, infants "seem to get the messages in music earlier than meaning in language," Trainor says. Babies under a year old, she says, can distinguish sounds, in both speech and music, that adults no longer hear because those distinctions aren't necessary within their culture.

The December 2011 APS Observer

Musicians Are Better Listeners

As humans develop, they gain the ability to distinguish music from language, but as Deutsch observed with her auditory illusion, a connection between the two remains in adulthood. This shared neural processing may be the reason that musical training can improve verbal skills.

In a study published this year in Psychological Science, a team of Canadian scientists found that only 20 days of classroom instruction about music boosted the vocabulary of preschoolers. The team, led by Sylvain Moreno (who is now at the Rotman Research Institute in Toronto), divided 48 four- to six-year-olds into two groups. One group watched cartoon characters who taught them about rhythm, pitch, melody, voice, and other basic musical concepts. The control group received lessons about art (such as shape and color) from the same characters.

The team scanned the children's brains while the children did a simple task, and then tested their verbal skills. After the training, 90 percent of the children in the music group did better on the verbal tests and showed corresponding brain changes, while the kids who learned about art showed no differences before and after training. Similar results have been found with 8-year-olds who took music lessons for six months, compared with peers who studied painting instead.

The most dramatic improvements in verbal-processing skills, however, come from playing a musical instrument, says Nina Kraus of Northwestern University. Her research has shown that musicians — even amateurs who practice consistently — have better connections between their auditory-processing skills and verbal-processing skills, such as the ability to distinguish meaningful sounds from noise.

“I have been absolutely flabbergasted by … how much musical experience affects language processing,” Kraus says. She’s both heard — and seen — those effects. When people listen to music, she says, their brain waves on a brain scan resemble sound waves.

“You can take the brain wave and play it on a speaker,” she says. “It’ll actually sound like the original sound—not exactly, but enough so it’s recognizable.” Using this approach, she has found that a pianist’s brain waves closely match the waves of piano music. Turning to language, she has found that the brain waves of tonal-language speakers tend to match the pitch of the music they are listening to better than the waves of non-tonal-language speakers.

A Syntactic Connection

With all this evidence supporting shared circuitry for music and language, another important question comes to mind: What specific functions do these circuits have?

“When we process music, we’re using brain networks that have other ‘day jobs,’ ” says Aniruddh Patel, a senior fellow at the Neurosciences Institute in San Diego. One such network, he proposes, recognizes syntax — the rules governing coherent sequences of information, such as words or notes.

To test his idea, Patel, along with colleagues Robert Slevc and Jason Rosenberg, had students at the University of California, San Diego, read sentences in segments on a computer screen. The students pressed a button every time they moved to the next segment. Half of the sentences had a tricky syntax, such as “After the trial, the attorney advised the defendant was likely to commit more crimes.” At first glance, it’s easy to think that the attorney was speaking to the defendant rather than about the defendant to someone else. If the sentence had standard syntax, it would be: “After the trial, the attorney advised that the defendant was likely to commit more crimes.”

While reading each sentence, the students heard a chord sequence. Some of these sequences contained an out-of-key chord, which created a more complex musical syntax, at the same point in time that the sentence became confusing. (For example, the out-of-key chord would sound at the word “was” in the sentence about the attorney.)

When the students heard an out-of-key chord, they took longer to press the button to move forward in the sentence. Patel argues that the circuitry the students were using to parse the confusing sentence was distracted by the off-key chord. Patel and his colleagues tried variations that did not violate normal syntax but were unexpected in another way, such as an odd word in a sentence instead of odd grammar. They also tried switching from piano to a pipe organ rather than going off-key. In those circumstances, students had less trouble. No one knows why, Patel says. Overall, the results, along with evidence from recent studies in other labs, support the idea that this one shared-circuitry system processes syntax for both music and language.

Exploring the Boundary

Deutsch continues to investigate the connection between music and language in the brain. Recently, she tried a different approach: She asked, how are words and music distinguished from each other?

In a study published this year by the Journal of the Acoustical Society of America, Deutsch and her colleagues argued that when we listen to speech, our perception of pitch (which is essential to music) is reduced. In the study, participants listened to a spoken phrase ten times in a row, and then they were instructed to say the phrase. Another group was told to say the phrase after hearing it only once. The people who listened to the repeated phrase sang (instead of speaking) the phrase when their turn came, and they reproduced the pitch of the spoken phrase better than the individuals who heard it only once. So repetition may reverse the reduction in pitch perception that normally occurs when we listen to speech.

Deutsch is also still studying the auditory illusion behind the phrase “sometimes behave so strangely,” but the reasons the phrase slips so reliably across the boundary from speech to music remain mysterious. She hopes to find more such phrases. One interesting thing she has found during her search is that the illusion is enhanced by many hours of listening to repeated phrases. After working too long in her lab, Deutsch has had moments when her own speech slipped into song right away — “even while I was speaking,” she says. “It was getting a little scary.”


Has it been studied how pitch and singing intonation are sometimes meaningful in spoken English? An example: if they want to end an interaction, English speakers will sometimes interrupt with a ‘sung’ goodbye. This form of goodbye is a three-note song, with the last syllable getting two notes slurred together. It carries more meaning than the plain word; it adds finality, and sometimes it says, imperatively, something close to ‘go away!’

I find it interesting that this linguistic element violates the usual separation between singing and speech in English.
