The Secret Sauce in Language and Music

WIRED reboot

Sep 07, 2022

There is a simple secret underlying how it is that apes like us came to have language and music. And although it's long been a mystery to us, it would be obvious to a fish on a terrestrial expedition to observe us: "One is immediately struck by how much of the naked ape's day is spent producing and absorbing the sights and sounds of terrestrial habitats," it would say. "They communicate via drawings of opaque objects and via the sounds of solid-object events. And for the purposes of what appears to be entertainment they create and listen to the sounds of people swimming -- ahem, I mean walking -- about."

Because fish emanate from an environment radically different from our own, they are brilliant at this sort of thing: pointing out what we're too immersed in to see. When the fish refers to communication we do "via drawings of opaque objects", it is talking about our writing.

There are surface differences between written words and simple object drawings, and so we fail to notice (but fish do not) that there are deep similarities. As I have found in my research and discuss in my book The Vision Revolution, contours in human writing systems tend to combine together in the ways that they do in nature -- think Ls, Ts and Ys -- namely as they do in environments with opaque objects.

And what did that fish mean by suggesting we spend our days mimicking the sounds of events among solid objects? Turns out, we do this every time we speak. The fundamental auditory constituents of solid-object events are hits, slides and rings (periodic vibrations of the objects involved in hits or slides), and we find the same three "atoms" in speech: plosives, which sound like hits (t, d, p, etc); fricatives, which sound like slides (f, v, sh, etc); and sonorants, which sound like rings (a, u, w, r, y, etc). I show in my latest book, Harnessed: How Language and Music Mimicked Nature and Transformed Ape to Man, that solid-object events have an identifiable "grammar" of sorts, one that also appears in the patterns across human speech from diverse languages.

Writing and speech, our two main media for linguistic communication, have the shapes of fundamental categories in nature: opaque objects and solid-object events, respectively. And writing and speech have these shapes, I suggest, because that's the best way to harness an ape brain that never evolved by natural selection for reading or comprehending speech. By shaping writing to look like natural objects (most of which are opaque), our illiterate visual-object-recognition system gets transformed into a reading machine. And by structuring speech to sound like natural events (most of which involve solid objects), our event-recognition system becomes a speech-recognition system.

Our brain's tight fit to writing and speech is not because we evolved by natural selection to read or comprehend speech, but, rather, because the structure of writing and speech culturally evolved to fit our brain, by looking and sounding like nature, which is just what our brains can brilliantly process. I call this nature-harnessing -- that's the secret sauce.

And what sort of entertainment was the fish referring to when it mentioned the sounds of people walking about? The fish was describing music. The fundamental auditory signature of humans moving about and doing stuff -- more than 40 characteristics concerning the rhythm and tempo of our gait, the Doppler shifts of our directed movement, and the loudness due to our proximity -- is also found in music.

Originally published in the August 2011 issue of WIRED.

\_ooFWIRED -- hosted by Mark Changizi
