Michael F Dorman & Blake S Wilson. American Scientist. Volume 92, Issue 5. Sep/Oct 2004.
Ludwig van Beethoven was 28 years old when he first noticed a ringing and buzzing in his ears. Soon he was unable to hear high notes from the orchestra; speech became indistinct. By 1802, four years after the first symptoms, he was profoundly deaf.
Beethoven fell into a deep depression. He describes this period in his Heiligenstadt Testament, meant to be read after his death:
For me there can be no relaxation in human society; no refined conversations, no mutual confidences. I must live quite alone and may creep into society only as often as sheer necessity demands … Such experiences almost made me despair, and I was on the point of putting an end to my life; the only thing that held me back was my art … thus I have dragged on this miserable existence.
In 2001, Scott N. was 34 and had lost all of his hearing. A surgeon inserted 16 tiny electrodes into his inner ear, or cochlea, and connected them to a small package of electronics implanted under the skin. A year later, Scott came to author Dorman’s laboratory at Arizona State University to test his understanding of speech. The results were extraordinary: Scott recognized 100 percent of more than 1,400 words, either in sentences or alone, without any prior knowledge of the test items.
As impressive as this performance was, the cochlear implant did not restore normal hearing to Scott. The electrode array produced a stimulus that was only a crude mimicry of the signals in a normal cochlea. But as this example shows, a very high level of functionality can be restored by a neural prosthesis that does not recreate the normal system. For the thousands of people who have received a cochlear implant, even an imperfect restoration of hearing reconnects them to the world of sound. And it allows many of them to use that most critical toy of modern life, the cell phone.
Although cochlear implants have a 40-year history culminating in the current generation of high-performance devices, hearing restoration is not universally welcomed. Among members of the Deaf community, the absence of hearing is not necessarily viewed as a disability. Some deaf parents refuse implants for their deaf children, triggering an impassioned debate between those who agree and those who challenge the decision. This article avoids that controversy to focus on the science of cochlear implants. But recent findings have influenced the temperature, if not the substance, of the debate. As we point out, hearing must be restored at a very early age if speech and language skills are to develop at a normal rate. The decision to use or forgo the implant cannot wait until the child, who must bear the consequences, reaches the age of consent.
In normal hearing, sound waves traveling through air reach the tympanic membrane (ear drum) via the ear canal, causing vibrations that move the three small bones of the middle ear. This action produces a piston-like movement of the oval window, a flexible membrane in the bony shell of the cochlea. Inside the fluid-filled cochlea, oscillations from the oval window initiate a traveling wave along the basilar membrane (one that divides the cochlea along its length). Another flexible membrane, the round window, moves in a complementary way to maintain the volume of the incompressible fluid in the cochlea.
The basilar membrane has graded mechanical properties. At the base of the cochlea, near the oval and round windows, it is narrow and stiff. At the other end of the cochlea, the apex, the basilar membrane is wide and flexible. These mechanical properties give rise to a traveling wave of displacement and to points of maximal response according to the frequency or frequencies of the pressure oscillations. For a wave with a single frequency, displacement increases up to a particular point along the membrane and then drops precipitously. High frequencies produce maxima near the base of the cochlea, whereas low frequencies produce maxima near the apex.
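The base-to-apex frequency gradient described above can be put into rough numbers. The sketch below uses Greenwood's place-frequency approximation for the human cochlea. That formula and its constants come from the psychoacoustics literature rather than from this article, so treat them as an outside assumption. The function name is purely illustrative.

```python
# Greenwood's place-frequency approximation for the human cochlea.
# Assumption: the constants below are Greenwood's published human fit;
# they do not appear in this article.
GREENWOOD_A = 165.4      # scaling constant, in hertz
GREENWOOD_ALPHA = 2.1    # slope, for position given as a fraction of length
GREENWOOD_K = 0.88       # correction that sets the low-frequency limit


def place_to_frequency(fraction_from_apex):
    """Approximate best frequency (Hz) at a point on the basilar membrane.

    fraction_from_apex is 0.0 at the apex and 1.0 at the base.
    """
    if not 0.0 <= fraction_from_apex <= 1.0:
        raise ValueError("position must lie between 0 (apex) and 1 (base)")
    return GREENWOOD_A * (10.0 ** (GREENWOOD_ALPHA * fraction_from_apex) - GREENWOOD_K)


if __name__ == "__main__":
    for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"{fraction:4.2f} of the way from apex: "
              f"{place_to_frequency(fraction):8.0f} Hz")
```

Under this approximation the map runs from roughly 20 hertz at the apex to roughly 20,000 hertz at the base, and it is close to logarithmic over most of its length.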
Movements of the basilar membrane are sensed by a line of hair cells, which are attached to the top of the membrane in a matrix called the organ of Corti. Each hair cell has fine rods of protein, called stereocilia, emerging from one end. When the basilar membrane moves, these rods bend as if they were hinged at their bases. The deflection initiates a chain of electrochemical events that causes electrical spikes, or action potentials, in cells of the spiral ganglion. These cells conduct the signal to a relay station in the brainstem called the cochlear nucleus. The information ascends through multiple other nuclei on its way to the auditory cortex, the portion of the forebrain that processes auditory information.
Within this circuit of cells, a sound’s frequency is encoded by two mechanisms. The first is a place code, which indicates the spot along the tapered basilar membrane that moves the most. Stereocilia on the hair cells respond to this displacement and cause action potentials among the closest spiral-ganglion neurons. The second mechanism is a temporal code that is produced when neurons become synchronized, or phase-locked, to the period of an acoustic wave. In normal hearing, neural responses can easily match frequencies up to about 1,000 hertz. This phase-locking ability declines progressively at higher frequencies. The perception of frequency is probably based on some combination of place and temporal codes, with the temporal code being effective for low frequencies only.
Hearing is lost when hair cells become so damaged that they cannot stimulate cells of the spiral ganglion. Without regular activity, the portion of that ganglion cell that receives signals, the dendrite, may atrophy and the cells may die. Fortunately, even in the case of complete hearing loss, some spiral-ganglion cells survive and remain connected to the appropriate frequency-receiving areas in the cochlear nucleus. If electrical current from the implanted electrodes can cause action potentials among the remaining cells, then hearing can be restored. And if multiple groups of neurons (think of these as “neural channels”) can be made to respond in low, middle and high frequency parts of the cochlea, then perception of speech can be restored as well.
Slicing the Spectrum
How many channels, these slices of the frequency spectrum, are necessary to restore speech understanding? Following the early work of Robert Shannon and others at the House Ear Institute in Los Angeles, one of us (Dorman) answered this question with collaborator Philip Loizou of the University of Texas at Dallas. We used so-called bandpass filters to divide the spectrum of speech into a relatively small number of frequency bands or channels. A microprocessor measured the energy in each band every few milliseconds and transformed the signal into amplitude-modulated sine waves, each centered on one of the frequency bands. When we played these simplified audio signals to normal-hearing listeners, they understood 90 percent of the words in simple sentences, even when we used as few as four channels. Eight channels allowed them to identify 90 percent of isolated words. Under noisy conditions, more channels were needed to match this performance, and the more channels used, the better the comprehension. These observations show that in a quiet environment, speech can be well understood with a relatively small number of channels, a fact that is central to the success of cochlear implants.
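The channel-reduction procedure described above can be sketched in a few lines. This is a minimal illustration and not the software used in the experiments. It measures band energy with a short discrete Fourier transform on each frame instead of with true bandpass filters. The band edges (300 to 5,000 hertz, log-spaced), the 8-millisecond frame and all function names are assumptions chosen for the example.

```python
import cmath
import math


def band_edges(low_hz, high_hz, n_bands):
    """Edges of n_bands bands spaced equally on a logarithmic frequency scale."""
    ratio = (high_hz / low_hz) ** (1.0 / n_bands)
    return [low_hz * ratio ** i for i in range(n_bands + 1)]


def frame_band_amplitudes(frame, sample_rate, edges):
    """One amplitude per band for a short frame, measured with a naive DFT.

    Stand-in for the bandpass filters in the text (an assumption of this sketch).
    A sine of amplitude A that falls in one band yields roughly A for that band.
    """
    n = len(frame)
    energies = [0.0] * (len(edges) - 1)
    for k in range(1, n // 2):
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        for b in range(len(energies)):
            if edges[b] <= freq < edges[b + 1]:
                energies[b] += abs(coeff) ** 2
                break
    return [2.0 * math.sqrt(e) / n for e in energies]


def sine_vocoder(signal, sample_rate, n_bands=4, frame_ms=8.0,
                 low_hz=300.0, high_hz=5000.0):
    """Replace a signal with n_bands amplitude-modulated sine waves."""
    edges = band_edges(low_hz, high_hz, n_bands)
    # Each carrier sits at the geometric center of its band.
    centers = [math.sqrt(edges[i] * edges[i + 1]) for i in range(n_bands)]
    frame_len = int(sample_rate * frame_ms / 1000.0)
    out = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        amps = frame_band_amplitudes(signal[start:start + frame_len],
                                     sample_rate, edges)
        for t in range(start, start + frame_len):
            out.append(sum(a * math.sin(2.0 * math.pi * fc * t / sample_rate)
                           for a, fc in zip(amps, centers)))
    return out, centers


if __name__ == "__main__":
    rate = 16000
    tone = [0.5 * math.sin(2.0 * math.pi * 1000.0 * t / rate) for t in range(1280)]
    simplified, carriers = sine_vocoder(tone, rate)
    print("carrier frequencies (Hz):", [round(c) for c in carriers])
```

Passing `n_bands=8` in place of the default gives the eight-channel condition mentioned above. The amplitudes here change abruptly at frame boundaries, so a version intended for listening would smooth them.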
The variability of normal speech also helps. As our brain decodes speech sounds, it uses cues, called formants, which identify consonants and vowels by specific concentrations of energy in the frequency spectrum. But formants do not have a fixed frequency, even for the same sound, because vocal-tract geometry varies from speaker to speaker. So instead of being discrete points, the acoustic signatures of speech sounds can be thought of as ellipses in frequency space. Even in infants, the system that perceives speech is designed to be flexible so that it can “hear through” variations in the signal to identify a consonant or vowel. This flexibility allows a very reduced description of speech to be recognized with accuracy.
Figure of Speech
The timing and frequency of consonants and vowels in a spoken word determine its acoustics. For example, slow changes in overall amplitude indicate the timing of syllables, phonetic transitions within syllables and boundaries between silence and sound. In terms of frequency, the vocal tract produces multiple concentrations of energy between 300 and 5,000 hertz as it produces speech sounds.
The slow amplitude variations of speech are referred to as the speech envelope, an aspect that conveys a surprising amount of information. Victor Zue at the Massachusetts Institute of Technology classified the envelope shapes of 126,000 words by applying a series of only six shape variations. He found that, on average, only 2.4 word candidates matched a given sequence. This observation suggests that implant patients could understand speech much better if their implants conveyed the shape of the envelope, thereby constraining the number of word possibilities. However, envelope shape by itself does not provide enough information to understand speech. To identify specific words, frequencies in the 300-5,000 hertz range must be extracted from the signal.
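The idea behind the lexicon study can be illustrated with a toy example: collapse each word to a sequence of a few coarse classes, then count how many words share each sequence. The classes, the spelling-based mapping and the eight-word lexicon below are hypothetical stand-ins invented for this sketch. They are not the six shapes or the 126,000-word lexicon of the actual study.

```python
from collections import defaultdict

# Hypothetical coarse classes, assigned crudely by spelling, for illustration only:
# V = vowel, S = stop, N = nasal, F = fricative, L = liquid or glide.
COARSE = {}
for letters, label in (("aeiou", "V"), ("pbtdkg", "S"), ("mn", "N"),
                       ("fvszh", "F"), ("lrwy", "L")):
    for ch in letters:
        COARSE[ch] = label


def coarse_code(word):
    """Collapse a word to its sequence of coarse classes ('?' if unknown)."""
    return "".join(COARSE.get(ch, "?") for ch in word)


def cohorts(lexicon):
    """Group words that share the same coarse-class sequence."""
    groups = defaultdict(list)
    for word in lexicon:
        groups[coarse_code(word)].append(word)
    return dict(groups)


def mean_cohort_size(lexicon):
    """Average number of word candidates per distinct sequence."""
    return len(lexicon) / len(cohorts(lexicon))


if __name__ == "__main__":
    toy_lexicon = ["bat", "pat", "mat", "sat", "lap", "map", "fan", "van"]
    for code, words in sorted(cohorts(toy_lexicon).items()):
        print(code, words)
    print("mean candidates per sequence:", mean_cohort_size(toy_lexicon))
```

In this toy lexicon the coarse sequence alone narrows each word to one or two candidates, which is the kind of constraint the envelope provides. Telling "bat" from "pat" still requires finer frequency information.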
In any vowel or consonant, the frequencies of the first two energy concentrations comprise the essential signature of the sound. For example, in Dorman’s voice the vowel in bat has energy peaks at 624 and 904 hertz. The vowel in bought has peaks at 620 and 1,055 hertz. Because a very small difference in the acoustic pattern (150 hertz in this case) can significantly alter the meaning of the word, investigators initially assumed that a neural prosthesis for hearing would need a very large number of channels. As we have seen, this did not turn out to be the case, at least for low-noise environments.
Hardware For Hearing
In a deafened ear, hair-cell failure severs the connection between the peripheral and central auditory systems. Cochlear implants restore the link, bypassing hair cells to stimulate directly the cell bodies in the spiral ganglion.
A cochlear implant has five main components, only two of which are inside the body. Above the outer ear, an external microphone picks up sounds in the environment and directs them to a sound processor, which sits inside a case behind the ear. The processed signals are conveyed to a high-bandwidth radio-frequency transmitter, which beams the information through a few millimeters of skin to a receiver/stimulator that has been surgically implanted in the temporal bone above the ear. The signals then pass to an array of electrodes inside the cochlea. Target cells in the spiral ganglion are separated from the electrodes by a bony partition.
Scott N.’s device uses the continuous interleaved sampling, or CIS, strategy to convert acoustic signals into a code for stimulating the auditory nerve. One of us (Wilson), along with colleagues at the Research Triangle Institute and Duke University, developed the CIS strategy. It starts by filtering a signal into frequency bands (16 for Scott N.). For each band, the CIS algorithm converts the slow changes of the sound envelope into amplitude-modulated trains of biphasic (having negative and positive components) pulses at the electrodes. The processor sends information from low-frequency channels to electrodes in the apex and information from high-frequency channels to electrodes in the base of the cochlea. This organization maintains the logic of the frequency map in a normal cochlea.
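The interleaving at the heart of this strategy can be sketched as a pulse timetable. The code below is a simplified illustration and not the implementation in any clinical processor. It assumes the band envelopes have already been extracted and sampled once per stimulation cycle. It also omits the compression that a real device would apply to fit the patient's electrical dynamic range. The pulse rate, phase duration, electrode numbering (0 taken as the most apical) and all names are assumptions of the sketch.

```python
from collections import namedtuple

# One phase of a biphasic pulse: when it starts, where, and with what current.
Phase = namedtuple("Phase", "start_us electrode current")


def cis_schedule(envelopes, pulse_rate_hz, phase_us):
    """Build an interleaved timetable of biphasic pulses.

    envelopes[ch][i] is the envelope of channel ch in stimulation cycle i,
    with channel 0 the lowest frequency band. Channel ch drives electrode ch,
    and electrode 0 is assumed to be the most apical, so low frequencies go
    to the apex and high frequencies to the base.
    """
    n_channels = len(envelopes)
    cycle_us = 1e6 / pulse_rate_hz        # time between pulses on one electrode
    slot_us = cycle_us / n_channels       # each electrode gets its own time slot
    if 2 * phase_us > slot_us:
        raise ValueError("biphasic pulse does not fit in its time slot")
    schedule = []
    for cycle in range(len(envelopes[0])):
        for channel in range(n_channels):
            start = cycle * cycle_us + channel * slot_us
            level = envelopes[channel][cycle]
            # Negative phase followed by an equal positive phase, so the
            # net charge delivered by each pulse is zero.
            schedule.append(Phase(start, channel, -level))
            schedule.append(Phase(start + phase_us, channel, +level))
    return schedule


if __name__ == "__main__":
    demo_envelopes = [[0.2, 0.3], [0.8, 0.6], [0.1, 0.1], [0.0, 0.4]]
    for phase in cis_schedule(demo_envelopes, pulse_rate_hz=1000.0, phase_us=50.0):
        print(f"t={phase.start_us:7.1f} us  electrode {phase.electrode}  "
              f"current {phase.current:+.2f}")
```

Because each electrode fires only inside its own slot, no two electrodes are stimulated at the same instant. Keeping pulses apart in time is the purpose of the interleaving.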
Scott N.’s ability to understand speech demonstrates that an implant can restore a normal level of speech recognition in quiet environments. However, Scott’s case is exceptional. Speech is neither as clear nor as easy to understand for most patients. Although average scores range between 80 and 100 percent correct on tests of sentence understanding, the comprehension of isolated words lies between 45 and 55 percent. The gap between scores shows that average patients fail to hear the details of many spoken words. Sentence context allows the missing elements to be reconstructed.
What is the difference between Scott’s auditory system and that of a patient with average or below-average speech understanding? A patient’s performance probably depends on many factors, including the number and location of surviving cells in the spiral ganglion, the spatial pattern of current flow from the electrodes, and the degree to which neurons in the brainstem and cortex can encode frequency by phase-locking their firing patterns. When only a few cells survive in the spiral ganglion (for example, after a long period of deafness), the electrode stimulation is less able to convey frequency-specific information to the cochlear nucleus and cortex. And if the surviving cells are clustered at one end of the ganglion, then the signal that does arrive at the cortex will lack the range of frequencies needed to understand speech. Even if there are neurons along the length of the cochlea, individual electrode currents need to be highly focused to provide independent channels of stimulation (and therefore, information). If these currents overlap, either because the signal spreads too far through the conductive cochlear fluid or because of individual differences in cochlear anatomy, then the number of functional channels will be less than the number of electrodes.
In the Ears of a Child
Adults who lose their hearing and later receive a cochlear implant can associate the new stimulation patterns with their memories of what speech should sound like. Children born deaf do not have this luxury. Yet a team led by Richard Miyamoto and Mario Svirsky at Indiana University has found that congenitally deaf children who receive a cochlear implant during their first or second years can learn spoken language at a normal or near-normal rate. These children can enter first grade with age-appropriate language skills, a testament to the adaptive ability of young neural systems. This plasticity undoubtedly plays a major role in the success of implants at an early age.
Scientists can observe the neural changes in young children fitted with implants using the tools of neurophysiology. Author Dorman and his colleagues at Arizona State University, in collaboration with Anu Sharma and her team at the University of Texas at Dallas, found that the brains of deaf children under the age of four are quickly reconfigured in response to the signals from an implant. Using electrodes on the scalp, we were able to record sound-evoked electrical activity in the cortex. Within a week after the implant was activated, we saw changes in the latency of neural responses to sound. Within six months, children who had heard nothing for up to three and a half years showed age-appropriate timing of cortical activity in response to sound.
Children who receive the implant after their seventh birthday have less success than younger patients in developing speech and oral language. We saw corresponding evidence for this age limit in the cortical-latency experiments. After an initial change, the delay of cortical activity in response to sound remained abnormally long in older children, even after considerable experience with the implant.
Sadly, the same property that helps the implant work so well in preschoolers limits its effectiveness for older children. During the extravagant growth of neural connections during the first years of life, areas of the brain that lack stimulation can be usurped or recruited to process active signals that usually go to other parts of the brain. In this case, regions that would normally analyze auditory inputs might be appropriated by the spread of visual or other sensory connections as the child gets older. And once an area is allocated to a different task, returning to the original task is difficult or impossible, depending on age. This narrow window of opportunity has also been observed in animal experiments.
The different outcomes of implants in younger and older children reflect different patterns of neural organization in the children prior to implantation. Using positron-emission tomography (PET), Dong Soo Lee and his colleagues at the Seoul National University found extremely low activity in the auditory cortex and surrounding brain areas in children who were deaf for a relatively short period, which is what one would expect given that there was no auditory input. This group of children adapted well to cochlear implants.
However, in children deprived of sound for more than 7 years, PET scans before the implant surgery showed a more normal level of activity in the auditory cortex and language areas. Because this cortex was not activated by auditory input, it must have received input from some other sense, probably vision. It is reasonable to suppose that the encroachment of other functions into brain areas normally devoted to auditory processing is one reason that older children have a much more difficult time acquiring speech and oral language skills after receiving the cochlear implant. This biological reality adds an important codicil to the debate over cochlear implants for the deaf children of deaf parents. By the time a deaf child reaches the age at which he or she might elect to have an implant, it will be too late to achieve the best outcome.
An Elusive Pleasure
Scott and others like him who achieve high levels of word recognition report that speech sounds natural and clear through the implant. No patient (in our experience) has described music in this fashion.
This result points to a fundamental difference in the requirements for speech understanding and music appreciation. Implants do not need to reproduce the precise frequencies of speech to preserve meaning. But precision is absolutely essential for music. An octave, for example, cannot be stretched in the way that frequency components for speech can be stretched. If the A above A440 is to be heard as an octave higher, then an implant must convey a signal at 880 hertz. A small error yields a different note. Although we have had a small amount of success using octave intervals to tune signal processors for a few patients with extensive musical backgrounds, creating pleasant, or even tolerable, musical experiences for the majority of cochlear-implant patients remains an elusive goal.
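How small an error is enough to change the note? A short calculation gives a sense of scale. It assumes the equal-tempered scale, in which each semitone is a frequency ratio of the twelfth root of two. The article does not name a tuning system, so this is an assumption of the sketch, as are the function names.

```python
import math

A440_HZ = 440.0


def semitones_above(freq_hz, reference_hz=A440_HZ):
    """Musical distance from the reference, in equal-tempered semitones."""
    return 12.0 * math.log2(freq_hz / reference_hz)


def percent_error_to_semitones(percent):
    """How far a given percentage frequency error shifts the perceived pitch."""
    return 12.0 * math.log2(1.0 + percent / 100.0)


if __name__ == "__main__":
    print("880 Hz is", semitones_above(880.0), "semitones above A440")
    for percent in (1.0, 3.0, 6.0):
        print(f"a {percent:.0f}% frequency error shifts pitch by "
              f"{percent_error_to_semitones(percent):.2f} semitones")
```

On this scale, an error of about 6 percent moves the pitch a full semitone, which turns the intended A into the neighboring note.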
The Next Verse
One advance that we will see shortly is the union of electric and acoustic stimulation, or combined EAS. Many hearing-impaired people have some ability to hear low frequencies but retain little or no hearing at higher frequencies. If an electrode array can be inserted about two-thirds of the way into the cochlea, then hearing at 1 kilohertz and above can be restored by electrical stimulation. And if the surgery doesn’t damage the distal third of the cochlea, then electrical and acoustic hearing can together provide access to the range of frequencies necessary for speech understanding.
Christoph von Ilberg and his colleagues at the University Clinic at Frankfurt were the first to demonstrate the feasibility of this approach. Recent studies have shown that acoustic hearing can be preserved in 75 to 90 percent of patients in whom a 20 millimeter-long electrode array is inserted into the cochlea, which is normally 28 to 35 millimeters long. Experiments from author Wilson’s lab have shown that just a small region of acoustic hearing below 500 hertz greatly improves the performance of electrical hearing, even when acoustic speech comprehension is near zero. For example, one patient who understood only 10 percent of words via acoustic stimulation and 60 percent by electric stimulation recognized 90 percent with the combined stimulation.
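The figures quoted above show how a 20-millimeter array relates to the "about two-thirds" insertion described earlier. The arithmetic below treats insertion depth as a simple fraction of cochlear length. That is a simplification, since it ignores where the array actually sits within the cochlea. The names are illustrative.

```python
ARRAY_LENGTH_MM = 20.0
COCHLEA_LENGTHS_MM = (28.0, 35.0)   # the typical range quoted in the text


def insertion_fraction(array_mm, cochlea_mm):
    """Fraction of the cochlea's length spanned by the electrode array."""
    return array_mm / cochlea_mm


if __name__ == "__main__":
    for length in COCHLEA_LENGTHS_MM:
        fraction = insertion_fraction(ARRAY_LENGTH_MM, length)
        print(f"{length:.0f} mm cochlea: array spans {fraction:.0%}, "
              f"leaving {1.0 - fraction:.0%} at the apex untouched")
```

Across that range the array spans between roughly 57 and 71 percent of the cochlea, which brackets the two-thirds figure.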
We suspect that auditory nuclei in the brainstem, which sort signals from noise, recognize patterns of neural discharge that are unique to acoustic stimulation. The output from even a small region of normal hearing may engage these nuclei in a way that electrically evoked patterns cannot, thereby allowing more of the signal to reach higher levels of auditory processing. Thus the combination of electric and acoustic stimuli can have a synergistic effect on speech understanding, especially in noisy environments.
Combined EAS has produced some remarkable results for patients with residual hearing in the low frequencies, and patients with residual hearing up to 1,000 hertz may one day become candidates for the procedure. The popularity of this approach as a treatment for severe, but not total, hearing loss will depend on how reliably the remaining hearing can be preserved. Such preservation might be improved with shorter electrode insertions or with pre-treatment of the cochlea with certain drugs. However, shorter arrays also reduce the performance of electric stimulation, leaving the patient with few options if the remaining hearing is lost. These trade-offs (electrode insertion depth versus preservation of unaided hearing, combined EAS performance versus the performance of electric stimulation alone) remain to be fully explored.
Better Hearing Through Chemistry
In the near future, drug-delivery systems will be integrated into the design of a cochlear implant. These systems will attempt to do two things: arrest the shriveling or demise of remaining hair cells and neural structures in the cochlea, and promote the growth of neural tentacles called neurites from spiral-ganglion cells toward the electrodes. If neurons in the vicinity of each electrode can be kept alive, and especially if they are brought closer to the electrodes with the growth of neurites, then each electrode is more likely to function as an independent channel of stimulation.
One approach is to inject growth-promoting neurotrophins into the cochlea. In experiments with deafened guinea pigs, Takayuki Shinohara and his coworkers at the Karolinska Institute in Stockholm showed that by injecting brain-derived neurotrophic factor and ciliary neurotrophic factor, they could increase the survival and, critically, the sensitivity of spiral-ganglion cells. This outcome hints at future implant designs in which neurites from spiral ganglion cells grow toward multipurpose electrodes that deliver electrical and pharmacological stimuli.
A second approach is to block apoptosis, the normal process of cell death following injury. These self-destruct messages can be triggered by many events, such as acoustic trauma or ototoxic drugs, which work through a so-called mitogen-activated protein kinase (MAPK) signaling pathway. The pathway can be blocked at various points. One of the links in this chain is the protein called c-Jun N-terminal kinase (JNK). This enzyme is the target of a peptide inhibitor developed by a multi-center, multi-national team that includes Jing Wang of the University of Montpellier and Thomas Van De Water of the University of Miami. By blocking JNK, they prevented hair-cell death and hearing loss following acoustic trauma or administration of the ototoxic antibiotic neomycin.
This outcome is especially relevant for future applications of combined electric and acoustic stimulation. Injecting a MAPK-JNK blocker could buffer existing hair cells from damage caused by the surgery. In that case, the odds of preserving acoustic hearing might increase, making combined EAS into a viable therapy for a very large number of hearing-impaired people.
Imagining Beethoven Today
We wonder how Beethoven might feel if he were alive today and had received a cochlear implant. We expect he would understand speech well enough to “relax in human society” and engage in “refined conversations” and “mutual confidences.” He would avoid the isolation that caused his despair. The sound of his art, however, would certainly fail to bring him joy. We will need many more years of hard work and good luck to make this time-travel story end with an idyllic, or, if you like, a pastoral tune.