The Indexical Inscription of the Acoustic
John Puterbaugh, 1994

Introduction

Invariant sound reproduction requires a particular form of representation
based on order and persistence over time, in other words memory. Metaphors
provide one approach towards understanding memory, allowing someone to
conceptualize one thing in terms of another. Metaphors establish heuristics
for problem solving by incorporating poetic, evocative, or structural
forms which operate through analogy, transference, and organization, respectively.
Models are one example of metaphors. Technological change has stimulated
models for memory ranging from the aviary, storehouse, the wax tablet,
hydraulics and clockwork, to more contemporary models centered around
the computer [see Rose 1992, Krell 1990, Carruthers 1993, Yates 1992].
Advancements in molecular biology and neuroscience, having established
a physiological basis for memory, offer new models for memory. These models
can be used to design technologies for sound reproduction.

The following pages are a preliminary investigation into memory and its
role in technologies used to reproduce sound. Three instruments are taken
as examples of sound reproducing devices: the carillon, phonograph, and
compact disc player. Terminology, description, and classification based
on these instruments will be defined. The ear and human memory will be
given as alternatives to these instruments, providing another paradigm
for the recording and reproduction of sound.

Instruments

The carillon contains a musical cylinder with metal notepins (elevations), claviers (levers), hammers, and bells. As the cylinder rotates, a notepin trips a lever, which raises and releases a hammer, causing it to strike a bell. Mechanical instruments such as music boxes and player pianos are based on this simple procedure used by the carillon to reproduce music: an inscription, carried by a medium, triggers an action on a sound source. Specifically, elevations and impressions are retained by a cylinder, circular disk, or roll of paper, and cause the instrument to strike, pluck, bow, or blow a sound-producing device (idiophone, chordophone, aerophone). In most mechanical instruments the recording process involves a human agent who transcribes a piece of music and then inscribes it onto a medium that is readable by the sound-reproducing instrument. The sound generation itself is never actually a reproduction because the instrument plays the piece of music through its bells, strings, pipes, etc. The ability of these instruments to reproduce music involves the use of explicit memory stored by an inscription within a medium, external to the sound generation.

Using a phonograph to reproduce sound involves recording through microphones
(electroacoustic transducers). A record retains quasi-permanent impressions
of acoustic vibrations through vertical or lateral grooves embossed on
its surface. When played back, the stylus acts as a vehicle that responds
to the recorded inscriptions. These inscriptions are engraved sequentially
beginning at the perimeter and wind their way towards the center of the
disk. Early wax cylinders were engraved directly by the stylus without
the process of electrical transduction. In either case, the process of
recording was no longer done directly by a human agent. A record's grooves
are continuously varying inscriptions that are analogs of the sound's
changing amplitude. This is in contrast to the discrete nature of most
mechanical inscriptions. Phonographs use loudspeakers to reproduce recorded
sound.

A compact disc, or CD, uses a laser beam instead of a stylus for reading
the stored representation of sound. In the case of the CD the representation
is digital: a dynamic property of the sound, its amplitude, is measured at
discrete points in time. The amplitude is quantized and stored as a sequential
pattern of bits, encoded as small pits pressed into the disc. The surface
of the CD is transparent, which allows the laser to pick up changes in
reflectivity caused by the pits. These patterns of reflectivity are translated
into an electrical signal which can be converted into an analog signal.
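The measurement and quantization just described can be sketched in a few lines. This is a minimal illustration and not the CD encoding itself: the function names, the 1 kHz test tone, and the one-millisecond duration are assumptions made for the example, while the 44.1 kHz rate and 16-bit word length are the standard CD audio values, which the text does not state. A real disc also adds error correction and channel modulation before any pits are made.

```python
import math

SAMPLE_RATE = 44_100   # samples per second (standard CD audio rate; not stated in the text)
BITS = 16              # bits per sample (standard CD audio word length)

def sample(signal, duration_s, rate=SAMPLE_RATE):
    """Measure a continuous signal (a function of time) at discrete points in time."""
    n = int(duration_s * rate)
    return [signal(i / rate) for i in range(n)]

def quantize(amplitude, bits=BITS):
    """Map an amplitude in [-1.0, 1.0] to a signed integer of the given width."""
    max_code = 2 ** (bits - 1) - 1
    clipped = max(-1.0, min(1.0, amplitude))
    return round(clipped * max_code)

def to_bits(code, bits=BITS):
    """Two's-complement bit pattern of a quantized sample."""
    return format(code & (2 ** bits - 1), f"0{bits}b")

def tone(t):
    """Hypothetical test signal: a 1 kHz sine at half of full scale."""
    return 0.5 * math.sin(2 * math.pi * 1000 * t)

samples = sample(tone, 0.001)                      # one millisecond of sound
codes = [quantize(a) for a in samples]             # discrete amplitude values
inscription = "".join(to_bits(c) for c in codes)   # sequential pattern of bits
```

Each sample here is an index in the essay's sense: a value is stored only because that amplitude was present at that moment, not because of any quality the sound possesses.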
The inscription itself is still sequential, like the mechanical instruments
and phonographs, but is generated by sampling the sound, a time domain
process.

Definitions

Reproducing an event such as music involves recording. Viewed as a process,
recording is transcribing, a transformation - the transduction and inscription
of sound. Each of the instruments uses some form of transduction (mechanoacoustic
or electroacoustic) and inscription (discrete, analog, digital). Viewed
as a thing, recording is a materialization; Thomas Levin calls it a
"reification which transforms an acoustic-temporal event into a trace"
[Levin 1990]. Recording, as a material trace of some sound event, is based
on an actual causal relation between the sound and its representation.

As mentioned above, recording is a transcriptive process. Transcription
means to make a copy and requires some form of transduction. Transduction
is the process of converting one form of energy (an input) into another
(an output), which often involves some change in the signal between the two
forms. Since copying involves changing one form of energy into another,
transcription necessarily involves transduction. In terms of sound reproduction,
transducers act as sound generators and receivers [Pohlmann 1989]. In
the case of the carillon, mechanical energy from the hammers and levers
is transduced into acoustic energy through the bell. Microphones and loudspeakers
convert between electrical and acoustical energy. There are many types
of transducers that can be used for sound reproduction: electromagnetic,
electrostatic, piezoelectric, dynamic, magnetic, and carbon [Parker 1988].
Retaining a transcription involves inscription. An inscription is the
registering of quantities on or within some medium. It contains and retains
the information required for invariant reproduction. An inscription that
denotes its object, sound or music, through some causal correlation or
physical connection will be termed an indexical representation. The carillon,
phonograph and CD use inscriptions that are indexical as a means for representing
sound. An indexical inscription is the simplest form of memory that can
be used for invariant reproduction.

An indexical inscription such as the grooves on a phonograph record contains
many individual indices. Indices operate through contiguity and not resemblance.
In other words, a particular sound intensity is encoded because it is
present at the time of recording. It is not due to some intrinsic character
or quality it possesses that can be discriminated. This means that an
index has no significant resemblance to the object it represents [Peirce].
If the basis of representation were resemblance, the inscription
would be characterized as iconic.

The Ear

The ear, consisting of outer, middle, and inner parts, acts as a very
specialized transducer converting mechanical vibrations into neural firing
patterns. The outer ear filters incoming sound in two ways - through the
pinna (folds and flaps on the exterior ear) and through the ear canal
(a hollow chamber). The folds of the pinna reflect the sound, adding many
short time delays, functioning similarly to a comb filter [Rodgers 1981].
This filtering boosts and attenuates various frequencies, an important
role in the front-to-back and vertical localization of sound in the environment.
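The comb-filter behaviour attributed to the pinna can be illustrated with a single reflection. The following is a minimal sketch, assuming one delayed copy of the sound added to the direct sound; the delay of 4 samples, the gain of 0.8, and the 44.1 kHz rate are illustrative values, not measurements of a real ear, which produces many reflections at once.

```python
import cmath
import math

def comb_filter(x, delay, gain):
    """Feedforward comb: y[n] = x[n] + gain * x[n - delay]."""
    return [x[n] + (gain * x[n - delay] if n >= delay else 0.0)
            for n in range(len(x))]

def magnitude(freq_hz, delay, gain, rate):
    """Magnitude response |1 + gain * e^(-j*w*delay)| at a given frequency."""
    w = 2 * math.pi * freq_hz / rate
    return abs(1 + gain * cmath.exp(-1j * w * delay))

RATE = 44_100   # samples per second (assumed)
DELAY = 4       # samples, about 91 microseconds (assumed)
GAIN = 0.8      # strength of the reflection (assumed)

first_notch = RATE / (2 * DELAY)   # frequency most strongly attenuated
first_peak = RATE / DELAY          # frequency most strongly boosted

# An impulse comes out as the direct sound followed by its reflection.
echo = comb_filter([1, 0, 0, 0, 0, 0], DELAY, GAIN)
```

With these values `magnitude(first_notch, DELAY, GAIN, RATE)` is close to 0.2 and `magnitude(first_peak, DELAY, GAIN, RATE)` is close to 1.8. Changing the delay moves the notches and peaks, which is how a direction-dependent reflection path can colour the spectrum.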
The ear canal is a resonator for frequencies between 2000 and 5000 Hz
with a peak at 3000 Hz, a range containing most of the acoustic energy produced
by music and speech [Handel 1993]. The middle ear connects the ear canal
with the fluid-filled cochlea. Fluids have a higher acoustic impedance
than air (meaning that vibrations in the air do not readily cause the fluid
to vibrate), which the middle ear compensates for by increasing the pressure between
the ear drum (at the end of the ear canal) and the oval window (at the
beginning of the cochlea) through a mechanical linkage made up of the
hammer, anvil, and stirrup [Kelly 1992]. These vibrations, resulting from
the mechanisms in the middle ear, produce a complex spatio-temporal pattern
of displacements along the basilar and tectorial membranes (located in the
cochlea) [Yang 1992]. The displacements along the basilar membrane cause
the cilia (hairlike filaments on the basilar membrane) to bend. The cilia
are connected to inner hair cells. The cilia together with the inner hair cells
provide the mechanical-to-neural transduction by connecting the basilar
membrane to the auditory nerve. The inner ear has been described as a
frequency analyzer in terms of Fourier analysis [see Kelly 1992] but it
is probably more suitable to consider it as a cascade of low-pass filter
sections [see Kates 1993]. Joe Schoome confirms this stance by stating
that frequency analysis is achieved in the cochlea by a bank of low-pass
frequency filters based on the mechanical gradients of the basilar membrane
[Bats]. In summary, the outer ear gathers and filters the sound; the middle
ear transduces air vibrations into fluid vibrations; the inner ear divides
the sound into frequency components which are transduced in parallel into
neural firing patterns.

As shown, the ear's transduction process alters the content of the original
sound significantly. Only information based on features of the modified
sound is transmitted. Further modifications, other than the physical apparatus
of the ear described above, are based on a listener's past experiences.
Incoming sensory signals are routed in parallel to many processing areas
which are reciprocally connected. These areas send information back to
the ear through the reciprocal connections. There are actually more connections
feeding from the brain down to the ear than there are from the ear feeding
into the brain, top-down versus bottom-up connections, respectively [Reed
and Singer 1993]. The brain can alter the incoming signals by sending
messages to the cochlea through links to the outer hair cells (located
underneath the tectorial membrane), allowing the adjustment of frequencies
during analysis. For example, frequencies that are present in a familiar
voice can be boosted during conversation while noisier elements are simultaneously
attenuated. As Joe states, "There is growing evidence that the nature
of our auditory experience is not a precise reflection of the physical
properties of stimuli but rather a highly dynamic reconstruction process
which is not only modulated at a rapid time scale by state changes and
top-down projections but is also subject to slowly developing and long
lasting modifications of processing circuits."

Memory

Memory is a capacity. Recall and recollection, storage and categorization,
association and reconstruction are facets of this capacity. Generally
speaking, memory involves information that is received through the senses,
filtered in terms of past experience and current context, and then stored.
Storage creates connections based on the content of the input and its
associations. Remembering is the reactivation of the information. Learning
is closely related to memory and involves many activities including conditioning
and training, habituation and imprinting, trial-and-error and insight
[Rahmann 1992]. Von Foerster suggests that the faculties to perceive,
remember, and infer, used in learning and cognition, cannot be isolated:
"(i) Omit perception: the system is incapable of representing internally
environmental regularities. (ii) Omit memory, the system has only throughput.
(iii) Omit prediction, i.e., the faculty of drawing inferences: perception
degenerates to sensation, and memory to recording" [Von Foerster
1967].

In terms of the neural biology of memory, Eric Kandel states that "much
of what is known can be summarized in just four principles: (1) memory
has stages and is continually changing; (2) long-term memory may be represented
by physical changes in the brain; (3) the traces for memories are localized
in multiple regions throughout the nervous system; and (4) reflexive and
declarative memories may involve different neuronal circuits" [Kandel
1992]. The term "stages" refers to various levels of processing
in memory such as short, medium, and long-term storage. For example, about
16 bits of information (out of the roughly 100,000,000,000 bits received
through the sensory organs) can enter short term storage, which has a
capacity of 100 - 400 bits lasting around 6 - 25 seconds [Rahmann 1992].
Reflexive means memories that are gradually accumulated such as perceptual
and motor skills, while declarative memory depends on some form of conscious
reflection for retrieval [Kandel 1992].

Concepts such as recall and recollection or reflexive vs. declarative
are not very useful for specifying a particular model of memory. A model
becomes more feasible when learning is stated in terms of the notion of pattern
detection and association, in other words, locating concomitant properties
found in our perception of the environment. In Von Foerster's terms, detecting
concomitant properties is the basis of inductive inference (the principle
of generalization), which is one form of learning.

A recording system
based on the auditory system involves a parallel, frequency-based input.
Rather than being solely indexical, like the carillon, phonograph, and
CD, the input would be based on resemblance: an iconic feature detector.
More importantly, the recording system would have an auditory memory (history)
which can be used to modify incoming information through feedback. Analog
and digital cochlea models have been described by Lyons 1992, Yang 1992,
and Kates 1991, 1993. Just as the ear transduces sound patterns into neural
patterns, a feature detector, based on the cochlea, could be fed into
an artificial network of neurons, a neural network. Neural networks are
systems with the ability to learn: the acquisition or arrangement of processes
needed to solve some problem. Problems can be thought of as tasks a particular
"user" of a network wants accomplished. Neural networks are
useful for categorizing groups of information and recognizing patterns
within an environment. A network can be trained according to learning schemes
such as supervised or unsupervised learning. These learning schemes can
be described in terms of a teacher. In the case of supervised learning,
a teacher external to the network provides feedback by specifying the desired
output every time the network, itself, generates an output [Quinlan 1991].
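Supervised learning in this sense can be sketched with a single artificial neuron. The example below uses the perceptron learning rule, chosen only as one simple and well-known instance; the text does not name a particular rule. The teacher is the list of desired outputs, and the task (logical AND), the learning rate, and the number of passes are assumptions made for illustration.

```python
def predict(weights, bias, inputs):
    """The network's own output: 1 if the weighted sum exceeds zero, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0

def train(examples, epochs=20, rate=1.0):
    """Perceptron rule: each time the network generates an output, the
    teacher's desired output is compared with it and the weights are
    adjusted in proportion to the error."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for inputs, desired in examples:
            error = desired - predict(weights, bias, inputs)
            weights = [w + rate * error * x for w, x in zip(weights, inputs)]
            bias += rate * error
    return weights, bias

# The teacher: input patterns paired with desired outputs (logical AND).
teacher = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(teacher)
outputs = [predict(weights, bias, inputs) for inputs, _ in teacher]
# outputs == [0, 0, 0, 1]
```

Once the outputs match the teacher's for every pattern the error is zero and the weights stop changing, so the extra passes do no harm.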
Unsupervised learning does not utilize a teacher. This learning is based
on the content of the data it receives, tending to emphasize information
such as "membership in clusters of similar patterns or highly correlated
features" [Hrycej 1992]. In other words, rather than being told explicitly
what to learn and output, such networks learn by experience: their
teacher is the environment they are contained within. Retrieving information
from networks is based on the content of the information presented previously.
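Retrieval by content rather than by address can be illustrated with a small autoassociative network. The sketch below assumes a Hopfield-style network with Hebbian (outer-product) storage, chosen for brevity and not taken from any model cited in this essay; the two stored patterns and the corrupted cue are invented for the example.

```python
def store(patterns):
    """Hebbian outer-product weights with a zero diagonal."""
    n = len(patterns[0])
    return [[0 if i == j else sum(p[i] * p[j] for p in patterns)
             for j in range(n)] for i in range(n)]

def recall(weights, cue, max_steps=10):
    """Update every unit from the weighted sum of the others, repeating
    until the state stops changing."""
    state = list(cue)
    for _ in range(max_steps):
        new_state = [1 if sum(w * s for w, s in zip(row, state)) >= 0 else -1
                     for row in weights]
        if new_state == state:
            break
        state = new_state
    return state

# Two hypothetical stored patterns, written with +1 / -1 elements.
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
weights = store(stored)

cue = [-1, 1, 1, 1, -1, -1, -1, -1]   # the first pattern with one element wrong
recovered = recall(weights, cue)       # equals stored[0]
```

No address is supplied at any point: the cue itself, by its similarity to what was stored, selects the pattern that is reproduced.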
Rather than giving an address, telling the network where to find information
to recall, the current input pattern's similarity to past associations
is used to reproduce the desired output. This type of memory is generally
called "content-addressable" memory. Content-addressable and associative
memory models have been proposed by Anderson 1968, Nakano 1972, Kosko
1988 and Kanerva 1988 [see Anderson, Pellionisz and Rosenfeld 1990].

Bibliography
Handel, S. Listening: An Introduction to the Perception of Auditory Events. Cambridge: MIT Press, 1989.
Kandel, E.R. and Schwartz, J.H. Principles of Neural Science, 2nd edn. Amsterdam: Elsevier Science, 1985.
Kates, J.M. A Time-Domain Digital Cochlear Model. IEEE Transactions on Signal Processing, vol. 39, no. 12, December 1991.
Kates, J.M. Accurate Tuning Curves in a Cochlear Model. IEEE Transactions on Speech and Audio Processing, vol. 1, no. 4, October 1993.
Krell, D.F. Of Memory, Reminiscence, and Writing. Bloomington: Indiana University Press, 1990.
Levin, T. For the Record: Adorno on Music in the Age of Its Technological Reproducibility. October 55, 1990.
Parker, S.P., ed. Acoustics Source Book. New York: McGraw-Hill, 1988.
Peirce, C.S. Collected Papers of Charles Sanders Peirce, vol. I and vol. II. London: Oxford University Press, 1960.
Pohlmann, K.C., ed. Advanced Digital Audio. Indiana: SAMS, 1991.
Rahmann, H. and Rahmann, M. The Neurobiological Basis of Memory and Behavior. New York: Springer-Verlag, 1992.
Reed, R. and Singer, W. Sensory Systems. Current Biology, vol. 3, no. 4, August 1993.
Rodgers, C.A. Pinna Transformations and Sound Reproduction. Journal of the Audio Engineering Society, vol. 29, no. 4, April 1981.
Rose, S. The Making of Memory. New York: Doubleday, 1993.
von Foerster, H. What is memory that it may have hindsight and foresight as well. Brain Circuitry and its Structural Basis. BCL Publications, 1967.
Yang, X. Auditory Representations of Acoustic Signals. IEEE Transactions on Information Theory, vol. 38, no. 2, March 1992.
Yates, F.A. The Art of Memory. London: Pimlico, 1992.

Copyright © 1994 John Puterbaugh