Sound Synthesis with Physical Models (1996)

Introduction

“The dynamic behavior of a mechanically produced sound is determined by the physical structure of the instrument that created it” (DePoli 223). The device of production is fundamental to the character of a sound. Nevertheless, most forms of synthesis function without regard to the physical attributes of the sound-creation mechanism. Traditional synthesis algorithms take into account primarily the result of sound creation and pay less attention to the control devices that shaped the sound phenomena. More current algorithms, such as additive synthesis, have made use of Fourier analysis and synthesis with a great deal of success. Unfortunately, Fourier techniques are insufficient for the representation of non-pitched material. Physical model synthesis considers the mechanism of a real or imaginary instrument as the focus of the representation. Therefore, a physical model algorithm is capable of recreating aperiodic and transient sonorities produced by the vibrations of the sound-creating structure. Scientists and engineers have been making use of physical models for quite some time. In 1971, Hiller and Ruiz were successful in using physical models to create musically useful sounds (Borin 30). Since that time, electronic musicians have explored the creation of sound through the implementation of instrument models. Currently available microprocessors have enabled many composers and performers to make use of physical models with a minimal investment.

Basics of Physical Model Synthesis

Building Blocks

Figure 1 displays a basic two-block scheme for a model. Here it is possible to see the two significant “building blocks” necessary for physical model synthesis: the exciter and the resonator. Each of these blocks is defined using one of several different approaches (as discussed below).

Black-Box and White-Box Techniques 

The resonator and exciter blocks are described in one of two ways: Black-Box or White-Box. These descriptions give a general idea of the complexity with which a model is constructed. Black-Box models (FIGURE 2a) are described only by an input and output relationship. The simplicity of the Black-Box model limits the choice of signals that can be involved in the description of the model and makes it difficult to choose the operative conditions of the model (Borin 31-32). Thus, Black-Box techniques are of limited use for synthesis. White-Box models (FIGURE 2b) describe the entire mechanism of synthesis. Consequently, this strategy results in a flexible model, but one that is quite difficult to apply and that can quickly create a very large number of elements requiring simulation (Borin 32). The masses, springs, dampers, and conditional links described below are present only in a White-Box model; in a Black-Box model they are simply assigned a function.

 Simple Objects

“A model must consider that which is manipulable, deformable, and animated by movements” (Florens 227). The primary objects in a sound-producing model are masses, springs, and dampers. Masses represent the material elements of the instrument, such as wood or brass, while springs and dampers represent linkage elements between the masses. These objects are obviously a substantial simplification of the actual sound-creating mechanism, but much simplification is necessary to permit mathematical expressions in a single dimension to be relatively accurate as well as manageable. Figure 3 shows a simple representation of a sound-producing object as a network.
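The behavior of these simple objects can be sketched numerically. The following is a minimal illustration, not taken from any of the systems cited here, of a single mass bound to a fixed point by a spring and a damper, integrated with explicit finite differences; all constants are arbitrary assumptions rather than values for any real instrument.

```python
def simulate_mass_spring_damper(mass=0.001, stiffness=1000.0,
                                damping=0.002, sr=44100, n=5):
    """Displacement of one mass anchored by a spring and a damper.

    Constants are illustrative assumptions, not measured values.
    """
    dt = 1.0 / sr
    x, v = 1.0, 0.0                        # initial displacement, velocity
    out = []
    for _ in range(n):
        f = -stiffness * x - damping * v   # spring force plus damper force
        a = f / mass                       # Newton's second law: a = F/m
        v += a * dt                        # explicit (Euler) integration
        x += v * dt
        out.append(x)
    return out
```

Each sample of the mass's displacement is one sample of the synthesized vibration; a full White-Box model chains many such masses through shared spring and damper forces.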

 The Fourth Object

Using the three elementary objects previously described, the number and variation of available groups or “blocks” is infinite. Nevertheless, these objects cannot fully represent a sound-creating mechanism because they offer no provision for collision with one another. Thus a fourth type of object, the conditional link, becomes necessary. Conditional links are links that are not permanent. They are composed of a spring and a damper, and their behavior is determined by the materials they connect as well as by musical circumstance. A conditional link may represent a bow on a string, the hammers of a piano, or a variety of other temporary excitations.
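A conditional link can be sketched as a spring-damper pair whose force exists only while two objects are actually in contact. The toy function below is a hypothetical illustration (its names and constants are assumptions): it returns zero force unless the exciter's position penetrates the resonator's surface.

```python
def conditional_link_force(x_exciter, x_resonator, v_exciter, v_resonator,
                           stiffness=500.0, damping=0.01):
    """Spring-damper force that exists only while the objects touch.

    Positions and velocities are in arbitrary one-dimensional units;
    the stiffness and damping values are illustrative assumptions.
    """
    penetration = x_exciter - x_resonator
    if penetration <= 0.0:
        return 0.0                         # no contact: the link vanishes
    # While in contact, behave like an ordinary spring plus damper.
    return -stiffness * penetration - damping * (v_exciter - v_resonator)
```

A piano-hammer strike, for instance, would apply this force for only the few milliseconds during which `penetration` is positive.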

Physical Model Techniques

Exciters

Before the synthesis of excitation can occur, it is necessary to determine the initial condition of the exciter. At the most fundamental level there exist only two types of initial conditions: those in which only one state of equilibrium exists (percussion), and those in which the exciter begins a new cycle of excitation from a variable equilibrium (bowed and wind instruments).

Direct Generation Modeling is a black-box technique for those instruments that are persistently excited. This technique may include any system that can generate an excitation signal; the most commonly used example is the table-lookup generator. Direct generation is usually incorporated into feed-forward coupling structures (see below).

Memoryless Nonlinear Modeling is a black-box technique often used to model the exciter of wind instruments, because it generates an excitation signal derived from an “external” input signal that normally incorporates the excitation actions of the performer (FIGURE 4) (Borin 34). This model is also capable of using information coming from the resonator, so that the resonator’s reaction to an excitation influences the excitations that follow.

Mechanical Modeling is a white-box technique in which the exciter is described using springs, masses, and dampers (FIGURE 5). Generally, excitations of this type are represented by a series of differential equations that govern the dynamic behavior of these elements (Borin 34). These models can be used to represent almost any instrument.
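The memoryless nonlinear exciter can be hinted at with a reed-like map from pressure difference to volume flow: the same input always yields the same output, with no internal state. The curve below is only a textbook-style illustration, not the specific model described by Borin; the threshold and gain constants are arbitrary assumptions.

```python
import math

def reed_flow(delta_p, p_max=1.0, k=0.5):
    """Volume flow through a reed as a memoryless nonlinear function
    of the pressure difference across it.

    p_max (pressure that closes the reed) and k (flow gain) are
    illustrative assumptions, not physical measurements.
    """
    if delta_p >= p_max:
        return 0.0                         # reed beats fully closed
    opening = 1.0 - delta_p / p_max        # reed closes as pressure rises
    # Bernoulli-like square-root flow, signed like the pressure difference.
    return k * opening * math.copysign(math.sqrt(abs(delta_p)), delta_p)
```

In a complete model, `delta_p` would combine the performer's blowing pressure with the pressure fed back from the resonator, which is how the resonator's reaction shapes subsequent excitation.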

Resonators

“The description of a resonator, without serious loss of generality, is reducible to that of a causal, linear, and time-varying dynamical system” (Borin 35). As with exciters, there are both black- and white-box techniques for modeling resonators, and both can produce surprisingly musical results.

Transfer-Function Modeling is a simple black-box technique that ignores the physical structure of the resonator. The transfer-function model usually implements a transformation of pairs of dual variables, such as pressure and flow or velocity and force (Borin 35). Because this resonator is such a generic device, it is not the most musically useful resonator model.

Mechanical Modeling of the resonator is very similar to mechanical modeling of the exciter: a series of differential equations simulates the dynamic behavior of virtual masses, springs, and dampers.

Waveguide Modeling is an efficient technique based on the analytic solution of the equation that describes the propagation of waves in a medium. For example, the waveguide model of the reed of a wind instrument requires only one multiply, two additions, and one table lookup per sample of synthesized sound (Smith 275). Because of the small number of computations it requires, this technique was the first to be incorporated into commercially available synthesizers.
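The economy of the waveguide family can be sketched with the closely related Karplus-Strong plucked-string loop, a simplified sketch rather than the full waveguide model from Smith: a delay line initialized with noise, with a two-point averaging filter in the feedback path (one multiply and one addition per sample).

```python
import random

def karplus_strong(freq=220, sr=44100, n=1000, seed=0):
    """Plucked-string sketch: a noise-filled delay line recirculated
    through a two-point averaging (lowpass) filter.

    The pitch is approximately sr / (sr // freq); parameter values
    here are illustrative.
    """
    random.seed(seed)
    size = sr // freq                          # delay-line length sets pitch
    buf = [random.uniform(-1.0, 1.0) for _ in range(size)]
    out = []
    for i in range(n):
        s = buf[i % size]
        nxt = buf[(i + 1) % size]
        buf[i % size] = 0.5 * (s + nxt)        # averaging filter: decay
        out.append(s)
    return out
```

The averaging filter damps high frequencies faster than low ones, so the burst of noise decays into a string-like tone, which is why so few operations per sample suffice.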

Interaction

Just as there are several strategies for modeling exciters and resonators, there are numerous methods for controlling the interaction between the blocks. The Feed-forward technique (FIGURE 1) is the simplest structure by which exciter and resonator may be coupled. The transfer of information is unidirectional, which prevents the excitation from being influenced by the resonance. “This structure lends itself particularly well for describing those interaction mechanisms in which the excitation imposes an initial condition on the resonator and then leaves it free to evolve on its own, or in which the excitation can be treated as a signal generator” (Borin 37). The Feedback technique (FIGURE 6) is a slight variation on feed-forward: the transfer of information is bi-directional, which permits the modeling of most traditional instruments, in which the behavior of the vibrating structure is influenced by the resonance it produces.
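The difference between the two couplings can be made concrete with a toy feedback loop, a hypothetical sketch rather than any published model, in which the exciter's output depends on the resonator's previous sample; deleting that dependency would reduce the structure to feed-forward.

```python
def feedback_coupling_demo(n=5):
    """Toy bi-directional coupling: the exciter reacts to the
    resonator's previous output, and the resonator (a one-pole
    filter here, purely for illustration) is driven by the exciter.
    """
    res_prev = 0.0
    out = []
    for _ in range(n):
        excitation = 1.0 - 0.5 * res_prev          # exciter sees resonator
        res_prev = 0.9 * res_prev + 0.1 * excitation  # one-pole resonator
        out.append(res_prev)
    return out
```

Each sample, information flows both ways around the loop, so the excitation settles toward an equilibrium jointly determined by both blocks.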

The most sophisticated interaction scheme, which is also the most computationally complex, is the Modular Interaction method. This model incorporates an interconnection block (FIGURE 7). The interconnection block acts as an interface; its main purpose is to separate the excitation from the resonator, so that they can be designed independently (Borin 38). Thus, the third block becomes the master of the information exchange between the exciter and the resonator.

Current Hardware and Software Developments

CORDIS/ANIMA

The CORDIS/ANIMA system is designed for the mechanical simulation of physical models. The CORDIS system originally ran on the LSI-11 type microcomputer from Digital Equipment Corporation. The system is capable of decomposing all aspects of the model into the most basic elements: masses, springs, and dampers. In 1985, the ANIMA update to CORDIS allowed modeling of two- and three-dimensional elements. This system was the first to fully incorporate conditional links.

Csound

Csound is a music programming language for IBM-compatible, Apple Macintosh, Silicon Graphics, as well as several other computers. It was written by Barry Vercoe at the MIT Media Lab. The programmer is required to give Csound an “Orchestra” file, which may use an unlimited number of instruments and instrument parameters, and a “Score” file that may be equally complex. Csound then creates a soundfile containing the completed work.

A typical Csound “Orchestra” specification using the Karplus-Strong pluck algorithm:

; timbre: plucked string
; synthesis: Karplus-Strong algorithm
; PLUCK with imeth = 1
; PLUCK-generated random series (f0) versus
; self-made random numbers (f77)

sr = 44100
kr = 441
ksmps= 100
nchnls = 1

instr 1;
iamp = p4
ifq = p5 ; frequency
ibuf = 128 ; buffer size
if1 = 0 ; f0: PLUCK produces its own random numbers
imeth = 1 ; simple averaging

a1 pluck iamp, ifq, ibuf, if1, imeth
out a1
endin

instr 2;
iamp = p4
ifq = p5 ; frequency
ibuf = 128 ; buffer size
if1 = 77 ; f77 contains random numbers from a soundfile
imeth = 1 ; simple averaging

a1 pluck iamp, ifq, ibuf, if1, imeth
out a1
endin

A “Score” written for this particular “Orchestra”:

; GEN functions
; “Sflib/10_02_1.aiff” should exist
f77 0 1024 1 “Sflib/10_02_1.aiff” .2 0 0 ; start reading at .2 sec
; score
; iamp ifq
i1 0 1 8000 220
i1 2 . . 440
i2 4 1 8000 220
i2 6 . . 440
e

Many composers working with physical models currently use Csound. The power of the program to control even the smallest nuance of a soundfile, as well as the ability to import sampled sounds, and the convenience of recycling sophisticated “Orchestras” and “Scores”, make it a powerful physical modeler.

Conclusion

Physical modeling is a very powerful form of synthesis. Each of the techniques outlined above, whether they be recreations of general sound-production mechanisms or attempts at an exact reference to one specific instrument, provides a composer with an opportunity to expand his or her sonic “palette.” “It can be argued that often behind the use of physical models we find the quest for realism and naturalness, which is not always musically desirable. On the other hand we can notice that even with a physical model it is easy to obtain unnatural behaviors, by means of few variations of the parameters. Moreover, the acquired experience is useful in creating new sounds and new methods of signal organization” (DePoli 225). Physical models provide insight into the function of the instruments that composers have been working with for centuries. Perhaps gaining a better understanding of acoustic instruments, as well as developing systems that can accurately model them, will enable electronic musicians to create more dynamic, more deeply evolving timbres than previously thought possible.

References

Adrien, Jean-Marie. 1991. “The Missing Link: Modal Synthesis.” Representations of Musical Signals. Cambridge, Massachusetts: MIT Press, pp. 269-297.

Borin, Gianpaolo, et al. 1992. “Algorithms and Structures for Synthesis Using Physical Models.” Computer Music Journal 16(4):30-42.

DePoli, Giovanni. 1991. “Physical Model Representations of Musical Signals: Overview.” Representations of Musical Signals. Cambridge, Massachusetts: MIT Press, pp. 223-226.

Florens, Jean-Loup, and Cadoz, Claude. 1991. “The Physical Model: Modeling and Simulating the Instrumental Universe.” Representations of Musical Signals. Cambridge, Massachusetts: MIT Press, pp. 227-268.

Keefe, Douglas H. 1992. “Physical Modeling of Wind Instruments.” Computer Music Journal 16(4):57-73.

Smith, Julius. 1986. “Efficient Simulation of the Reed-Bore and Bow-String Mechanisms.” Proceedings of the 1986 International Computer Music Conference. San Francisco: Computer Music Association.

Smith, Julius. 1992. “Physical Modeling Using Digital Waveguides.” Computer Music Journal 16(4):74-91.

Woodhouse, James. 1992. “Physical Modeling of Bowed Strings.” Computer Music Journal 16(4):43-56.

Granular Synthesis (1995)

Introduction

“Granular synthesis is an innovative approach to the representation and generation of musical sounds” (DePoli 139). The conception of a granular method of sonic analysis may have been first proposed by Isaac Beekman, whose early Seventeenth Century writings discuss the organization of music into “corpuscles of sound” (Cohen). Unfortunately, granular theory was not investigated further for quite some time. British physicist Dennis Gabor stimulated new interest in granular representations around 1946 (Gabor). Gabor believed that any sound could be synthesized with the correct combination of numerous simple sonic grains. “The grain is a particularly apt and flexible representation for musical sound because it combines time-domain information (starting time, duration, envelope shape, waveform shape) with frequency domain information (the frequency of the waveform within the grain)” (Roads 144). Before magnetic tape recorders became readily accessible, the only way to attempt granular composition was through extremely sophisticated manipulation of a large number of acoustic instruments (as in many of the early compositions of Iannis Xenakis). The tape recorder made more sophisticated granular works possible. However, the laborious process of cutting and splicing hundreds of segments of tape for each second of music was both intimidating and time-consuming, and serious experimentation with granular synthesis was severely impaired. It was not until digital synthesis that advanced composition with grains became feasible.

Basics of Granular Synthesis

The grain is a unit of sonic energy possessing any waveform and a very brief duration, near the threshold of human hearing. It is the continuous control of these small sonic events (which are discerned as one large sonic mass) that gives granular synthesis its power and flexibility. While methods of grain organization vary tremendously, the creation of grains is usually relatively simple. A basic grain-generating device would consist of an envelope generator with a Gaussian curve driving a sine oscillator (figure 1). The narrow bell-shaped curve of the Gaussian envelope is generated by the equation y(t) = e^(−(t − μ)²⁄(2σ²)), where μ centers the curve within the grain and σ determines its width. The signal from the oscillator enters an amplifier that determines the spatial position of each grain. Quadraphonic amplification is very popular for granular synthesis because of its great spatial-positioning capabilities.

The typical duration of a grain is somewhere between 5 and 100 milliseconds. If the duration of the grain is less than 2 milliseconds it will be perceived as a click. The most musically important aspect of an individual grain is its waveform. The variability of waveforms from grain to grain plays a significant role in the flexibility of granular synthesis. Fixed waveforms (such as a sine or sawtooth wave), dynamic waveforms (such as those generated by FM synthesis), and even waveforms extracted from sampled sounds may be used within each grain.

A vast amount of processing power is required to perform granular synthesis. A simple granular “cloud” may consist of only a handful of particles, but a sophisticated “cloud” may be comprised of a thousand or more. Real-time granular synthesis requires an enormous number of grain-generating devices. Several currently available microcomputers are capable of implementing real-time granular synthesis, but the cost of these machines is still quite prohibitive. Therefore, most granular synthesis occurs while the composer waits, sometimes for quite a while. This time factor prevents many electronic and computer composers from working with granular synthesis.
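A grain generator of the kind shown in figure 1 can be sketched in a few lines. This is an illustrative sketch: the normalized-time envelope, the σ value, and the default parameters below are assumptions, not values from any cited implementation.

```python
import math

def make_grain(freq=440.0, dur=0.02, sr=44100, sigma=0.25):
    """One grain: a sine wave shaped by a Gaussian (bell-curve) envelope.

    sigma is the envelope width as a fraction of the grain duration;
    all defaults are illustrative assumptions.
    """
    n = int(dur * sr)
    grain = []
    for i in range(n):
        t = i / n                          # normalized time, 0..1
        env = math.exp(-((t - 0.5) ** 2) / (2.0 * sigma ** 2))
        grain.append(env * math.sin(2.0 * math.pi * freq * i / sr))
    return grain
```

A complete texture would mix thousands of such grains, each with its own frequency, duration, waveform, and spatial position, which is where the heavy processing cost arises.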

Methods of Grain Organization

Screens

One of the first composers to develop a method for composition with grains was Iannis Xenakis. His method is based on the organization of grains by means of screen sequences (figure 2), which specify the frequency and amplitude parameters of the grains on the (F, G) plane at discrete points in time (Δt) with density (D) (DePoli 139). Every possible sound may therefore be cut up into a precise quantity of elements ΔF ΔG Δt ΔD in four dimensions. The scale of grain density is logarithmic with its base between 2 and 3, and does not appear on the screens themselves. When viewing screens as a two-dimensional representation, it is important not to lose sight of the fact that the cloud of grains of sound exists in the thickness of time Δt and that the grains of sound are only artificially flattened onto the (F, G) plane (Xenakis 51). Xenakis placed grains on the individual screens using a variety of sophisticated Markovian stochastic methods, which he changed with each composition. The first compositions to use this method were Analogique A, for string orchestra, and Analogique B, for sinusoidal sounds, both composed in 1958-59. More recently, a variation on Xenakis’ screen abstraction has been implemented in the UPIC workstation discussed below.

Pitch-Synchronous Granular Synthesis

Pitch-synchronous granular synthesis (PSGS) is an analysis-synthesis technique, used relatively infrequently, designed for the generation of pitched sounds with one or more formant regions in their spectra (Roads 191). It makes use of a complex system of parallel minimum-phase finite impulse response filters to resynthesize grains based on spectrum analysis.

Quasi-Synchronous Granular Synthesis

Quasi-synchronous granular synthesis (QSGS) creates sophisticated sounds by generating one or more “streams” of grains (figure 3). When a single stream of grains is synthesized using QSGS, the interval between the grains is essentially equal. The overall envelope of the stream forms a periodic function. Thus, the generated signal can be analyzed as a case of amplitude modulation (AM) (Roads 151). This adds a series of sidebands to the final spectrum. By combining several QSGS streams in parallel it becomes possible to model the human voice. Barry Truax discovered that the use of QSGS streams at irregular intervals has a thickening effect on the sound texture. This is the result of a smearing of the formant structures that occurs when the onset time of each grain is indeterminate.
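The AM interpretation above can be sketched directly: a single quasi-synchronous stream is a carrier multiplied by a periodic grain envelope, and it is that periodic envelope that contributes sidebands at the carrier frequency plus and minus multiples of the grain rate. The raised-cosine envelope and the default rates below are illustrative assumptions.

```python
import math

def qsgs_stream(carrier=1000.0, grain_rate=100.0, dur=0.1, sr=44100):
    """One quasi-synchronous stream: a sine carrier amplitude-modulated
    by a periodic raised-cosine grain envelope.

    Each envelope period is one grain; defaults are illustrative.
    """
    n = int(dur * sr)
    out = []
    for i in range(n):
        t = i / sr
        env = 0.5 * (1.0 - math.cos(2.0 * math.pi * grain_rate * t))
        out.append(env * math.sin(2.0 * math.pi * carrier * t))
    return out
```

Jittering the onset of each envelope period, as in Truax's irregular streams, would smear those discrete sidebands into the thickened texture described above.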

Asynchronous Granular Synthesis

Asynchronous granular synthesis (AGS) was an early digital implementation of granular representations of sound (figure 4). In 1978, Curtis Roads used the MUSIC 5 music programming language to develop a high-level organization of grains based on the concept of tendency masks (“Clouds”) in the time-frequency plane (DePoli 140). The sophisticated software permitted greater accuracy and control of grains. When performing AGS, the granular structure of each “Cloud” is determined probabilistically in terms of the following parameters:

1. Start time and duration of the cloud
2. Grain duration (Variable for the duration of the cloud)
3. Density of grains per second (Also variable)
4. Frequency band of the cloud (Usually high and low limits)
5. Amplitude envelope of the cloud
6. Waveforms within the grains
7. Spatial dispersion of the cloud

Obviously, AGS abandons the use of specific algorithms and streams to determine grain placement with regard to pitch, amplitude, density and duration. The dynamic nature of parameter specification in AGS results in extremely organic and complex timbres.
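The probabilistic parameter scheme can be sketched as a toy cloud scheduler that scatters grain onsets, frequencies, and durations at random within the cloud's bounds. This is a simplified sketch only: real AGS implementations use tendency masks and time-varying parameters, and the uniform distributions and parameter names here are assumptions.

```python
import random

def ags_cloud(dur=1.0, density=50.0, fmin=200.0, fmax=2000.0,
              gmin=0.005, gmax=0.05, seed=1):
    """Schedule one cloud: return (onset, frequency, grain_duration)
    tuples rather than audio.

    density is grains per second; frequency band is [fmin, fmax] Hz;
    grain durations fall in [gmin, gmax] seconds. All defaults are
    illustrative assumptions.
    """
    random.seed(seed)
    n_grains = int(dur * density)
    cloud = []
    for _ in range(n_grains):
        onset = random.uniform(0.0, dur)       # scattered start times
        freq = random.uniform(fmin, fmax)      # anywhere in the band
        gdur = random.uniform(gmin, gmax)      # variable grain length
        cloud.append((onset, freq, gdur))
    return sorted(cloud)
```

Rendering would then synthesize one enveloped grain per tuple and mix them at their onsets; the indeterminacy of every parameter is what produces the organic, complex timbres described above.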

Some Recent Hardware and Software Developments

The UPIC Workstation

UPIC (Unite Polyagogique Informatique du CEMAMu) is a machine dedicated to the interactive composition of musical scores (Xenakis 329). It was conceptualized by Xenakis and created at the CEMAMu (Centre for Studies in the Mathematics and Automation of Music) in Paris. The UPIC software consists of pages on which a composer draws “arcs” which specify the pitch and duration of a sonic event (figure 5), and a voice editing matrix with which the “arcs” are described. Waveform, envelope, frequency and amplitude tables, modulating arc assignment, and modification of audio channel parameters (dynamic and envelope) may all be manipulated for each “arc” in real-time.

The hardware of the UPIC system consists of a Windows-based computer with a digitizing tablet, and the UPIC Real-Time Synthesis Unit:

64 Oscillators at 44.1 kHz with FM converter board:

  • 4 audio output channels
  • 2 audio input channels
  • AES/EBU interface

Capacity:

  • 4 pages of 4000 arcs
  • 64 waveforms
  • 4 frequency tables
  • 128 envelopes
  • 4 amplitude tables

The UPIC Workstation is ideal for granular synthesis for several reasons. First, it allows any waveform (including sampled waveforms) to be assigned to each “arc”. Second, it currently permits 64 “arcs” to be layered vertically, enabling the composer to design “clouds” of sound up to 64 grains in density and of unlimited duration at any point in a composition. Finally, and perhaps most importantly, the UPIC operates in real time, so none of its functions require the composer to wait for processing.

Csound

Csound is a music programming language for IBM-compatible, Apple Macintosh, Silicon Graphics, as well as several other computers. It was written by Barry Vercoe at the MIT Media Lab. The programmer is required to give Csound an “Orchestra” file, which may use an unlimited number of instruments and instrument parameters, and a “Score” file that may be equally complex. Csound then creates a soundfile containing the completed work.

A typical Csound granular “Orchestra” specification:

;;; granulate.orc

sr=44100
kr=22050
ksmps=2
nchnls=1

instr 1
next: timout 0,p6,go1 ;;; p6 = grain duration time… I could allow for an envelope on this
reinit go1
timout 0,p5,go2 ;;; p5 = inter-grain time… I could allow for an envelope on this
reinit next

go1:
k1 oscil1i 0,1,p6,3
a1 soundin p7,p4,4 ;;; p7 is which soundin file to use…
a2 = a1 * k1
rireturn
go2:
k2 oscil1i 0,1,p3,4 ;;; envelope output sound.
out a2*k2
endin
;;Copyright 1992 by Charles Baker

A “Score” written for this particular “Orchestra”:

;; sample .sco file
f 3 0 8193 9 1 -.5 90 0 .5 90 ;; grain envelope
f 4 0 8193 9 1 -.5 90 0 .5 90 ;; Note amplitude env.
;;ins st dur amp inter-grain-time grainduration soundinfile#
i 1 0.000 2.750 1 0.000 0.020 1
i 1 2.750 2.612 1 0.010 0.020 1
i 1 5.362 2.482 1 0.020 0.020 1
i 1 7.844 2.358 1 0.030 0.020 1
i 1 10.202 2.240 1 0.04 0.020 1
i 1 12.442 2.128 1 0.05 0.020 1
i 1 14.570 2.022 1 0.06 0.020 1
i 1 16.591 1.920 1 0.07 0.020 1
i 1 18.512 1.824 1 0.08 0.020 1
e
;;Copyright 1992 by Charles Baker

Many granular composers currently use Csound. The power of the program to control even the smallest nuance of a soundfile, as well as the ability to import sampled sounds, and the convenience of recycling sophisticated granular “Orchestras” and “Scores”, make it a powerful granular synthesizer.

Cloud Generator

Cloud Generator is a granular synthesis application for the Apple Macintosh (figure 6). The software was conceived and programmed by Curtis Roads and John Alexander at Les Ateliers UPIC in Paris. Cloud Generator creates clouds using Quasi-Synchronous or Asynchronous Granular Synthesis based on the parameters listed in the section on AGS above. Each QSGS stream and AGS “Cloud” must be created individually and is output in AIFF format.

Conclusion

Granular synthesis is a very powerful means for the representation of musical signals. Each of the techniques outlined above provides an opportunity for a composer to expand his or her sonic “palette”. Asynchronous Granular Synthesis is a particularly powerful means for creating sonic events that are both unique and sophisticated. “In musical contexts these types of sounds can act as a foil to the smoother, more sterile sounds emitted by digital oscillators” (Roads 183). When granular synthesis techniques are used in conjunction with sampled waveforms, the possibilities for new sounds are infinite.

References

Cohen, Michael, ed. Isaac Beekman. Dordrecht, The Netherlands: D. Reidel, 1990.

DePoli, Giovanni, ed. Representations of Musical Signals. Cambridge, Massachusetts: The MIT Press, 1991.

Gabor, Dennis. “Theory of Communication.” Journal of the Institution of Electrical Engineers Part III, 93: 429-457, 1946.

Roads, Curtis. “Asynchronous Granular Synthesis.” Representations of Musical Signals. Cambridge, Massachusetts: The MIT Press, 1991.

Strange, Allen. Electronic Music: Systems, Techniques and Controls. Dubuque, Iowa: W.C. Brown Company, 1983.

Truax, Barry. “Real-time granular synthesis with a digital signal processor.” Computer Music Journal 12(2): 14-26, 1988.

Xenakis, Iannis. Formalized Music. Stuyvesant, NY: Pendragon Press, 1991.