In part 3, we had a first look at creating audio effects in AS3 by processing microphone input with robot voice effects. One of the things we did was to create a so-called comb filter by adding a delayed version of the original signal to itself. We then created a flanging effect by oscillating the delay’s offset.
Audio sample: 2 drum samples, one dry, one processed with a flanger.
Given that flangers are one of the staples of audio effects (especially when applied to electric guitars), you may have been wondering why exactly this worked the way it did. In this article we’ll figure this out.
This gives us an opportunity to review a bit of audio theory, independent of AS3. Next time, we’ll redo our quick and dirty flanger implementation and add antialiasing, but today’s article will be free of code – you’ll just need a little high school math.
There are a handful of basic audio concepts that we’ll need to know in order to understand flanging. Chances are that you’re already familiar with these, and any of them could be the focus of a detailed discussion in itself – but for our purposes the following summaries should hopefully get you up to speed:
1.) The most basic type of sound, a periodic sine wave, is completely determined by three parameters: its amplitude, its frequency and its phase.
The amplitude is the height of the peak of the wave. This is related to the sound’s perceived loudness, but not quite the same thing (for example, the human ear places different emphasis on different frequencies).
The frequency is a measure of how often the wave repeats in a given timespan. If the timespan is a second, the unit of frequency is a Hertz (or Hz). The frequency of a sound wave determines its pitch.
The phase is the position within the wave’s cycle at a specific point in time (e.g. at the start of the recording). As a sine wave repeats itself every 2*PI radians, you can think of a phase of 2*PI as the same position in the wave as a phase of 0, 4*PI, 6*PI, and so on. You can think of changing the phase of a sine wave as shifting the whole wave to the left or right.
The value of a sine wave at any given time is y(time) = sin( 2*PI*frequency*time + phase )*amplitude.
This should be pretty straightforward: multiplying the frequency by two makes the whole wave repeat twice as fast. Multiplying the amplitude by two doubles the height of the wave’s peaks. Changing the phase by adding or subtracting an offset shifts the wave left or right. The amount of shifting for any given offset is dependent on the frequency, i.e. at higher frequencies, the same offset will shift the wave by a shorter distance in time.
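As a minimal sketch of the formula above (in Python rather than AS3, with hypothetical helper names `sine_value` and `time_shift` that are not part of this series), the three parameters can be tried out numerically:

```python
import math

def sine_value(time, frequency, phase=0.0, amplitude=1.0):
    """Value of a sine wave at `time` (in seconds)."""
    return math.sin(2 * math.pi * frequency * time + phase) * amplitude

def time_shift(phase_offset, frequency):
    """How far (in seconds) a phase offset moves the wave to the left."""
    return phase_offset / (2 * math.pi * frequency)

# A 1 Hz wave reaches its first peak a quarter of a second in.
peak = sine_value(0.25, frequency=1.0)

# Doubling the amplitude doubles the height of that peak.
double_peak = sine_value(0.25, frequency=1.0, amplitude=2.0)

# The same phase offset moves a 2 Hz wave only half as far in time
# as it moves a 1 Hz wave.
shift_at_1hz = time_shift(math.pi / 2, frequency=1.0)
shift_at_2hz = time_shift(math.pi / 2, frequency=2.0)
```

The last two values illustrate the frequency dependence described above: an identical phase offset corresponds to a shorter shift in time at the higher frequency.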
2.) Any sound of finite length can be seen as a sum of basic sine waves. Take any periodic waveform at all: you can decompose it into a series of sines, each with its own frequency, phase and amplitude.
This is the basis of the Fourier transform. The Fourier transform takes a signal that is in the time domain (i.e. a sound wave) and decomposes it into a series of sine and cosine waves of increasing frequency (note that a cosine is just a sine wave with a phase shift of PI/2). For each sine and cosine, it gives us the appropriate amplitude and phase. Add these waves together, and you get back the original sound wave.
Your favorite audio editing software probably has two ways of displaying sound data: a waveform view and a spectral view. This spectral view (called a “spectrogram”), which shows you the intensity of the sound at different frequencies, is the result of applying the Fourier transform to successive slices of the audio material.
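To make the decomposition idea a little more concrete, here is a rough sketch of a naive discrete Fourier transform in Python. It is deliberately slow and simple (real software would use an FFT), and the function name `dft_amplitudes` and the test signal are made up for illustration:

```python
import cmath
import math

def dft_amplitudes(samples):
    """Naive discrete Fourier transform. Returns the amplitude found in
    each frequency bin from 0 up to half the number of samples."""
    n = len(samples)
    amplitudes = []
    for k in range(n // 2 + 1):
        total = sum(
            samples[i] * cmath.exp(-2j * math.pi * k * i / n)
            for i in range(n)
        )
        # Scale so that a sine of amplitude A shows up as A in its bin.
        scale = 1 if k == 0 or 2 * k == n else 2
        amplitudes.append(scale * abs(total) / n)
    return amplitudes

# A test signal made of two sines: 3 cycles at amplitude 1.0 and
# 5 cycles at amplitude 0.5 (with an arbitrary phase of 0.7 radians).
n = 64
signal = [
    1.0 * math.sin(2 * math.pi * 3 * i / n)
    + 0.5 * math.sin(2 * math.pi * 5 * i / n + 0.7)
    for i in range(n)
]
amps = dft_amplitudes(signal)
```

Bins 3 and 5 of `amps` should come out at roughly 1.0 and 0.5, with everything else close to zero, which is the "spectral view" of this tiny signal.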
3.) Suppose you play two sine waves of the same frequency at once: what you get is constructive or destructive interference. Think of the values of the sines as going from -1 to 1, and think of two sines that are played at once as being added together: If the phases of the sines (that is: the position in their cycle that they are in) are identical, the peaks of the sines are added, resulting in a total output that goes from -2 to +2.
If the phases are opposite to each other (which is to say they are PI radians apart), the peaks and troughs of the two waves cancel each other out (because when one is at peak +1, the other is at -1 and vice versa), producing a total amplitude of zero. Anything in between and you get a result that’s somewhere in between. If the frequencies are not exactly the same, you get an effect called “beating”, where constructive and destructive interference are alternating.
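The interference described in this point can be checked with a few lines of Python. This is only a sketch: `peak_of_sum` is a hypothetical helper that samples one cycle of two unit sines added together and reports the largest value it finds.

```python
import math

def peak_of_sum(phase_difference, frequency=1.0, steps=1000):
    """Largest absolute value of two unit sines of the same frequency
    added together, sampled over one cycle."""
    peak = 0.0
    for i in range(steps):
        t = i / (steps * frequency)
        first = math.sin(2 * math.pi * frequency * t)
        second = math.sin(2 * math.pi * frequency * t + phase_difference)
        peak = max(peak, abs(first + second))
    return peak

in_phase = peak_of_sum(0.0)          # identical phases: peaks add up
opposite = peak_of_sum(math.pi)      # PI radians apart: cancellation
in_between = peak_of_sum(math.pi / 2)
```

`in_phase` should land at about 2, `opposite` at about 0, and `in_between` somewhere between the two.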
4.) Finally, it’s worth pointing out that humans are good at hearing the frequency of a sound wave, but very bad at distinguishing phase. To illustrate this, think of a piano note. Now think of the same piano note again, but played at a starting position offset by maybe a millisecond. All this offset is, is a phase change (and a big one!) – and you wouldn’t hear it at all.
So what’s a comb filter?
Audio sample: 2 drum samples, one dry, one processed with a comb filter.
In last week’s article we discovered that adding a delayed version of a signal back to itself results in a metallic sounding effect. In audio processing, this effect is called a “comb filter”, the name referring to its characteristic frequency response.
You can think of the frequency response of a filter as a measure of how the filter will attenuate or amplify a signal’s parts at different frequencies.
For instance, a low pass filter will gradually reduce the power of the source signal’s components above the cutoff frequency:
The frequency response of a comb filter is characterized by a series of regularly spaced spikes, which look a bit like the teeth of a comb. The distance between these spikes depends on the length of the delay. Shorter delay times cause the spikes to be spaced farther apart.
A comb filter can be created by adding the delayed version of the signal to the original, or by subtracting it from the original (which is the same as adding an inverted copy of the delayed signal). The only difference between the two is that the additive comb filter’s peaks are placed at the frequencies of the subtractive comb filter’s troughs and vice versa.
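Both variants can be sketched in Python in a few lines. The names (`comb_filter`, `tone`, `steady_peak`) and the numbers (an 8000 Hz sample rate and a 20 sample delay, i.e. 2.5 milliseconds) are arbitrary choices for this illustration, not values taken from the earlier AS3 code:

```python
import math

def comb_filter(samples, delay_samples, sign=1.0):
    """Each output sample is the input plus (sign=+1) or minus (sign=-1)
    the input from `delay_samples` samples earlier."""
    output = []
    for i, sample in enumerate(samples):
        delayed = samples[i - delay_samples] if i >= delay_samples else 0.0
        output.append(sample + sign * delayed)
    return output

def tone(frequency, sample_rate, count):
    """A plain sine wave of the given frequency."""
    return [math.sin(2 * math.pi * frequency * i / sample_rate)
            for i in range(count)]

def steady_peak(samples, skip):
    """Peak level once the delay line has filled up."""
    return max(abs(s) for s in samples[skip:])

SAMPLE_RATE = 8000
DELAY = 20  # 2.5 ms

tone_200 = tone(200, SAMPLE_RATE, 400)  # the delay is half a cycle
tone_400 = tone(400, SAMPLE_RATE, 400)  # the delay is one full cycle

additive_200 = steady_peak(comb_filter(tone_200, DELAY), DELAY)
additive_400 = steady_peak(comb_filter(tone_400, DELAY), DELAY)
subtractive_200 = steady_peak(comb_filter(tone_200, DELAY, sign=-1.0), DELAY)
subtractive_400 = steady_peak(comb_filter(tone_400, DELAY, sign=-1.0), DELAY)
```

With these values the additive filter should silence the 200 Hz tone and double the 400 Hz tone, while the subtractive filter does the exact opposite, which is the swapped peaks and troughs mentioned above.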
So why does a mere addition of the delayed signal result in such an interesting frequency response and, by extension, such an interesting sounding effect?
…and why does it work?
To answer this question, let’s look at what happens to sine waves of different frequencies when we delay them by the same amount of time:
Again, the formula for a sine wave is y(time) = sin( 2*PI*frequency*time + phase )*amplitude.
To make things easier, let’s assume a constant amplitude of 1, and discard the amplitude scalar at the end of the equation for the rest of this discussion. Let’s also assume that the unit of time is seconds. If we set the frequency of the wave to 1.0, then the sine wave will repeat every second, that is to say, at the end of each second the term 2*PI*frequency*time + phase will be equal to an integer multiple of 2*PI, plus the original phase of the wave.
Now suppose that we subtract a constant offset from the time variable, as would be the case if we sent this waveform through a delay line: The term inside the sine operation now becomes 2*PI*frequency*(time-offset) + phase.
With a bit of basic algebra, we can restate this as 2*PI*frequency*time + (phase - offset*2*PI*frequency).
Think about this for a second: What this means is that subtracting a constant time offset from a sine wave (i.e. adding a delay) is the same as shifting the wave’s phase by a value depending on both this offset and the wave’s frequency!
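If you would rather check this numerically than trust the algebra, the following Python sketch compares the two forms at a few hundred points in time. The frequency, phase and offset values are arbitrary, and the function names are made up for this example:

```python
import math

def delayed_sine(time, frequency, phase, offset):
    """The sine wave with a time offset subtracted (i.e. delayed)."""
    return math.sin(2 * math.pi * frequency * (time - offset) + phase)

def phase_shifted_sine(time, frequency, phase, offset):
    """The same wave, with the delay expressed as a phase shift instead."""
    shifted_phase = phase - offset * 2 * math.pi * frequency
    return math.sin(2 * math.pi * frequency * time + shifted_phase)

# Compare both versions over three seconds, in steps of 10 ms.
times = [i / 100 for i in range(300)]
max_difference = max(
    abs(delayed_sine(t, 3.0, 0.4, 0.5) - phase_shifted_sine(t, 3.0, 0.4, 0.5))
    for t in times
)
```

`max_difference` should be zero apart from floating point rounding.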
Take a wave of frequency 1.0: its value at any given point in time is sin(2*PI*1*time + phase). Now delay the wave by 0.5 seconds: its value is now sin(2*PI*1*time + phase - 0.5*2*PI*1), which is equal to sin(2*PI*time + phase - PI), or a phase shift of -PI radians.
Now take a wave of frequency 2.0 and delay it by the same amount of time, 0.5 seconds: its value is now sin(2*PI*2*time + phase - 0.5*2*PI*2), which is equal to sin(2*PI*2*time + phase - PI*2), or a phase shift of -PI*2 radians.
The thing to note here is that adding a sine wave to a phase-shifted copy of itself results in fully constructive interference if the two waves are a multiple of 2*PI radians apart (the resulting wave then goes from -2 to 2, instead of -1 to 1), and in complete cancellation if the two waves are an odd multiple of PI radians apart.
With that in mind, let’s look at a simple table of how much phase shift is introduced if a wave of a given frequency is delayed by 0.5 seconds:
| frequency (Hz) | wave after a 0.5 second delay | resulting phase shift in radians |
| --- | --- | --- |
| 0.5 | sin(2*PI*0.5*time + phase - 0.5*2*PI*0.5) | -PI*0.5 |
| 1.0 | sin(2*PI*1*time + phase - 0.5*2*PI*1) | -PI |
| 2.0 | sin(2*PI*2*time + phase - 0.5*2*PI*2) | -PI*2, equivalent to 0 (sin(x - 2*PI) == sin(x)) |
| 3.0 | sin(2*PI*3*time + phase - 0.5*2*PI*3) | -PI*3, equivalent to -PI |
| 4.0 | sin(2*PI*4*time + phase - 0.5*2*PI*4) | -PI*4, equivalent to 0 |
| 5.0 | sin(2*PI*5*time + phase - 0.5*2*PI*5) | -PI*5, equivalent to -PI |
What this table shows is that the same time delay of 0.5 seconds causes a different phase shift at every frequency, and that the effective shift repeats periodically as the frequency increases. Any sine wave with an odd integer frequency gets shifted by an odd multiple of PI, which is equivalent to a shift of -PI. Any sine wave with an even integer frequency gets shifted by a multiple of 2*PI, which is the same as no shift at all. Frequencies in between end up with an effective shift somewhere between these two extremes.
Since we know that phase shifted sine waves either cancel each other out or amplify one another, and we also know that any complex sound can be seen as composed of basic sines, it should now become clear why adding (or subtracting) a delayed version of any waveform to itself results in a comb filter:
The waveform’s constituent sine and cosine waves are phase-shifted by the delay, and the amount of shifting depends on the waves’ individual frequencies, so some frequencies in the original signal will be wiped out, while other frequencies will be amplified. To see how this is dependent on the duration of the delay, simply take a look at the above table and check what happens if you substitute 0.1 or 2.0 for the 0.5 second delay we have been working with so far.
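Trying out other delay times is easier with a small script. The Python sketch below recomputes the phase shift for any delay and also reports how loud a unit sine ends up after being added to its delayed copy. That gain uses the identity sin(x) + sin(x + shift) = 2*cos(shift/2)*sin(x + shift/2), which is not derived in this article, and all function names are hypothetical:

```python
import math

def phase_shift(frequency, delay):
    """Phase shift in radians caused by delaying a sine of `frequency` Hz
    by `delay` seconds."""
    return -2 * math.pi * frequency * delay

def additive_comb_gain(frequency, delay):
    """Peak amplitude of a unit sine added to its own delayed copy."""
    return abs(2 * math.cos(phase_shift(frequency, delay) / 2))

def comb_table(delay, frequencies):
    """One (frequency, phase shift, gain) row per frequency."""
    return [(f, phase_shift(f, delay), additive_comb_gain(f, delay))
            for f in frequencies]

rows = comb_table(0.5, [0.5, 1.0, 2.0, 3.0, 4.0, 5.0])
for frequency, shift, gain in rows:
    print(f"{frequency:4.1f} Hz  shift {shift / math.pi:+.2f}*PI  gain {gain:.3f}")
```

Calling `comb_table(0.1, ...)` instead moves the first notch from 1 Hz up to 5 Hz: the shorter the delay, the farther apart the teeth of the comb.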
Flangers and chorus effects
The spacing of the spikes in a comb filter’s frequency response depends on the duration of the delay that is used to create the filter. A flanger simply varies the duration of the delay over time, typically with a low frequency sine or triangle wave. This results in the peaks and troughs of the filter wandering across the spectrum, such as in the following illustration:
This is your basic flanger: a single delayed version of the signal, with the delay offset being manipulated by a low frequency oscillator (LFO). Your favorite audio software’s flanger implementation usually offers control over the intensity of the effect (i.e. a scalar with which the delayed copy of the signal is multiplied before being added back), the duration of the delay as unmodified by the LFO, and the frequency and depth of the LFO.
More advanced flanger implementations may also let you choose between positive and negative comb filters, they may let you feed the modified signal back into the delay line (with a parameter controlling the amount of feedback), or they may let you filter the delayed signal (applying low pass or high pass filters, for example), so as to only allow a certain frequency range to be affected by the flanger effect.
Another neat trick that is easy to implement is to perform stereo flanging, by having the phase of the LFO used to flange the left channel be opposite (i.e. PI radians apart) from the right channel’s.
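Put together, a basic flanger might look something like the following Python sketch. This is not the AS3 implementation from this series: the parameter names and default values are invented for the example, and reading the delay line with linear interpolation is just one simple way of handling delay times that fall between two samples.

```python
import math

def read_interpolated(samples, position):
    """Linearly interpolate between the two samples around a fractional
    position. Positions outside the recording read as silence."""
    index = math.floor(position)
    fraction = position - index
    first = samples[index] if 0 <= index < len(samples) else 0.0
    second = samples[index + 1] if 0 <= index + 1 < len(samples) else 0.0
    return first + (second - first) * fraction

def flanger(samples, sample_rate, base_delay=0.003, depth=0.002,
            lfo_frequency=0.5, lfo_phase=0.0, intensity=1.0):
    """Add a delayed copy whose delay time (in seconds) is swept by a sine
    LFO. `depth` should not exceed `base_delay`, so the delay stays positive.
    A negative `intensity` gives the subtractive comb filter."""
    output = []
    for i, dry in enumerate(samples):
        lfo = math.sin(2 * math.pi * lfo_frequency * i / sample_rate + lfo_phase)
        delay_in_samples = (base_delay + depth * lfo) * sample_rate
        delayed = read_interpolated(samples, i - delay_in_samples)
        output.append(dry + intensity * delayed)
    return output

SAMPLE_RATE = 8000
dry = [math.sin(2 * math.pi * 200 * i / SAMPLE_RATE) for i in range(800)]

# Stereo flanging: the right channel's LFO runs PI radians apart.
left = flanger(dry, SAMPLE_RATE, lfo_phase=0.0)
right = flanger(dry, SAMPLE_RATE, lfo_phase=math.pi)
```

Setting `depth` to zero switches the LFO off, which turns the flanger back into the static comb filter discussed earlier.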
Chorus effects are basically the same as flangers, but with a much longer delay.
Audio sample: 2 guitar samples, one dry, one processed with a chorus effect.
The idea behind a chorus effect is that having a second version (or several versions) of the same sound, which is slightly out of tune, will provide a richer timbre. Well, what happens when you take a delayed copy of a sound and have the delay length oscillate between two values? That copy speeds up and slows down periodically, which means that its pitch oscillates as in a vibrato effect! Add that version back to the original, and you get an effect not unlike two voices singing the same track in unison – the delay length just has to be long enough so that there are no noticeable flanging effects.
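To get a feeling for how much detuning an oscillating delay produces, here is a rough Python sketch. It assumes a sine LFO, so that the delay is base + depth*sin(2*PI*lfoFrequency*time). The delayed copy is read at position time - delay, and the speed at which that position moves is the playback rate. The function names and the example values (2 milliseconds of depth at 1 Hz) are made up for illustration:

```python
import math

def playback_rate(time, depth, lfo_frequency):
    """Speed at which the delayed copy plays back at `time`, relative to
    normal speed. The base delay drops out, only its rate of change counts."""
    delay_change = (depth * 2 * math.pi * lfo_frequency
                    * math.cos(2 * math.pi * lfo_frequency * time))
    return 1.0 - delay_change

def detune_in_cents(rate):
    """Pitch change for a playback rate (100 cents are one semitone)."""
    return 1200 * math.log2(rate)

DEPTH = 0.002        # the delay swings by +/- 2 ms
LFO_FREQUENCY = 1.0  # once per second

slowest = playback_rate(0.0, DEPTH, LFO_FREQUENCY)  # delay growing fastest
fastest = playback_rate(0.5, DEPTH, LFO_FREQUENCY)  # delay shrinking fastest
max_detune = detune_in_cents(fastest)
```

With these values the copy drifts by roughly 20 cents in either direction, a subtle detuning rather than an audible pitch bend. A deeper or faster LFO increases the vibrato.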
The chorus effect can be heard on a lot of vocals and guitars throughout the ages (I think the 80s were particularly fond of the effect). Just as with flangers, your typical chorus lets you set the length of the delay, the depth and frequency of the vibrato effect, and the dry/wet mix of the signal. Advanced implementations will also let you set the polarity of the effect (adding or subtracting the delayed signal) and possibly let you filter the delayed portion of the signal so as to exclude parts of the frequency spectrum. Again, as with flangers, it’s easy to implement a chorus effect in such a way that one stereo channel receives a different delay offset than the other.