In this second installment of my series of tutorials on dynamic sound in ActionScript, I’ll discuss the different parts of the sound API and show you how to extract single samples from a sound that’s in memory or coming from the microphone, as well as how to generate simple dynamic audio in real time.

As I wrote in part 1, the sound API introduced in Flash Player 10 is essentially just a set of methods and classes that let you access individual samples in a sound. Just as the introduction of the BitmapData class enabled you to manipulate and read pixels in bitmap images and from a webcam’s input, the sound API lets you process sound in memory or coming from microphone input, at the sample level.

There are two basic ingredients you need to understand in order to cook up dynamic audio – ByteArrays and SampleDataEvents:



ByteArrays are the format in which you’ll be dealing with the raw data. Whenever you read the samples contained in an audio clip, write samples to the sound output or receive new samples from the mic, you’ll be working with a ByteArray instance.

If you’ve never had to deal with ByteArrays in AS3 before, don’t fret – in practice you’ll usually just read and write Numbers from and to them, or convert them to Vector.<Number>s to do any advanced work.

A ByteArray is basically an array of values of larger types (such as Numbers or even serialized Objects), sliced up into bytes and packed together into one big stream. ByteArrays store an index into the data (called the position), which is automatically incremented whenever you read from or write into the stream. So, for instance, if you have a ByteArray and you call its readFloat() method, the next 4 bytes in the stream are interpreted as a 32-bit float and returned as a Number, and the position is then incremented by 4. The next time you call readFloat(), the next floating point value will be returned.

A ByteArray in Flash is essentially a stream in which arbitrary data can be stored together. Use methods like readFloat(), readDouble(), etc. to extract contents at the current position.

ByteArrays allow you to work very close to the metal (by AS3’s standards), and they provide a bit of additional useful functionality, such as data compression. For our purposes, though, all you should need to know is that you can write the next sample into a ByteArray by using writeFloat(), and while the position of a ByteArray is smaller than its length (which is also given in bytes), you can always call readFloat() to extract the next sample.
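
To make the position mechanics concrete, here is a minimal sketch that writes one stereo sample into a ByteArray and reads it back. The variable name bytes and the sample values are just illustrative choices of mine, not part of the sound API:

```actionscript
import flash.utils.ByteArray;

// Illustrative only: "bytes" is a scratch ByteArray, not tied to any Sound.
var bytes:ByteArray = new ByteArray();

bytes.writeFloat(0.5);    // left channel, occupies 4 bytes
bytes.writeFloat(-0.25);  // right channel, occupies 4 bytes
trace(bytes.position);    // 8: the position moved past both floats
trace(bytes.length);      // 8: length is measured in bytes

bytes.position = 0;       // rewind before reading
while (bytes.bytesAvailable >= 8) // one stereo sample = 2 floats = 8 bytes
{
    var left:Number = bytes.readFloat();
    var right:Number = bytes.readFloat();
    trace(left, right);   // 0.5 -0.25
}
```

Forgetting to reset the position before reading is the classic mistake here: right after writing, the position sits at the end of the stream and there is nothing left to read.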

Let’s look at a quick example to show you how this works in practice:

Suppose you have an instance of class Sound, called sound, and suppose you also have an instance of class ByteArray, called data (which you can simply create with data = new ByteArray(); ):

sound.extract(data, sound.length/1000*44100); // writes the sound’s samples into the ByteArray

The first parameter of Sound.extract() is the ByteArray into which you extract the sound’s sample data. If the ByteArray already has data, the extract() method starts overwriting this data at the ByteArray’s current position and appends where needed. The second parameter of Sound.extract() specifies the number of samples to extract. A sample is two 32-bit floating point Numbers, one for each stereo channel. With the exception of mic input, when you’re dealing with dynamic audio in Flash, you’re always dealing with 44.1 kHz stereo audio, given in normalized (i.e. full amplitude == 1.0) 32-bit floating point. Since sound.length gives you the length of a sound in milliseconds, sound.length/1000*44100 gives you the number of samples there are in the complete Sound instance.

So, suppose you’ve extracted a sound’s sample data in this manner. How do you access it?

data.position = 0; // reset the ByteArray's index pointer
while (data.bytesAvailable > 0) // while there's more data at the ByteArray's current position:
{
  var leftSample:Number = data.readFloat();
  var rightSample:Number = data.readFloat();
  trace("Sample Data: "+leftSample+" (left), "+rightSample+" (right).");
}

This will continue reading from the ByteArray as long as more data is available. Each iteration of the loop, two 32-bit values are read from the ByteArray, one for each stereo channel. In this case, we simply trace them out (note: as this has 44.1K trace calls per second of audio, it will probably cause trouble for sound files that aren’t super-short!), and again note how they stay between -1 and 1, as 1 is the maximum amplitude of a sample!

If we were to do a lot of processing on the sound samples, especially if the operations we performed took multiple input samples into account for any given output sample, we would probably push these into two Vector.<Number> instances, like so:

data.position = 0; // reset the ByteArray's index
var leftChannel:Vector.<Number> = new Vector.<Number>();
var rightChannel:Vector.<Number> = new Vector.<Number>();
while (data.bytesAvailable > 0) // while there's still data in the ByteArray:
{
  leftChannel.push(data.readFloat());
  rightChannel.push(data.readFloat());
}


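As an example of an operation that looks at several input samples for each output sample, here is a rough sketch of a 3-point moving average, which acts as a very gentle low-pass filter. The function name smooth and the demo vector are hypothetical; in practice you would pass in a channel vector such as leftChannel:

```actionscript
// Hypothetical helper: averages each sample with its two neighbours.
// At the edges, the missing neighbour is replaced by the sample itself.
function smooth(input:Vector.<Number>):Vector.<Number>
{
    var count:int = input.length;
    var output:Vector.<Number> = new Vector.<Number>(count);
    for (var i:int = 0; i < count; i++)
    {
        var previous:Number = (i > 0) ? input[i - 1] : input[i];
        var next:Number = (i < count - 1) ? input[i + 1] : input[i];
        output[i] = (previous + input[i] + next) / 3;
    }
    return output;
}

// Demo data standing in for a real channel vector.
var demo:Vector.<Number> = Vector.<Number>([0, 1, 0, -1, 0]);
trace(smooth(demo));
```

Random access like input[i - 1] is exactly what makes a Vector more convenient than a ByteArray for this kind of work.
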

A pre-loaded Sound, such as one instantiated from the library, is already completely in memory, which is why we can extract its complete data at once. With sound that’s coming from the microphone, new samples come in all the time, so we’ll need to process them as they arrive. This (and the creation of dynamic audio output) is where the SampleDataEvent class comes in.

Suppose you want to capture incoming sound data from the microphone, and append it to the leftChannel and rightChannel vectors we instantiated in the last section. Note that these are Vector.<Number> objects, not ByteArrays:

var mic:Microphone = Microphone.getMicrophone();

mic.rate = 44; // sets the mic's sampling rate to 44.1 kHz (note: not 44.0!)
// [...] set various attributes for the microphone, such as echo suppression, etc.
mic.addEventListener(SampleDataEvent.SAMPLE_DATA, micSampleDataHandler);

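function micSampleDataHandler(event:SampleDataEvent):void
{
  // event.data is a ByteArray containing sample data!
  while (event.data.bytesAvailable > 0)
  {
    var sample:Number = event.data.readFloat();
    // microphone input is mono!
    leftChannel.push(sample);
    rightChannel.push(sample);
  }
}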

What’s happening here is that a Microphone instance dispatches a SampleDataEvent whenever new data comes in from the mic input. This new data arrives as a ByteArray (event.data) of 32-bit floating point samples, here at 44.1 kHz. You can choose one of a few different sampling rates for the mic, but if you want to use the microphone input in dynamic playback, it’s prudent to use the same rate as the sound output. Whenever a SampleDataEvent arrives, we read the data as long as there’s more, and append it to the two Vector.<Number> instances we created earlier. Note that mic input always comes in mono format, so we append the same sample to both channels.


Sound output!

If we’re analyzing a Sound in memory, we simply extract its sample data into a ByteArray. If we analyze sound coming from a microphone, we listen to the SampleDataEvent. With each new SampleDataEvent, new sample data from the microphone is coming in, and we can take this data and append it to a Vector, which will then contain a recording of all microphone input since its instantiation.

On the opposite end, dynamic sound output uses SampleDataEvents as well! The way this works is that you instantiate an empty Sound object, add a SAMPLE_DATA event listener and then call play() on the instantiated sound. The Sound will dispatch the SAMPLE_DATA event whenever it needs more samples to continue playing, at which point your event handler should supply anywhere between 2048 and 8192 new samples (remember that, strictly speaking, a single sample means two floating point Numbers, one for each stereo channel!). If you supply fewer than 2048 samples, the sound assumes that it is supposed to finish: it plays the last samples you added and then stops.

The number of samples your SAMPLE_DATA event listener generates every time it is called is called the buffer size. A small buffer minimizes the latency, at the cost of increased risk of buffer underflows (the audio stuttering that happens if the application can’t generate enough samples to keep up). As I pointed out in the last article, your latency is unavoidably going to be horrible on some platforms anyway, so in most cases, there’s not much point in sticking to the lowest possible buffer size at all cost. Note that the buffer size doesn’t need to be a power of two but can be any number in range 2048…8192.
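
To put numbers on that trade-off, this small sketch computes how much audio a single buffer holds at 44.1 kHz. The helper name bufferMilliseconds is made up for illustration, and the result is only the duration of one buffer, not the total latency of any particular platform:

```actionscript
const SAMPLE_RATE:Number = 44100; // output rate of dynamic sound in Flash

// Hypothetical helper: duration of one buffer in milliseconds.
function bufferMilliseconds(bufferSize:int):Number
{
    return bufferSize / SAMPLE_RATE * 1000;
}

trace(bufferMilliseconds(2048).toFixed(1)); // 46.4
trace(bufferMilliseconds(8192).toFixed(1)); // 185.8
```

So even the smallest allowed buffer adds roughly 46 ms before a change in your handler becomes audible, and the largest adds close to a fifth of a second.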

With all that out of the way, let’s take a look at some sample code! The following piece of code simply generates stereo noise (at full volume – turn down your headphones!):

var sound:Sound = new Sound();
sound.addEventListener(SampleDataEvent.SAMPLE_DATA, soundSampleDataHandler);
sound.play();
function soundSampleDataHandler(event:SampleDataEvent):void
{
  // We can add an arbitrary amount of new sample data to event.data, anywhere between 2048 and 8192.
  // Anything less than 2048, though, and the sound will play until the end and then stop!
  for (var i:int = 0; i<2048; i++)
  {
    event.data.writeFloat(Math.random()*2-1); // random float -1...+1 for left channel
    event.data.writeFloat(Math.random()*2-1); // random float -1...+1 for right channel
  }
}

After all we’ve discussed, this should be fairly straightforward: Whenever the sound needs more samples, soundSampleDataHandler() is called. This event handler then proceeds to create 2048 samples, each of which contains two random numbers (one per channel), between -1 and +1. These are written to event.data, the ByteArray carried by the SampleDataEvent that soundSampleDataHandler() receives.
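
The same handler pattern can also play back samples that are already stored in vectors, and it shows the "fewer than 2048 samples" rule in action. This is only a sketch under my own assumptions: recordedLeft, recordedRight, playhead and BUFFER_SIZE are illustrative names, and the vectors are filled with a short test tone as a stand-in for real recorded data:

```actionscript
import flash.events.SampleDataEvent;
import flash.media.Sound;

const BUFFER_SIZE:int = 4096; // any value from 2048 to 8192 works

// Stand-in for recorded audio: half a second of a quiet 440 Hz tone.
var recordedLeft:Vector.<Number> = new Vector.<Number>();
var recordedRight:Vector.<Number> = new Vector.<Number>();
for (var n:int = 0; n < 22050; n++)
{
    var value:Number = 0.25 * Math.sin(n * 440 / 44100 * Math.PI * 2);
    recordedLeft.push(value);
    recordedRight.push(value);
}

var playhead:int = 0; // index of the next sample to play

var playback:Sound = new Sound();
playback.addEventListener(SampleDataEvent.SAMPLE_DATA, playbackHandler);
playback.play();

function playbackHandler(event:SampleDataEvent):void
{
    // Write up to BUFFER_SIZE samples, or whatever is left in the vectors.
    var end:int = Math.min(playhead + BUFFER_SIZE, recordedLeft.length);
    while (playhead < end)
    {
        event.data.writeFloat(recordedLeft[playhead]);
        event.data.writeFloat(recordedRight[playhead]);
        playhead++;
    }
    // Once fewer than 2048 samples are written, the Sound plays what is left and stops.
}
```

With 22050 stored samples, the handler delivers five full buffers of 4096 and then a final 1570, which is below the 2048 minimum, so playback ends on its own.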

Suppose we didn’t want to produce random noise; instead, let’s look at what we’d need to do in order to produce a single sine tone at the standard concert pitch of 440 Hz (the A above middle C, A4 in scientific pitch notation):

var sound:Sound = new Sound();
sound.addEventListener(SampleDataEvent.SAMPLE_DATA, soundSampleDataHandler);

var currentPhase:Number = 0;
var deltaPhase:Number = 440/44100; // cycles per sample

function soundSampleDataHandler(event:SampleDataEvent):void
{
  for (var i:int = 0; i<2048; i++)
  {
    currentPhase += deltaPhase;
    // note: this is unoptimized – normally you'd multiply deltaPhase by Math.PI*2 and remove that part here!
    var currentValue:Number = Math.sin(currentPhase*Math.PI*2);
    event.data.writeFloat(currentValue); // left channel
    event.data.writeFloat(currentValue); // right channel
  }
}

sound.play(); // start playback once the phase variables are initialized

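A sine wave has a period of 2*PI, and a 440 Hz tone means that the resulting wave needs to repeat 440 times each second. At 44100 samples per second, the wave therefore advances by 440/44100 of a cycle per sample, which is the deltaPhase used in the code above; multiplying the accumulated phase by 2*PI inside Math.sin() converts cycles to radians. Equivalently, 440*2*PI divided by the sample rate gives the phase increase per sample in radians.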

This is pretty much the basis for the common oscillators you’ll encounter in a software synth, except that it’s usually simpler to have your phase go from 0 to 1 (instead of to 2*PI): For a sine wave, you’ll take the sine of the phase * 2*PI. For a square wave, you’d simply check if the phase%1 is above or below 0.5 and output -1 or +1 accordingly, which keeps the wave centered on zero. From there on out, you should have no trouble creating triangle or sawtooth waves!
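
Here is one way those oscillators could look as plain functions of a 0-to-1 phase. The function names are my own, and the sketch assumes the phase is never negative (the % operator would return negative values otherwise). I use -1 rather than 0 for the low half of the square wave so that all four shapes swing symmetrically around zero:

```actionscript
// Each function maps a phase measured in cycles to a value in -1...+1.
// Only the fractional part of the phase matters.

function sineWave(phase:Number):Number
{
    return Math.sin(phase * Math.PI * 2);
}

function squareWave(phase:Number):Number
{
    return (phase % 1 < 0.5) ? 1 : -1;
}

function sawtoothWave(phase:Number):Number
{
    return 2 * (phase % 1) - 1; // ramps from -1 up to +1, then jumps back
}

function triangleWave(phase:Number):Number
{
    var p:Number = phase % 1;
    return 1 - 4 * Math.abs(p - 0.5); // -1 at p=0, +1 at p=0.5, back to -1
}

trace(squareWave(0.25), squareWave(0.75));     // 1 -1
trace(sawtoothWave(0), sawtoothWave(0.5));     // -1 0
trace(triangleWave(0), triangleWave(0.5));     // -1 1
```

To try one out, replace the Math.sin() call in the sine example with, say, squareWave(currentPhase) and keep everything else as it is.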

In this tutorial, we’ve discussed all major parts of the AS3 dynamic sound API. In the next installment we’ll put it all together and look at how to manipulate microphone input and turn your voice into a horrible robot! Sounds like fun? Stay tuned!


7 Responses to Realtime audio processing in Flash, part 2: The basics

  1. [...] processing in Flash in this multi-part tutorial series. In part two, he covers the basics including understanding ByteArray and the SampleDataEvent. While in part three, he creates a sample application that will convert your voice into a robot [...]

  2. Awesome tutorial so far!! Thanks for the pain.

  3. sonicoliver says:

    fantastic… this is the tutorial I’ve been searching for for the past week!!!

  4. Rob says:

    When you run this code to produce a simple sinusoidal wave, you hear some high pitch noise in addition to the defined frequency (440). This din’t happened before in earlier compilations of swf’s. See the function generator on my website. Its nice and clean. But if I recompile that same program using a new version of flash, you hear the noise even at 1hz. Anyone else having this problem?

  5. Amitesh says:

    Best tutorial about Sound, ByteArray and Sample_Data event

    Thanks for detailed explanation.

  6. diego says:

    very clear. thanks for share

  7. BUCUR RADU says:

    Thx. Best tutorial from the internet about sound in flash
