In this series of advanced ActionScript tutorials, I’ll give some practical examples of how to work with the sound API introduced in Flash Player 10 to process audio in real time (filtering, adding effects, etc.) or synthesize sound from scratch. My goal is to evolve this into a series of articles that starts at the very beginning but goes a good deal farther than other AS3 audio tutorials, which tend to explain the API and then stop there, as if the rest were easy to figure out on your own.
The current roadmap is as follows (I’ll edit this if plans change):
Part 1 gives an overview of what the API is and what it can and cannot do.
Part 2 will explore the different parts that make up the API.
In part 3, we’ll put all of that to use and create a simple effect that takes input from a microphone and turns it into a robot voice.
In part 4, we’ll review a bit of audio theory and explain how flangers, comb filters and chorus effects work.
In part 5, we’ll discuss interpolation by looking at pitch shifting implementations.
In part 6, we’ll start working on a sound manager that will allow you to mix and seamlessly string together pieces of a song. In a Flash game, this would give you the ability to create much less repetitive music while conserving file size.
In the installments following that, we’ll extend the sound manager with a flexible effect architecture. To think of what you’ll be able to do with this, picture a Flash game in which entering a cavern would add a low pass filter and reverb to the game music. Finally, at the end of the series, I’ll give you some pointers on how to write your own software synthesizer in Flash.
In between, I’ll probably also write some posts to explain some basic audio concepts or show you how to write specific effects, so if that’s of interest to you, it might be worth keeping an eye out for updates even if you’re not particularly interested in AS3 development. Once we get the basics out of the way, the concepts are pretty independent of language or API.
The sound API
Back in the olden days, all that we could do to make some noise in ActionScript was to instantiate Sound objects from an SWF’s library, or load or stream them from the web. There were a few things one could tweak about a Sound instance, such as setting its volume and pitching or panning it, but the overall scope of what you could do with audio in Flash was very limited.
A few very smart folks came up with hacks that allowed dynamic audio processing by manipulating SWFs containing sound objects in memory (I think that’s how it worked – I didn’t pay as much attention at the time, so I confess I’m fuzzy on the details). These workarounds didn’t work consistently across player versions and platforms though, prompting the inception of the “Adobe, MAKE SOME NOISE” campaign. Adobe listened and added dynamic sound in Flash Player 10.
The introduction of the new sound API in ActionScript was a milestone similar to when Adobe first added the ability to directly manipulate pixels in a BitmapData: for most people it made little difference, but for those who like or need to work close to the metal, it opened up a world of possibilities to explore.
Granted, just as people still position Sprites and Bitmaps on a stage, you most likely won’t be abandoning Sound.play() any time soon. For most use cases the conventional method of instantiating and playing Sound objects will be sufficient, and it comes with less latency and less overhead, both in CPU intensity and development time.
What it can do
Just as the introduction of BitmapData objects enabled us to access an array of pixels that make up a bitmap image, the sound API enables us to access an array of samples. These samples are either part of a pre-loaded sound, part of a sound that is playing right now (so we can manipulate and change the sound output), or part of the sound input coming from a microphone.
That’s it. You basically get the equivalent of reading pixels in a pre-loaded bitmap, writing pixels to a bitmap in memory and reading pixels from a webcam. You’ll have to code any fancy processing yourself – and that includes even the basics that have already been (and continue to be) possible with Sound and SoundChannel objects, such as panning or changing the volume of the sound samples.
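To make the point about volume and panning concrete, here is a minimal sketch of doing both by hand on a block of raw samples. It is written in TypeScript rather than ActionScript so that it can run outside Flash Player. The interleaved left/right layout, the `applyGainAndPan` name and the simple linear balance law are assumptions made for illustration, not part of the sound API.

```typescript
// Hypothetical helper: scales an interleaved stereo block [L0, R0, L1, R1, ...].
// gain: linear volume factor (1 = unchanged).
// pan:  -1 (hard left) .. 0 (centre) .. +1 (hard right), simple linear balance.
function applyGainAndPan(samples: Float32Array, gain: number, pan: number): Float32Array {
  const leftScale = gain * Math.min(1, 1 - pan);
  const rightScale = gain * Math.min(1, 1 + pan);
  const out = new Float32Array(samples.length);
  for (let i = 0; i + 1 < samples.length; i += 2) {
    out[i] = samples[i] * leftScale;          // left channel
    out[i + 1] = samples[i + 1] * rightScale; // right channel
  }
  return out;
}

// Usage: halve the volume and push the signal fully to the right.
const block = new Float32Array([1, 1, 0.5, 0.5]);
const panned = applyGainAndPan(block, 0.5, 1);
```

Every output value is touched once per call, which is exactly the kind of per-sample work that Sound and SoundChannel otherwise do for you.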
The sound API lets you process and synthesize sound in pretty much any manner you can imagine – you’ll just have to write the necessary ActionScript code yourself.
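As a small taste of synthesis from scratch, the sketch below fills a block with a sine wave. It is TypeScript rather than ActionScript, and `renderSine`, its parameters and the interleaved stereo layout are hypothetical names and assumptions chosen for illustration; in Flash, a loop like this would sit inside the callback that supplies samples to the player.

```typescript
const SINE_SAMPLE_RATE = 44100; // frames per second

// Hypothetical helper: renders frameCount stereo frames of a sine wave.
// startFrame is the running frame offset, so consecutive blocks continue
// the same waveform instead of restarting at phase zero.
function renderSine(frequency: number, frameCount: number, startFrame: number = 0): Float32Array {
  const out = new Float32Array(frameCount * 2);
  for (let i = 0; i < frameCount; i++) {
    const t = (startFrame + i) / SINE_SAMPLE_RATE; // time in seconds
    const value = Math.sin(2 * Math.PI * frequency * t);
    out[2 * i] = value;     // left
    out[2 * i + 1] = value; // right
  }
  return out;
}

// Two consecutive 2048-frame blocks of a 440 Hz tone.
const firstBlock = renderSine(440, 2048);
const secondBlock = renderSine(440, 2048, 2048);
```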
For an amazing example of what can be done, check out the Hobnox Audiotool!
What it can’t do
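What the sound API is not suited for is a software instrument that is played live, for example from a keyboard. There are three reasons why: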
No MIDI – ActionScript still doesn’t support MIDI input. To my understanding, Hobnox Audiotool gets around this by actually using a Java Applet for MIDI support and relaying the MIDI input to the Flash application via a local connection. It’s a wonderfully resourceful solution, but it relies on Java applet support (which is not always present and often broken) and apparently needs the user to redo the MIDI mapping every time they open an arrangement.
Audio processing is CPU intensive – dynamic audio in Flash is always a 44100 Hz stereo stream of 32-bit floating point data (in ActionScript you work with the samples as values of type Number). In the best case, you’ll be calculating 88200 numbers per second in ActionScript (44100 for each of the two channels), and ActionScript is horribly slow. For any non-trivial application this means you’ll be doing some serious optimization of your code, and even then, there’s a limit to what you’ll be able to achieve.
Latency – the single biggest issue with the sound API in Flash, however, is latency. Latency is the delay between triggering an audio event and the event actually emanating from your speakers. For any live input (e.g. playing notes on a keyboard), you’ll want to keep latency as low as possible, probably somewhere below 10 milliseconds and definitely below 30 (for comparison, in a 160 BPM song, a 16th note is ~94ms long).
The most obvious contribution to latency in digital audio usually comes from the sound buffer size. The sound API lets you set this anywhere between 2048 and 8192 samples, so right off the bat your minimum latency will be 2048 samples at 44100 samples per second, amounting to 46ms. That, however, is just the start: according to Adobe’s Tinic Uro, Flash Player has an internal buffer, which adds 200 to 500 milliseconds of latency (note that the same article gave the minimum buffer size as 512, but that information is outdated). That’s a delay of a fifth to half a second between the moment you hit a note on your keyboard and the moment you hear that note on your speaker. To be honest, I’m not even sure if that covers the actual latency of the sound card driver – which is traditionally horrible on Windows systems – or if that is added on top of that. I do however know that the sound API’s latency can be very different across platforms. While it seems to be a lot less noticeable on Mac, we’ve had huge delays on some Windows XP machines.
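The figures quoted in the reasons above can be reproduced with a few lines of arithmetic. The sketch below is TypeScript, and all names in it are hypothetical. It also assumes that the player’s internal buffer simply adds to the delay of the smallest sample buffer, which is only a rough model.

```typescript
const FLASH_SAMPLE_RATE = 44100; // frames per second
const CHANNELS = 2;              // stereo

// Values that have to be computed for every second of audio.
const valuesPerSecond = FLASH_SAMPLE_RATE * CHANNELS; // 88200

// Delay contributed by a buffer of the given size, in milliseconds.
function bufferLatencyMs(frames: number): number {
  return (frames / FLASH_SAMPLE_RATE) * 1000;
}

// Length of a 16th note in milliseconds (four 16ths per beat).
function sixteenthNoteMs(bpm: number): number {
  return 60000 / bpm / 4;
}

const minBufferMs = bufferLatencyMs(2048); // about 46.4 ms
const maxBufferMs = bufferLatencyMs(8192); // about 185.8 ms
const sixteenthMs = sixteenthNoteMs(160);  // 93.75 ms

// Assumption: the internal buffer (200 to 500 ms) adds to the smallest buffer.
const totalLowMs = minBufferMs + 200;
const totalHighMs = minBufferMs + 500;

// The same delay expressed in 16th notes at 160 BPM.
const lowInSixteenths = totalLowMs / sixteenthMs;   // about 2.6
const highInSixteenths = totalHighMs / sixteenthMs; // about 5.8
```

Under that assumption, even the low end of the range puts a note more than two 16ths behind the beat at 160 BPM.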
Either way, writing a software instrument in Flash that reacts to a skilled player’s live input is out of the question (unless the latency issues are fixed in future Flash players).
Writing something like ReBirth, on the other hand, is entirely possible. If your synthesizer is controlled by turning notes on and off in a sampler-type user interface, latency problems become negligible. And while you probably wouldn’t want a Flash game’s sound effects to go through a significant delay, you could still have its music be affected by in-game events in interesting ways without a 200ms lag becoming too much of a problem.
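One way to see why: a pattern-based synth knows its notes ahead of time, so it can place them at exact sample positions within the stream, and the output delay then shifts the whole pattern by the same amount. The TypeScript sketch below illustrates the idea; `stepStartFrames` and the 16-step grid are hypothetical and not taken from any existing tool.

```typescript
const GRID_SAMPLE_RATE = 44100; // frames per second

// Hypothetical helper: for a pattern of on/off steps on a 16th-note grid,
// returns the frame position at which each active step should start.
function stepStartFrames(pattern: boolean[], bpm: number): number[] {
  const framesPerStep = (GRID_SAMPLE_RATE * 60) / bpm / 4;
  const starts: number[] = [];
  pattern.forEach((on, step) => {
    if (on) starts.push(Math.round(step * framesPerStep));
  });
  return starts;
}

// A note on every beat of one bar at 160 BPM.
const kick = [
  true, false, false, false, true, false, false, false,
  true, false, false, false, true, false, false, false,
];
const kickStarts = stepStartFrames(kick, 160);
```

The spacing between the resulting frame positions depends only on the tempo, not on how long the rendered audio takes to reach the speakers.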