Here is an article that was published some time in 2004 for a Demo Scene ‘Zine online, and possibly even in print. I can’t remember exactly what it was, and can’t find it online now. If anybody knows, or knows a Lonnie Taylor (editor for the publication that I submitted this to), please let me know, as I’d love to read the whole issue again. I believe it also had some other contributions from fellow Northern Dragons members.
After the release of the 4K demo Etherium that Northern Dragons entered into Assembly 2003, I got a few emails from people asking me for some details about how we accomplished the audio. The intention of this article is to provide a bit of insight into what we did, and hopefully answer some of those questions. You can check out the demo on Pouet at http://www.pouet.net/prod.php?which=10569, and visit Northern Dragons at http://www.northerndragons.ca
Our overall game plan, to leverage the skills of the people involved, was to create a complete sound engine module in C++, and have it converted to NASM and optimized afterwards. Since it seems that a lot of the DirectSound examples available online are in C++, it was a lot easier to figure out what we were doing that way. That, and I just felt a lot more comfortable doing a bunch of ugly trig in C++.
The playback engine is just a bunch of DirectSound initialization stuff that you get to by including the DirectSound headers, linking against the libraries, and defining some data structures. As a musician/math guy, it was nice to have a framework for all the initialization details created for me in C++, and then let the coders take what I had and put it into assembly. Taking that idea one step further, you can also use a framework like that to let different people work on the math and the sound palette, while someone else writes the actual music. It’s all about division of labour, and making it a solid team effort.
The main thing I was working with was a huge ‘buffer’ in memory that you can think of as a wave. I don’t remember the exact numbers, but the track was about 2 minutes long, 16-bit (2 bytes) mono, at 22050 samples/second, so it’s roughly (2 bytes/sample)*(22050 samples/sec)*(120 seconds) = 5292000 bytes = ~5 MB. That’s a contiguous chunk of memory you can access like an array, and it’s all zeros to start with. When you’re done building into it, you just pass a pointer to it to the DirectSound engine, and say ‘here, play this’. In our case, we said ‘play this with stereo reverb’ (specifically, DirectX Audio’s DSFXWavesReverb), so it sounds a little less dry. You can really notice it on the kick sound.
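To make the numbers above concrete, here’s a sketch of the buffer setup (the DirectSound handoff itself is omitted; the names are mine, not the demo’s actual code):

```cpp
#include <cstdint>
#include <vector>

// Mono, 16-bit, 22050 samples/sec, 120 seconds: the whole track as one
// contiguous array. std::vector value-initializes, so it starts as silence.
const int kSampleRate   = 22050;
const int kTrackSeconds = 120;

std::vector<int16_t> MakeTrackBuffer() {
    return std::vector<int16_t>(kSampleRate * kTrackSeconds);
}
```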
So then, you need your sounds. I built this whole thing with functions…calling functions…calling functions. The lowest level of function was just an equation for each sound. You pass it some stuff like a frequency, an amplitude, and a start position (your initial array index to write to in the buffer). The equations are pretty basic trig functions, and I won’t really get into the details. They’re mostly one-liners (but ugly mothers) in a for loop. Believe it or not, the lead sound is almost a pure sine wave. The amplitude envelope is as simple as a quarter sine wave itself. Draw the quarter of a sine wave that goes from 1 down to 0, multiply the amplitude by that, and you get a really cheap non-linear roll-off. I also figured out how to do an inverse exponential roll-off just by taking the amplitude and multiplying it by 0.999999 over and over, which I used on the whooshing sound.
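A lowest-level sound function along those lines can look something like this — a sine wave with the quarter-sine roll-off, mixed into the buffer at a start position (names and constants here are illustrative, not the original code):

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

const float kPi   = 3.14159265f;
const int   kRate = 22050;   // samples/sec

// Write a sine note into `buf` starting at `start`. The envelope is a
// quarter sine (cosine from 1 down to 0 over the note's length), which
// gives a cheap non-linear roll-off.
void SineNote(std::vector<int16_t>& buf, int start, float freq,
              float amp, int length) {
    for (int i = 0; i < length && start + i < (int)buf.size(); ++i) {
        float env   = std::cos(kPi / 2.0f * i / length);           // 1 -> 0
        float value = amp * env * std::sin(2.0f * kPi * freq * i / kRate);
        buf[start + i] += (int16_t)value;   // mix (add) into the track
    }
}
```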
One of the things I strongly urge people to do is use an initial volume ramp on their sounds. It’s the biggest thing that I hear other people missing in this sort of work, and it makes a huge difference. Those initial clicks you hear on every note in some productions…well, they can drive your audience crazy, and can’t be great for speakers either. Here’s a code snippet to show what I mean, and how easy this is:
if (count < 50)
    value *= (amp / 50.0f * count);            /* linear ramp-in over the first 50 samples */
else
    value *= (amp * envelope_function(count)); /* then the normal envelope takes over */
If you think about it, (50 samples)/(22050 samples/sec) works out to around 2 milliseconds. Enough to eliminate a click, but not enough to really damage your attack transients, unless you’re being really picky.
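The same idea can also be applied in place, after a sound has been rendered — a minimal sketch (my helper, not the demo’s code):

```cpp
#include <cstdint>

// Linear fade-in over the first 50 samples (~2 ms at 22050 Hz),
// applied in place to kill the click at the start of a note.
void ApplyRamp(int16_t* buf, int length) {
    const int kRampSamples = 50;
    for (int i = 0; i < kRampSamples && i < length; ++i)
        buf[i] = (int16_t)(buf[i] * (float)i / kRampSamples);
}
```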
The kick drum was the cheapest thing ever. It’s a sine wave that drops from (I think) something like 60Hz down to 40Hz, with that quarter sine roll-off I was talking about. Dirt simple.
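From memory, that kick comes out to roughly this (the exact numbers are guesses; note that the phase is accumulated so the pitch sweep itself doesn’t click):

```cpp
#include <cmath>
#include <cstdint>

const float kTwoPi      = 6.2831853f;
const int   kSamplesSec = 22050;

// Kick: a sine whose pitch falls from 60 Hz to 40 Hz over the note,
// with a quarter-sine amplitude roll-off (1 down to 0).
void Kick(int16_t* buf, int length, float amp) {
    float phase = 0.0f;
    for (int i = 0; i < length; ++i) {
        float t    = (float)i / length;              // 0 -> 1 over the note
        float freq = 60.0f - 20.0f * t;              // 60 Hz -> 40 Hz
        float env  = std::cos(kTwoPi / 4.0f * t);    // quarter sine: 1 -> 0
        buf[i] += (int16_t)(amp * env * std::sin(phase));
        phase  += kTwoPi * freq / kSamplesSec;       // accumulate phase
    }
}
```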
The whooshes were generated with the random number generator. I just kept increasing the rate at which it chose new values for my samples, so the pitch of the noise goes up. The flanger took care of smoothing out the ‘burbling’ you would otherwise hear at the start. A real bonus here was the fact that we already had a random number generator for other areas of the production, so it turned out to have multiple uses, increasing our bang-per-byte.
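In other words, it’s sample-and-hold noise where the hold period shrinks over time. A sketch of the idea (rand() stands in for the demo’s shared generator, and the 64-sample starting period is my guess):

```cpp
#include <cstdint>
#include <cstdlib>

// Whoosh: hold each random value for a number of samples, and shrink
// that hold period over the note so the noise's pitch rises. The
// amplitude decays by the repeated multiply-by-0.999999 trick.
void Whoosh(int16_t* buf, int length, float amp) {
    float value = 0.0f;
    int   next  = 0;   // sample index at which to pick a new value
    for (int i = 0; i < length; ++i) {
        if (i >= next) {
            value = rand() / (float)RAND_MAX * 2.0f - 1.0f;  // -1 .. +1
            int hold = 64 - (63 * i) / length;   // 64 samples down to 1
            next = i + hold;
        }
        buf[i] += (int16_t)(amp * value);
        amp *= 0.999999f;   // the inverse-exponential roll-off
    }
}
```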
The chords were a little trickier. I don’t even remember what the final version ended up being, but I think it was a sawtooth, with the points of the teeth flattened into something square-like, so it didn’t sound quite so harsh. The math for that, while not involving trig, was one of the hardest parts for me to get right.
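One cheap way to flatten the teeth of a sawtooth is to clip it and renormalize — just a guess at what the final version did, but it shows the idea:

```cpp
#include <cstdint>

// Sawtooth with its tips clipped toward a square-ish wave, so it
// sounds less harsh than a raw saw. `clip` in (0, 1] sets how much
// of each tooth survives; smaller = more square-like.
void ClippedSaw(int16_t* buf, int length, float freq, float amp, float clip) {
    const int kSr = 22050;
    float phase = 0.0f;                    // ramps 0 -> 1 each cycle
    for (int i = 0; i < length; ++i) {
        float saw = 2.0f * phase - 1.0f;   // -1 .. +1 sawtooth
        if (saw >  clip) saw =  clip;      // flatten the tips...
        if (saw < -clip) saw = -clip;      // ...on both sides
        buf[i] += (int16_t)(amp * saw / clip);   // renormalize to full amp
        phase += freq / kSr;
        if (phase >= 1.0f) phase -= 1.0f;
    }
}
```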
A few words about effects. If you have one sound you want to put a static delay on, like our lead sound, it’s simple to incorporate the delay right into the sound, just by writing to 3 or 4 further offset locations in the sound buffer, with a scaled down amplitude for each. It makes the code pretty tight, and then you don’t need separate delay functions, or have to worry about using a DirectAudio delay effect that turns your whole mix into a jumbled mess.
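Baking the delay in really is just a few extra writes. A sketch of the technique (tap count, spacing, and names are mine; it assumes the taps start after the source note, so echoes don’t re-echo):

```cpp
#include <cstdint>

// After a note is rendered at `start`, write three more copies further
// along the buffer, each quieter than the last. No separate delay line
// or DirectAudio effect needed. Assumes delaySamples >= noteLen.
void AddEchoes(int16_t* buf, int bufLen, int start, int noteLen,
               int delaySamples, float gainPerTap) {
    float gain = gainPerTap;
    for (int tap = 1; tap <= 3; ++tap) {
        int offset = start + tap * delaySamples;
        for (int i = 0; i < noteLen && offset + i < bufLen; ++i)
            buf[offset + i] += (int16_t)(buf[start + i] * gain);
        gain *= gainPerTap;   // each echo is quieter than the last
    }
}
```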
I did write a separate flanger function, just because the code was pretty cheap and simple, got reused in a few places, and ended up being smaller than it would have been to turn on a DirectAudio flanger. When you’re building these effect functions, think hard about what parameters you want to control. My flanger had start and end times, depth, rate, and mix amount. Knowing which settings can change between calls, and which are constant, can help a lot with optimization later, when every byte counts! Something else to think about for the conversion to assembly: parameters are cool, but more challenging for the folks coding it in assembly. Setting up the assembly routines to accept a pointer to a data structure that holds all the values for a given effect is a cheap and effective way to create a simple music scripting technology. ’nuff said!
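Those parameters — start, end, depth, rate, mix — fit naturally in a struct, and then the routine (C++ or assembly) only needs one pointer. The field names and the body below are my reconstruction, not the demo’s actual code:

```cpp
#include <cmath>
#include <cstdint>

struct FlangerParams {
    int   start, end;   // sample range to process
    float depth;        // max delay excursion, in samples
    float rate;         // LFO rate in Hz (at 22050 samples/sec)
    float mix;          // 0 = dry .. 1 = fully wet
};

// One pointer in, whole effect out -- the "music scripting" interface.
// Processing runs backwards so the delayed reads see unprocessed samples.
void Flanger(int16_t* buf, const FlangerParams* p) {
    const float kTwoPi = 6.2831853f, kSr = 22050.0f;
    for (int i = p->end - 1; i >= p->start; --i) {
        float lfo   = 0.5f + 0.5f * std::sin(kTwoPi * p->rate * i / kSr);
        int   delay = (int)(p->depth * lfo);
        int   j     = i - delay;
        int16_t wet = (j >= 0) ? buf[j] : 0;
        buf[i] = (int16_t)((1.0f - p->mix) * buf[i] + p->mix * wet);
    }
}
```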
Then you have a melody block function that calls the sounds with the right frequencies, with a bunch of offsets (plus a master offset). Call that block a bunch of times in a loop, where the master offset increases, and you generate that melody block over and over. You build the entire thing with blocks built on blocks, and the whole thing becomes a lego exercise from that point. I think that part is where all the musical creativity goes, and I leave that as an exercise for the reader!
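The block-on-block structure can be sketched like this (the notes, step length, and repeat count are placeholders — the real melody is where the creativity goes):

```cpp
#include <cstdint>
#include <vector>

// A note-rendering function: writes one note into the buffer at `start`.
typedef void (*NoteFn)(std::vector<int16_t>& buf, int start, float freq);

// One melody block: four notes, each offset from the master offset.
void MelodyBlock(std::vector<int16_t>& buf, NoteFn note, int masterOffset) {
    const int   kStep     = 11025;   // half a second per note at 22050 Hz
    const float kFreqs[4] = { 440.0f, 523.25f, 659.26f, 523.25f };  // A C E C
    for (int n = 0; n < 4; ++n)
        note(buf, masterOffset + n * kStep, kFreqs[n]);
}

// The song: the same block, stacked end to end with a rising master offset.
void Song(std::vector<int16_t>& buf, NoteFn note) {
    const int kBlockLen = 4 * 11025;
    for (int rep = 0; rep < 4; ++rep)
        MelodyBlock(buf, note, rep * kBlockLen);
}
```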
Hopefully that gives you some ideas, whether you’re considering doing this for the first time, or even if you’re a veteran in this area. Either way, I wish you tons of success, and look forward to having my socks knocked off!
Chris Deschenes (umdesch4)
Northern Dragons Audio Lead