A Brief Discussion of Audio File Types
This article first appeared in Recording Magazine. I reprint it here with permission, and I encourage you to subscribe to that publication, as they are a stand-up bunch of folk!
In my article “Keeping Track”, we covered data. We talked about the information you need to keep with your songs in order to sell, license and organize them. We covered metatags; data about data that gets embedded in files. We talked a little about the file types that carry metadata and how to use them, and that brought up a wider topic: audio file types.
There are hundreds of audio formats and an endless variety of settings and options. So, without a whole lot of fanfare, we’ll dive into some of the formats that exist as of now, but first let’s delineate a few traits and categories.
An audio file (or a video file for that matter) is either compressed or uncompressed. What this means is the file is either whole and complete or it has been squashed down to save space, like using a .zip file; or in physical terms, like using one of those infomercial vacuum bags to suck the air out of your Christmas sweaters. A WAV file is uncompressed; an MP3 is compressed.
Don’t confuse compression or the lack thereof with the terms lossy or lossless. Lossy and lossless are two types of compressed files. If a file is lossy, it means some data has been thrown out because in theory that data isn’t necessary, usually because the human ear can’t hear it. That data cannot be recovered. On the other hand, a lossless file is compressed, but no data has been thrown out. Think of the difference between cutting off the sleeves of your sweater (because it’d be fine as a vest) and sucking it in Mr. Popeil’s vacuum (lossy), and simply sucking it in the vacuum, but leaving it intact (lossless). As you might guess, lossless files are generally bigger. MP3s are lossy. FLAC files are lossless.
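To make the sweater analogy concrete, here is a minimal sketch in Python. It uses zlib, a general-purpose compressor, as a stand-in for a lossless audio codec, and fakes "lossy" by zeroing out the low bits of every byte. Both are assumptions for illustration only; real codecs like FLAC and MP3 are far smarter about what they squash or discard.

```python
import zlib

# Stand-in for raw audio data: a repeating byte pattern (not real samples).
original = bytes(range(256)) * 345

# Lossless: squash it, then restore it. Every byte comes back.
squashed = zlib.compress(original)
restored = zlib.decompress(squashed)

# Crude "lossy" step: throw away the low 4 bits of every byte first.
# That detail is gone for good, no matter how we decompress afterwards.
trimmed = bytes(b & 0xF0 for b in original)
lossy_restored = zlib.decompress(zlib.compress(trimmed))

print(len(squashed) < len(original))   # the compressed copy is smaller
print(restored == original)            # lossless round trip: identical
print(lossy_restored == original)      # lossy round trip: not identical
```

The point is only the round trip: the lossless copy comes back bit for bit, while the lossy copy can never be turned back into the original.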
File Format and Codec
You may never need to know this, but there is a difference between a file’s format and its codec. The format, or file type, is simply the wrapper in which the audio data is kept. The codec is the meat of how it’s encoded. Not all file types support all codecs, but there are some surprising possibilities. A WAV file might not be encoded with PCM, for example. We don’t have room here for a comprehensive list, but it’s likely you’ll only ever need to worry about a few possibilities. We’ll say more on those big ones momentarily.
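If you're curious which codec is hiding inside a WAV wrapper, the file says so itself. A WAV file's "fmt " chunk begins with a two-byte format tag, where 1 means PCM. The sketch below is one way to read it; the function name wav_format_tag is made up for this example, and the test file is built in memory with Python's standard wave module rather than taken from a real session.

```python
import io
import struct
import wave

def wav_format_tag(data: bytes) -> int:
    """Return the codec ID (format tag) stored in a WAV file's 'fmt ' chunk."""
    if data[0:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12  # first chunk starts right after the 12-byte RIFF header
    while pos + 8 <= len(data):
        chunk_id = data[pos:pos + 4]
        (chunk_size,) = struct.unpack("<I", data[pos + 4:pos + 8])
        if chunk_id == b"fmt ":
            (tag,) = struct.unpack("<H", data[pos + 8:pos + 10])
            return tag
        # Skip this chunk; chunks are padded to an even number of bytes.
        pos += 8 + chunk_size + (chunk_size % 2)
    raise ValueError("no 'fmt ' chunk found")

# Build a tiny mono, 16 bit, 44.1k WAV in memory, then inspect it.
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(44100)
    w.writeframes(b"\x00\x00" * 100)

tag = wav_format_tag(buf.getvalue())
print(tag)  # 1 means PCM; 0x0055 would mean MP3 data inside a WAV wrapper
```

Same wrapper, different meat: the extension tells you the format, and the format tag tells you the codec.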
Sample Rate, Bit Depth and Bit Rate
These are the main measurements of audio quality, and there can be some confusion about what they all mean.
Sample rate is used to refer to an original or uncompressed recording. It’s how many times per second a snapshot of the signal is taken. 44.1k means 44.1 kilohertz, or 44,100 times in a second. You probably know that CD quality is 44.1k, 16 bit.
Bit Depth is how many bits are in each sample. If you record at 44.1k, 16 bit, you’re taking 44,100 16 bit samples every second. Crudely, more bit depth corresponds to more dynamic range.
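The link between bit depth and dynamic range can be put into rough numbers. A common rule of thumb is about 6 dB per bit, which comes from 20 × log10(2^bits). The sketch below computes that theoretical figure; it ignores dither, noise shaping and the analog gear in the chain, so treat it as a ballpark. The function name is made up for the example.

```python
import math

def dynamic_range_db(bit_depth: int) -> float:
    """Theoretical dynamic range of linear PCM at a given bit depth."""
    return 20 * math.log10(2 ** bit_depth)

for bits in (8, 16, 24):
    print(f"{bits} bit: about {dynamic_range_db(bits):.0f} dB")
```

By this measure, 16 bit audio lands at roughly 96 dB and 24 bit at roughly 144 dB.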
Bit Rate can be a bit fuzzier. Bit rate simply means the number of bits that are processed over a given amount of time, and it is a measure that can be applied to any file. A CD quality file is 1,411 kbps (kilobits per second), for example. In practice, though, bit rate is more often used to refer to the quality of a compressed, lossy file. To be crude again, it comes down to a measure of how much data we’ve thrown away. The highest bit rate for MP3s is 320 kbps, and the default iTunes rate is 256. A 128k MP3 is noticeably smaller than a 320k file, but in many situations, not all that different sounding. A 32k MP3, however, would sound awful, except in special circumstances (audiobooks, for example, often use low bit rates, because that doesn’t much affect a spoken track).
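The 1,411 kbps figure for CD quality isn't magic; it's just sample rate times bit depth times the number of channels. Here is a quick sketch of that arithmetic, assuming stereo (two channels), which is what a CD uses. The function name is hypothetical.

```python
def pcm_bit_rate(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Bits per second for uncompressed PCM audio."""
    return sample_rate_hz * bit_depth * channels

cd = pcm_bit_rate(44_100, 16, 2)
print(cd)                       # 1411200 bits per second, about 1,411 kbps
print(round(cd / 320_000, 1))   # how many times the data of a 320 kbps MP3
```

So even the highest quality MP3 carries less than a quarter of the bits of the CD it was ripped from.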
The Big Ones
While there are actually tons of audio file types and different combinations of format/codec possibilities, there are only a few you’re likely to see very often. In fact, we can narrow that down to three: WAV, AIFF, and MP3.
WAV (Waveform Audio File Format) files are Microsoft’s format, used in PC applications, and based on RIFF (resource interchange file format). Usually WAV files are encoded using PCM (pulse code modulation) encoding, which is uncompressed and the same basic encoding used in CDs, but it is possible to encode a WAV file with other codecs, even compressed ones. A “RIFF Wav” is a normal WAV file, and a “Broadcast WAV” is a WAV file with extended headers, originally used by broadcasters. WAV files have .wav extensions.
AIFF (Audio Interchange File Format) files are Apple’s uncompressed format, based on IFF (Interchange File Format, the ancestor of RIFF), and usually using PCM encoding. The main practical difference between WAV and AIFF files is that AIFF files allow more metadata by default (so you can see stuff like album covers in iTunes), but you will notice that certain DAWs won’t deal with both. That’s not a problem, as you can easily convert between them with something like SoX or FFmpeg, or free software like Audacity. AIFF files typically carry .aif or .aiff extensions.
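EDIT with a sneaky pro tip: AIFFs and WAVs usually hold the same uncompressed PCM audio, so from an audio standpoint they’re equivalent. The wrappers are not identical, though (the headers and the byte order differ), so if Joe Schmo who uses GarageBand sends you a bunch of AIFFs that your Windows DAW can’t read, run them through one of the converters above and you’ll get the same audio in a WAV. Just changing the extension only works if the software looks at the file’s contents and ignores its name.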
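If you're ever unsure what a file really is, whatever its extension says, the first 12 bytes will tell you. WAV files start with "RIFF" and have "WAVE" at byte 8; AIFF files start with "FORM" and have "AIFF" (or "AIFC" for the compressed variant) at byte 8. The sketch below checks for those markers. The function name and the sample headers are made up for illustration; with a real file you would pass in the result of reading its first 12 bytes.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container from the first 12 bytes of a file."""
    if len(header) >= 12:
        if header[0:4] == b"RIFF" and header[8:12] == b"WAVE":
            return "WAV"
        if header[0:4] == b"FORM" and header[8:12] in (b"AIFF", b"AIFC"):
            return "AIFF"
    return "unknown"

# Hand-built example headers (the 4 bytes in the middle are a size field).
print(sniff_container(b"RIFF\x24\x00\x00\x00WAVE"))  # WAV
print(sniff_container(b"FORM\x00\x00\x00\x24AIFF"))  # AIFF
print(sniff_container(b"ID3\x03\x00\x00\x00\x00\x00\x00\x00\x00"))  # unknown
```

This is roughly what well-behaved software does when it opens a file with the "wrong" extension.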
MP3 (MPEG Audio Layer III) files are compressed, lossy and very common. MP3 shouldn’t be confused with MPEG-3, which was a planned video standard that was eventually folded into MPEG-2. MP3 compression is done by throwing away data which isn’t needed, mostly due to a phenomenon in human hearing called auditory masking. That’s a pretty fancy way of saying we don’t hear everything in an uncompressed file anyway, so we might as well throw some away to save space. There’s no shortage of debate there, but it seems to work pretty well. MP3 was a proprietary format, owned and licensed by the Fraunhofer Institute for Integrated Circuits, and that’s why not all software could make an MP3, at least until very recently. The Fraunhofer Institute terminated its MP3 licensing program in 2017, and the format was widely reported as obsolete as a result. Whether this means the MP3 will die or proliferate further remains to be seen. For now, it’s still the de facto compressed file format, and typically what you get when you rip a CD with iTunes or other software, or download that free track from your favorite polka band.
Other Major Formats
There are so many audio formats, we’d be hard pressed to talk about them all here, but there are a few you should know about.
CDDA (Compact Disc Digital Audio) is the format for compact discs. The audio itself is 16 bit, 44.1k stereo PCM, the same data you’d find in a CD quality WAV or AIFF, just stored differently on the disc. If you happen across a .cdda file (probably ripped from a CD), you’ll probably be able to play it in anything that can play a WAV or AIFF.
AAC (Advanced Audio Coding) is a compressed, lossy format developed by a group of companies including Dolby and Fraunhofer, which was designed to be a successor to MP3. Apple subsequently sold a copy protected version that uses DRM (digital rights management) through iTunes. Apple dropped the DRM from music in 2009, but AAC is still generally the format of files you buy from iTunes.
FLAC (Free Lossless Audio Codec) is exactly what it sounds like, a free, lossless, compressed format. Great for archiving files, since it can reduce size up to 60% without losing any quality.
WMA (Windows Media Audio) was originally a compressed, lossy Windows format designed to compete with MP3. It’s been expanded to include a lossless version, a multichannel version, and a lower bit rate version used for voice. You may encounter Windows system files or other similar things in WMA format. WMA files can be copy protected.
AC-3 is the lossy 5.1 surround sound codec behind Dolby Digital, used in DVDs, HDTV and DTV (digital television). Its highest sample rate is 48k. A side note: The “point one” in surround sound refers to a Low-Frequency Effects (LFE) channel which has less bandwidth. The LFE is where the shake your boots BOOM in movies comes from.
What To Use?
At this point your question may be why should I care, or what should I use? The truth is, audio is audio, and when it comes to format choice, utility is the main consideration. Your DAW will do what it does, and I recommend letting it do that. When you’re deciding what to export, think about the use at hand. You’ll want to export either WAV or AIFF for mastering, making CDs, importing into a video project, or other continuing full resolution work. Audio-wise they’re really the same thing, so think about the software you’re using next, or what the person on the other end needs, and use that.
When it comes to delivery to the general public, think about the end user rather than entering into an endless debate about the perceptual quality of various algorithms or codecs. If you’re selling downloads to normal people, you’ll probably want to use MP3s. If you’re delivering files to a digital distributor, you’ll probably be asked for CD quality WAVs, and in some cases, distributors will take AAC files for iTunes. If you want you can also distribute lossless files in FLAC format, or give people access to WAVs, or even distribute Ogg Vorbis files, an open source container/codec combination that serves much the same purpose as MP3. Beware, though, that not all players support these less common formats, and your user may end up with no way to listen.
As far as bit rate goes, I like to give my loving, devoted fans the highest quality MP3s I can, so those are encoded at 320k, but it’s also a good idea to make a 128k version for web-based preview listeners, because the smaller size will load faster and stream better. Some submissions you make (say to internet radio or licensing folk) may have size limits, too, so those smaller MP3s are useful. In the end, this is a judgement call, and if it’s for your own personal listening, then do whatever you like best.
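To get a feel for what those choices mean on disk, file size is roughly bit rate times duration, divided by 8 to get bytes. The sketch below estimates sizes for a hypothetical four minute song; it assumes a constant bit rate and ignores headers and metadata, so real files will come out slightly different.

```python
def size_megabytes(bit_rate_kbps: float, seconds: float) -> float:
    """Approximate file size in megabytes for a constant bit rate."""
    bits = bit_rate_kbps * 1000 * seconds
    return bits / 8 / 1_000_000

song = 4 * 60  # a four minute song, in seconds
for label, kbps in [("128k MP3", 128), ("320k MP3", 320), ("CD quality WAV", 1411.2)]:
    print(f"{label}: {size_megabytes(kbps, song):.1f} MB")
```

That works out to a little under 4 MB for the 128k preview, a little under 10 MB for the 320k version, and over 40 MB for the full resolution WAV, which is why the small version is handy for streaming and size-limited submissions.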
One other consideration is something we addressed in “Keeping Track”, which is metadata. There are many situations where you’ll want some data other than audio in your file. Whether it’s so consumers know who you are, or licensing agents know who to contact, you’ll need some extra info in there, so the file type you use to send to certain people needs to contain that data. That’s what we covered in “Keeping Track”, so if you haven’t seen it, check that article out.
As with any very technical topic, an exploration of audio file types can go quite deep, and we don’t have room here to cover everything we could think of, so here are some recommendations for further reading:
- Principles of Digital Audio by Ken Pohlmann
- The Audio Expert (chapter 8 especially) by Ethan Winer
- Mastering Audio by Bob Katz
- How Music Got Free by Stephen Witt (for a great history of the MP3 format)
- Any Wikipedia page about “audio file types” or specific types – google “WAV Wikipedia”, for example.
If you’re new to audio or recording, then hopefully we’ve helped you at least begin to sort out file types in digital audio, and if you’re a veteran, I hope you’ve reminded yourself of a few things here. For the most part, file types are pretty straightforward, but you can run into confusion at times, especially when a DAW or other piece of software gives you a thousand choices. It’s nice to remember a few basic tenets, cut through the noise, and get back to creating. So file this away, and we’ll see you in the studio!
Did you know I have a master’s degree in “Music, Science and Technology” from Stanford University? That means I can go back and forth between Macs and PCs in the studio, and talk at length about debt. Find me on Facebook and Twitter and other various stuff @AaronJTrumm.