Lesson 4

What Is Digital Audio?

Digital audio uses binary code, 0s and 1s, to hold the data that makes up a sound wave. These 0s and 1s are coded to represent the precise waveform, which the computer’s software can interpret and turn back into a sound wave. Analog audio is turned into digital audio in the converter. This process is called AD, which stands for analog to digital. Digital audio is turned back into analog in the converter as well, and this process is called DA, for digital to analog. Together, they’re called AD/DA. Once the waveform has been converted to digital, it’s stored as samples and bits. Samples are the individual markers that make up the wave shape. An individual sample cannot represent any sound. You need many samples (thousands of them) to be able to connect the dots and form a wave.
Bit depth is the number of bits used to store each sample, which sets the number of possible values each sample can be. We’ll go into more detail about sample rate and bit depth in this lesson.

As you remember from the previous lesson, analog sound in the form of electricity can be described as rapid fluctuations in voltage. In the AD process, the converter takes rapid, repeated measurements of the exact voltage at the exact time of each measurement. It takes these individual measurements of the incoming wave at such a fast rate that it is possible to connect the dots and accurately re-create the waveform.
These individual measurements are called samples, because the converter takes a sample of what the voltage is at that time. The speed at which these samples are taken is called the sample rate. Most recording interfaces let you choose what sample rate you want to record at. The most common, and fairly standard for audio, is 44,100 samples per second, which we refer to as a sample rate of 44.1 kHz. The standard for video is 48 kHz. Other common sample rates are double and quadruple these, such as 88.2 kHz, 96 kHz, 176.4 kHz, and 192 kHz.
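To make the idea of taking samples concrete, here is a rough Python sketch. It is only an illustration, not how a real converter is built: a sine wave stands in for the incoming voltage, and the names take_samples and tone_hz are made up for this example.

```python
import math

SAMPLE_RATE = 44_100  # samples per second (44.1 kHz)

def take_samples(tone_hz, duration_s, sample_rate=SAMPLE_RATE):
    """Measure a sine wave (standing in for voltage) at evenly spaced times."""
    count = int(round(duration_s * sample_rate))
    samples = []
    for n in range(count):
        t = n / sample_rate  # time of the n-th measurement, in seconds
        samples.append(math.sin(2 * math.pi * tone_hz * t))
    return samples

# Sample 10 milliseconds of a 1 kHz tone.
samples = take_samples(tone_hz=1000, duration_s=0.01)
print(len(samples))     # how many measurements were taken in 10 ms
print(1 / SAMPLE_RATE)  # seconds between one sample and the next
```

Each entry in the list is one dot on the wave. On its own it is just a number, and only the whole run of them traces out the shape.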
There are advantages and disadvantages to faster sample rates. The advantage is potentially better sound quality. The more times per second the converter takes samples, the more detail of the wave it can capture. In particular, a converter can only capture frequencies up to half of its sample rate, so a higher sample rate extends the range of frequencies that can be recorded.
The disadvantage of higher sample rates is that they use more hard drive space and bandwidth. Compared to 44.1 kHz, 88.2 kHz needs to stream twice as much information in the same amount of time and uses up twice as much hard disk space. Because the write speed of the hard drive is limited, you can also only record about half as many tracks simultaneously.
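The cost of a higher sample rate is simple arithmetic, and a short Python sketch shows it. I am assuming uncompressed audio where each sample is packed into whole bytes (3 bytes for 24 bit) and ignoring any file header overhead. The function names are just for this example.

```python
def bytes_per_second(sample_rate, bit_depth, tracks=1):
    """Uncompressed data rate: samples per second x bytes per sample x tracks."""
    return sample_rate * (bit_depth // 8) * tracks

def megabytes_per_minute(sample_rate, bit_depth, tracks=1):
    """Disk space used by one minute of recording, in megabytes."""
    return bytes_per_second(sample_rate, bit_depth, tracks) * 60 / 1_000_000

# Compare one 24 bit track at 44.1 kHz and at 88.2 kHz.
for rate in (44_100, 88_200):
    print(rate, "Hz:", megabytes_per_minute(rate, bit_depth=24), "MB per minute")
```

Doubling the sample rate doubles both numbers, which is why the same drive can only keep up with about half as many tracks.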
And honestly, the improvement in sound quality is negligible. Many pro studios record at 44.1 kHz, because there is simply no need to go higher. I personally usually record at 88.2 kHz, because of the potentially slightly better audio quality, and because I have plenty of hard disk space and extremely fast solid state drives dedicated to the recording path. For more information about hard drive management, I have a video dedicated to this topic in the tips and tricks section.

Now let’s talk about bit depth. First things first, it’s not called bit rate. The pros will scream at you if you call it bit rate. It’s bit depth.
When the converter takes its readings of voltage, there is a limit to how precise each reading can be. It can only record the voltage as one of a set number of possible values. In 16-bit audio, you can figure out how many possible values there are with some simple math. Just calculate 2 to the power of 16, which is 65,536. With 24-bit audio, 2 to the power of 24 is 16,777,216. That means, in 24-bit audio, each sample can be any one of 16,777,216 possible values. Higher bit depth provides greater accuracy for the value of each sample, which results in better sound quality. For playback of a single audio track, there is no noticeable difference between 16 bit and 24 bit. The real advantage comes when layering multiple tracks and when doing any digital processing with software, where higher bit depths can provide more accurate processing. This is the reason some software processes at 32 bit, 64 bit, and 32-bit floating point. On a large multitrack project, higher bit depth processing actually does make a noticeable difference.
Now, we’ve explored bit depth and sample rate, but how does the converter keep the timing of the samples perfect? The samples are being taken at such a fast rate that timing consistency is crucial. This is actually fairly difficult to achieve. The device that keeps the timing of the samples is called the word clock. Every AD/DA converter has a word clock. Sometimes, like the DA in an iPod, it’s just a simple chip circuit. High end studios pay thousands of dollars for word clock upgrades. Any audio interface will have a word clock built in, and sometimes it will provide a word clock output so that the word clock can be synchronized to another AD/DA converter.
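Here is the same math as a small Python sketch, along with one possible way a reading could be snapped to the nearest allowed value. The quantize function is my own simplified illustration. It assumes the voltage has been scaled to the range -1.0 to 1.0 and that samples are stored as signed whole numbers, which is an assumption for the example and not a description of any particular converter.

```python
def possible_values(bit_depth):
    """Number of distinct values a sample can take at this bit depth."""
    return 2 ** bit_depth

def quantize(voltage, bit_depth):
    """Snap a voltage in the range -1.0..1.0 to the nearest whole-number level."""
    levels = possible_values(bit_depth)
    max_level = levels // 2 - 1   # e.g. 32767 for 16 bit (assumed signed storage)
    min_level = -(levels // 2)    # e.g. -32768 for 16 bit
    level = round(voltage * max_level)
    return max(min_level, min(max_level, level))  # clamp anything out of range

print(possible_values(16))  # 65536
print(possible_values(24))  # 16777216
print(quantize(0.25, 16), quantize(0.25, 24))  # same voltage, finer steps at 24 bit
```

The more levels there are, the closer the stored number can sit to the true voltage, which is where the extra accuracy of 24 bit comes from.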

An audio interface will have a limited number of channels available for AD/DA conversion. Many interfaces provide an expansion option to receive digital audio from another device that does the conversion, so the interface can provide more channels to the DAW without having to provide conversion for all of them.
The advantage of this is added flexibility in multichannel setups. Often, devices such as keyboards and preamps will offer digital outputs, which can be connected to the digital inputs of the interface without using up any of its precious analog inputs.

The most common digital audio connection formats are AES/EBU, ADAT, and S/PDIF.

ADAT uses an optical cable to transmit the digital information. This single cable can transmit 8 channels of 24-bit audio at 44.1 or 48 kHz.
It’s important to remember that 88.2 or 96 kHz contains twice as much information per channel, and therefore the ADAT cable can only stream half as many channels. So, at 88.2 or 96 kHz, an ADAT cable can only stream 4 channels of audio, and at 176.4 or 192 kHz it can only stream 2 channels.
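The channel counts above follow one simple rule: the cable carries a fixed amount of data, so every doubling of the sample rate halves the channels. A short Python sketch of that rule, using only the rates and counts mentioned in this lesson (the function name is made up for the example):

```python
ADAT_BASE_CHANNELS = 8  # channels on one optical cable at 44.1 or 48 kHz

def adat_channels(sample_rate_hz):
    """Channels one ADAT cable can carry at the given sample rate."""
    if sample_rate_hz in (44_100, 48_000):
        multiple = 1  # base rate
    elif sample_rate_hz in (88_200, 96_000):
        multiple = 2  # double rate, twice the data per channel
    elif sample_rate_hz in (176_400, 192_000):
        multiple = 4  # quadruple rate, four times the data per channel
    else:
        raise ValueError(f"sample rate not covered in this lesson: {sample_rate_hz}")
    return ADAT_BASE_CHANNELS // multiple

for rate in (48_000, 96_000, 192_000):
    print(rate, "Hz:", adat_channels(rate), "channels")
```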

The AES/EBU format uses a single XLR cable to transmit 2 channels of digital audio at any sample rate, and is capable of longer cable runs than ADAT. Often, a DB-25 connector will be used, which is a single connector that essentially houses 8 XLR cables. With 2 channels on each of those 8 lines, a single DB-25 connector is capable of transmitting 16 channels of digital audio at any sample rate.
The AES/EBU protocol can also transmit word clock information along with the digital audio, which adds flexibility and convenience for syncing word clock.

S/PDIF is similar to AES/EBU, but it uses an unbalanced cable with RCA connectors. This digital audio format can also transmit 2 channels of digital audio at any sample rate. However, compared to AES/EBU, it is more susceptible to interference, and the cable length must be kept as short as possible.


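In order to plot a wave, you need an x axis and a y axis. In the case of audio, the x axis is time, and its resolution is set by your sample rate. The y axis is the level of the wave, and its resolution is set by your bit depth.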
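One way to picture this is as a sheet of graph paper, where the sample rate sets how close together the vertical grid lines are and the bit depth sets how many horizontal grid lines there are. The Python below is just a sketch of that picture, and the function name is made up for the example.

```python
def grid_resolution(sample_rate, bit_depth):
    """Return (seconds between samples, number of possible heights)."""
    x_step_seconds = 1 / sample_rate  # spacing of the dots along the x axis
    y_levels = 2 ** bit_depth         # how many heights a dot can sit at on the y axis
    return x_step_seconds, y_levels

for rate, bits in ((44_100, 16), (88_200, 24)):
    x_step, y_levels = grid_resolution(rate, bits)
    print(f"{rate} Hz, {bits} bit: a dot every {x_step * 1e6:.1f} microseconds, "
          f"{y_levels} possible heights")
```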