07 May 2011

An ideal speech recording setup

Having moved to a new lab, I was faced with a few decisions re: setting up a new speech recording space. This article catalogues each piece of equipment and why it was chosen.

Equipment: Shure SM10A headset cardioid microphone, Roland UA-25EX USB audio interface, laptop computer, recording software (Windows: Cool Edit, OSX: SoundStudio).

I should point out that these decisions reflect my personal choices. My recommendations are based on the following assumptions. The average participant in a speech recording experiment:

  • has a poor understanding of what they are doing with their voice
  • does not know how sensitive a microphone is
  • tends to speak either too softly or too loudly
  • starts doing strange things when you give them too many instructions
  • has to be tricked to produce speech in the way that the experimenter desires
  • will get tired as the session goes on
Having spelled out these assumptions, my aim in setting up a speech recording space is to get the participant in and out with as little fuss as possible, get recordings that are clean and unaltered, and minimise the chances of a recording going bad due to equipment or setup.

Why a headset mic?
A headset mic allows the talker the freedom to move their head without changing the distance between the microphone and their mouth. Depending on the model, the mic can be positioned very close to the talker's mouth so that very subtle phenomena, such as prevoicing, may be captured. The downside is that other noises will be captured as well, such as spit bubbles, tongue smacks, the participant scratching their nose and so forth. This can be somewhat alleviated by selecting a headset mic with a cardioid pickup pattern and a restricted receptive field. The Shure SM10A fits the bill nicely, making it an ideal mic for recording both in the soundbooth and in the field. It is also relatively inexpensive at around $150.

Why an external audio interface?

In this case, the decision was easy: we needed a way to connect the microphone, which has an XLR connector, to the computer, which does not. But, apart from this practical issue, there is another important reason to use an external audio interface or soundcard, and that is to improve quality and reduce interference.

The internal soundcard on the computer's motherboard is surrounded by many noisy, busy circuits that are working all the time. With an external soundcard, the analogue-to-digital conversion is performed outside the computer, and what is fed into the computer is a series of 0s and 1s, so the signal is no longer exposed to that electrical noise once it has been digitised. In addition, external audio interfaces are designed for this sole purpose and their drivers are constantly being improved (so upgrade frequently), all of which results in getting the best-sounding recordings into your computer.

So, which audio interface to use then? Well, for speech recording, it does not really matter whether the interface has USB 1.1, USB 2.0, or Firewire connections. You will be fine with any of the above. What does matter is the quality of the driver (the software that tells the sound card what to do and when to do it). In my particular case, I chose the Roland UA-25EX, which is one of the most popular (if not the most popular) USB audio interfaces of all time. I chose it because I have used it extensively in the past and have been very pleased with the results. In addition, it is inexpensive (around US$200) and works faultlessly on both Windows and OSX. Note that this particular model has now been discontinued and replaced by the Roland Quad Capture.
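
To see why the connection type matters so little for speech, here is a rough back-of-the-envelope sketch comparing the data rate of uncompressed audio with the nominal speed of each bus. The recording settings (44.1 kHz/16-bit and 96 kHz/24-bit, mono) are illustrative assumptions of mine, and the bus speeds are nominal signalling rates; real-world throughput is lower, but the margin is still very large.

```python
# Rough data-rate check: uncompressed PCM speech versus nominal bus speeds.
# The sample rates and bit depths below are illustrative assumptions,
# not settings prescribed anywhere in this article.

BUS_SPEEDS_MBPS = {
    "USB 1.1 (full speed)": 12,
    "USB 2.0 (high speed)": 480,
    "FireWire 400": 400,
}


def pcm_rate_mbps(sample_rate_hz, bit_depth, channels=1):
    """Raw PCM data rate in megabits per second."""
    return sample_rate_hz * bit_depth * channels / 1_000_000


for label, rate, bits in [("44.1 kHz / 16-bit mono", 44_100, 16),
                          ("96 kHz / 24-bit mono", 96_000, 24)]:
    mbps = pcm_rate_mbps(rate, bits)
    print(f"{label}: {mbps:.3f} Mbit/s")
    for bus, speed in BUS_SPEEDS_MBPS.items():
        # Share of the bus's nominal bandwidth used by one audio stream.
        print(f"  {bus}: {100 * mbps / speed:.2f}% of nominal bandwidth")
```

Even the slowest bus here has room to spare for a single mono speech channel, which is why the driver is the more important consideration.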

What software to use?

When it comes to recording software, I just want it to stay out of my way and create sound files that are as untouched as possible—as if the speech is going straight from the microphone onto the hard disk. No effects, no plugins, no nothing. If I choose to manipulate the recordings in any way, I will do so later, by making conscious choices about what is manipulated, and all manipulations will be documented in detail.

These are my criteria for speech recording software:

  • crystal clear audio recording
  • real-time monitoring of the input level
  • a clipping indicator (usually a red light of some sort)
  • can output to wav and aiff formats
  • can record for two hours without becoming sluggish or slow
  • allows me to copy and paste large chunks of audio into new files without a long delay
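
The level-monitoring and clipping criteria can also be checked after the fact on a saved file. What follows is a minimal sketch, not something any of the programs mentioned here provide: it assumes a 16-bit PCM WAV file and uses only Python's standard library. The function names and the clipping threshold are my own choices for illustration.

```python
import array
import math
import sys
import wave

FULL_SCALE = 32768  # magnitude of the most negative 16-bit sample


def read_pcm16(path):
    """Read all samples from a 16-bit PCM WAV file (channels interleaved)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    return samples


def peak_dbfs(samples):
    """Peak level in dB relative to full scale (0 dBFS is the maximum)."""
    peak = max((abs(s) for s in samples), default=0)
    if peak == 0:
        return float("-inf")
    return 20 * math.log10(peak / FULL_SCALE)


def count_clipped(samples, threshold=32767):
    """Count samples at or beyond the threshold, a rough sign of clipping."""
    return sum(1 for s in samples if abs(s) >= threshold)
```

For a recording saved under a hypothetical name such as session01.wav, calling peak_dbfs(read_pcm16("session01.wav")) gives the loudest point of the take, and a non-zero result from count_clipped suggests the input gain was set too high.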
For simple voice recording that stays true to the original source, without using any effects or plugins, you cannot beat Cool Edit Pro. Now discontinued, Cool Edit Pro was the king of audio editors in the early 2000s. Its makers were bought out by Adobe, and Cool Edit evolved into Audition, which is both too expensive and far too resource-hungry for our purposes, namely simple voice recordings.


Begin rant
Adobe Audition now occupies the space between audio editor and studio-level sound production, making it a poor man's Pro Tools or a fat version of Cool Edit. I am not a fan. To address this issue of bloat, Adobe released a simpler program called Soundbooth, which I initially thought to be an updated version of Cool Edit. After having tried the demo, all I can say is that it is a poor piece of software that fails to differentiate itself from Audition in any meaningful way, and I cannot imagine who would want to use it.
End rant

Having moved to a MacBook Pro, I sought to find an OSX app that had the function and feel of Cool Edit. SoundStudio 4 fits the bill nicely. Priced at a reasonable $50, SoundStudio has a simple interface that is both intuitive and efficient. For OSX recording, I recommend it.