17 September 2011

Do you have a Praat script that can...

As I have mentioned before, Praat scripts can take care of many repetitive actions for you, saving you many mouse clicks. But be warned, a script cannot do anything that requires human judgment, such as labeling segments. I cannot stress this enough: you have to know exactly what the script is doing.

When I first started using Praat, I used many small scripts, each of which performed one small job. For instance, after a recording session, I would use one script to cut the large file of the entire recording session into smaller files, a separate script to extract measurements, another script to normalise the intensity of the sounds, yet another script to ramp in and out, and so on. The problem with this approach is that it is very easy to lose track of what manipulations you have already done to a set of sound files, especially if you are working on multiple projects at the same time, with different normalisation scripts, etc. As a result, a few years ago, I changed the way that I used Praat scripts.

Now, for each project, I create one large Praat script that contains all manipulations: taking original measurements, manipulating intensity, pitch, ramping, etc., renaming files, and taking new measurements after all manipulations have been completed. This way, when the time comes to write up the Method section, all of the information is easily accessible and traceable.

Because of this project-based scripting technique, I create a new script for each project that I am working on. However, I rarely code a script from scratch. Bits and pieces can be reused from other scripts, and in this way new scripts can be generated very quickly.

I receive frequent requests along the lines of, "Do you have a script that does x?" So, every Praat script that I have is listed below, along with a brief description. I have omitted some scripts that are very similar to those posted here, as well as some of the larger scripts that have been tailored to a particular set of stimuli. Each script may be downloaded by clicking on the name. Please feel free to download, modify, and redistribute the scripts without restriction. What's mine is yours:

text grid maker.praat My first ever Praat script. Very useful for creating textgrids once a recording has been completed. Saves you many mouse clicks and RSI.

text grid reviewer.praat Once you (or a research assistant) have created textgrids, it can be useful to open each sound file and its corresponding textgrid to review the boundaries and annotations that have been created. This script makes the reviewing process easy. It allows you to make edits and saves the changes before moving to the next textgrid.

duration logger.praat Useful for measuring the length of intervals marked in a textgrid. Bits of this script show up in all of my larger scripts that have the word "measure" in their filename.
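
For readers who want to see the idea outside of Praat, here is a minimal Python sketch of duration logging. It is not the script itself: the tier is represented as a hypothetical list of (start, end, label) tuples rather than a real textgrid, and the function name is my own invention for illustration.

```python
def interval_durations(intervals):
    """Return (label, duration) pairs for every labelled interval.

    `intervals` is an assumed stand-in for one textgrid tier:
    a list of (start_seconds, end_seconds, label) tuples.
    Unlabelled (empty) intervals are skipped.
    """
    rows = []
    for start, end, label in intervals:
        if label.strip():
            rows.append((label, end - start))
    return rows


# Hypothetical tier: silence, two labelled segments, silence.
tier = [(0.0, 0.25, ""), (0.25, 0.5, "a"), (0.5, 1.0, "t"), (1.0, 1.5, "")]
for label, duration in interval_durations(tier):
    print(f"{label}\t{duration:.3f}")
```

Skipping empty labels is a design choice made here so that only annotated intervals end up in the log.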

wav aif converter.praat A simple, but very useful script for those who wish to share soundfiles between Macs and PCs. Wav files can be easily converted to aiff and vice versa without altering the file's quality.

get measurements.praat Similar to duration logger.praat except that this script outputs a range of measurements: formants, f0, amplitude at 25, 50 and 75% of each vowel.

load all sound files.praat A very simple, but very useful script. It loads all soundfiles in a directory into Praat. Praat annoyingly does not function like most Windows/OSX apps. You cannot drag and drop a file into Praat. Opening many files is a bit of a chore. Newer versions of Praat at least let you open multiple files at once. This script makes things even easier. I always include a version of this script when asking others to listen to a bunch of sound files for me. It just makes life easier. Highly recommended.

sound chainer.praat Combines all files in a directory into a single file, adding 1 second of silence between each sound.
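
The chaining logic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each sound is a plain list of samples at the same sample rate, file reading and writing are left out, and the function name is hypothetical.

```python
def chain_sounds(sounds, sample_rate, gap_seconds=1.0):
    """Concatenate sounds, inserting silence between consecutive sounds.

    `sounds` is an assumed list of sample lists that share `sample_rate`.
    No silence is added before the first sound or after the last one.
    """
    silence = [0.0] * int(round(gap_seconds * sample_rate))
    chained = []
    for index, samples in enumerate(sounds):
        if index > 0:
            chained.extend(silence)
        chained.extend(samples)
    return chained


# Toy example at 4 samples per second, so the 1 s gap is 4 zeros.
print(chain_sounds([[1.0, 1.0], [2.0]], sample_rate=4))
```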

on off ramp.praat Quick and easy way to ramp many files that have already been cut.
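
As a rough illustration of what ramping does, here is a minimal Python sketch. It assumes a linear ramp and a 5 ms default ramp length; both are my assumptions for the example, not necessarily what the script uses, and the function name is hypothetical.

```python
def ramp_on_off(samples, sample_rate, ramp_seconds=0.005):
    """Fade a sound in at its onset and out at its offset.

    Assumes a linear ramp. The ramp is never longer than half of the
    sound, so the onset and offset ramps cannot overlap.
    """
    n = min(int(ramp_seconds * sample_rate), len(samples) // 2)
    ramped = list(samples)
    for i in range(n):
        gain = i / n  # rises from 0 towards 1 across the ramp
        ramped[i] *= gain        # onset ramp
        ramped[-1 - i] *= gain   # offset ramp, mirrored
    return ramped


# Toy example: 8 samples at 4 Hz with a 0.5 s (2-sample) ramp.
print(ramp_on_off([1.0] * 8, sample_rate=4, ramp_seconds=0.5))
```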

chop from point.praat Reads in all sounds and textgrids in a directory, identifies which parts of the original sound have been labelled, extracts the labelled sound, ramps the beginning and end of the sound, outputs newly cut sounds, and preserves textgrids.

ramp, chop - preserve textgrids.praat One of my first 'larger' scripts. As the name implies, it excises individual tokens from a larger file, ramps the onset and offset to zero amplitude, and preserves the textgrid information for the excised segment.

norm, ramp, chop - preserve textgrids.praat Similar to the script above, except that it normalises each token as well as ramping it in and out.

scale intensity (energy) with output.praat Reads in soundfiles, takes intensity measurements, scales each file to a specified dB value, and takes new intensity measurements. An error message is displayed if any sounds clip due to scaling.
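
The arithmetic behind this kind of scaling can be sketched in Python as follows. The assumptions are mine: intensity is computed from the RMS amplitude relative to a reference of 2e-5, samples are floats, and anything beyond ±1.0 counts as clipping. The function names are hypothetical and this is not the script's actual code.

```python
import math

REFERENCE = 2e-5  # assumed reference level for the dB calculation


def intensity_db(samples):
    """RMS intensity in dB re REFERENCE (assumes a non-silent sound)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms / REFERENCE)


def scale_to_db(samples, target_db):
    """Scale samples so their RMS intensity equals target_db.

    Returns the scaled samples and a flag that is True when any
    scaled sample falls outside the assumed -1.0..1.0 range.
    """
    gain = 10 ** ((target_db - intensity_db(samples)) / 20)
    scaled = [s * gain for s in samples]
    clipped = max(abs(s) for s in scaled) > 1.0
    return scaled, clipped


tone = [0.1, -0.1, 0.1, -0.1]
scaled, clipped = scale_to_db(tone, 70.0)
print(round(intensity_db(scaled), 6), clipped)
```

Measuring before and after scaling, as the script does, makes it easy to confirm that the target value was actually reached.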

scale peak steps.praat Outputs a variety of acoustic measurements. Calculates the difference between the intensity of the vowel and a prespecified intensity (dB_target). Scales the entire soundfile so that the vowel's intensity will now equal dB_target. Outputs a new set of acoustic measurements, this time for the scaled stimuli.

pitch fixer.praat Sets the pitch of each token to a predetermined (flat) value.

sound mixer, step maker.praat Generates a continuum of x steps between two soundfiles that serve as endpoints. The continuum is created by mixing the two sounds together.
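
One way to picture the mixing is the Python sketch below. It assumes equally spaced linear mixing weights and sounds stored as plain sample lists; the actual script may weight the endpoints differently, and the function name is hypothetical.

```python
def mix_continuum(sound_a, sound_b, steps):
    """Return `steps` mixtures running from sound_a to sound_b.

    Assumes linear, equally spaced weights: step 0 is pure sound_a and
    the last step is pure sound_b. If the sounds differ in length, the
    mixture is truncated to the shorter one.
    """
    if steps < 2:
        raise ValueError("a continuum needs at least two steps")
    continuum = []
    for step in range(steps):
        weight_b = step / (steps - 1)
        mixed = [(1 - weight_b) * a + weight_b * b
                 for a, b in zip(sound_a, sound_b)]
        continuum.append(mixed)
    return continuum


# Toy example: a 3-step continuum between two 2-sample "sounds".
for token in mix_continuum([0.0, 0.0], [1.0, 1.0], steps=3):
    print(token)
```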

create Klatt sound.praat Synthesises the syllable /da/ using Klatt synthesis.

Michael_Klatt_cont_plusplot_2d.praat This impressive script was developed by one of my previous supervisors, Michael Tyler. The script takes as its input two vowels defined in F1/F2 space: one is an anchor and the other a starting point. The script will generate a continuum of vowels at a given distance apart (in ERBs) at a fixed radius from the anchor vowel. The radius can either be the distance between the anchor vowel and the starting point or a value in ERBs provided by the user. The user also specifies whether the direction is anticlockwise or clockwise. Most people won't find much use for this, but it is very advanced stuff.

I will continue adding to the list as I develop new scripts, or come across scripts that I consider to be useful.

09 September 2011

Brand new look to celebrate Mark's Speechblog's 4th birthday

My Speechblog turned 4 on September 9, and to celebrate, I have created an all-new blog template. I am using a heavily modified version of the 'Ethereal' Blogger template. The text area of each post has been widened to take advantage of the widescreen aspect ratio of most monitors, tablets and smart phones. The header logo has also been updated. I'm a lot better at manipulating images now than I was when I created the original speech signal header in 2007. For those interested in speechy things, the new waveform is of a Spanish female speaker producing a trill in an /aCa/ context; only the first vowel and part of the trill are visible in the header. A bit of trivia for you: the old waveform was of a Ma'di voiced implosive stop.

01 July 2011

Create high resolution 300dpi images for journal publication... without PowerPoint

I have previously outlined how to create journal quality 300dpi .tif images in PowerPoint. This worked very well, and has helped a lot of people. However, this only works in PowerPoint 2003 and 2007.

The Problem
Users of PowerPoint 2010, as well as users of PowerPoint on OSX, are unable to join in on the 300 dpi goodness. For instance, in PowerPoint 2010, there is an option to scale images to 220 dpi, but that's not the same as 300 dpi, is it? In OSX, this functionality was included in PowerPoint 2008, but was removed altogether in 2011!

Here is a quote from someone working at Microsoft Support,
"Unfortunately, the "Dots per inch" option is no longer available in Office for Mac. If this is a feature you'd like to see in future versions of Office for Mac, be sure to send your feedback by clicking..." blah.
See here: http://www.officeformac.com/ms/.59bcff97/0
So, it looks like, from now on, in order to create high resolution images out of your Excel figures, it will be necessary to use something other than PowerPoint.

The Solution
I propose a solution similar to what I have recommended for creating high quality conference posters. It requires that you have a PDF printer installed (below I use the example of Acrobat, but the concept applies equally well to other PDF printers).

Step 1: Create your graph in Excel. I recommend leaving the axes unlabeled for the time being to avoid distorting the text when resizing the figure.

Step 2: Print the image. With the image selected, select File | Print and then set the Printer to Adobe PDF (or whatever PDF printer you have) and select Printer Properties.

Step 3: Edit Image Settings. Edit the Images setting so that the PDF creation process will preserve the image in high quality without downsampling or compressing it. Click OK and optionally save your settings for next time. Print your figure to PDF.

Step 4: Extract TIFF image from PDF. Open your newly created PDF in your viewer of choice, and save the page as a TIFF, setting the resolution to 300 dpi.

Step 5: Add finishing touches to TIFF in image editor. Open the TIFF file in your image editor of choice (I use Paint.NET). Add labels to the axes and legend (tip: for the x-axis, it is easy to keep the labels aligned in vertical space by using the text tool only once and separating the labels using the space key to position the labels along the x-axis). Crop the image and save. The 300 dpi resolution will be preserved in your newly cropped TIFF.
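
If you want to check that a finished image really is large enough for print, the arithmetic is simply the printed size in inches multiplied by the resolution. The short Python sketch below illustrates this; the figure dimensions are made-up examples rather than any journal's requirements, and the function name is hypothetical.

```python
def pixels_for_print(width_in, height_in, dpi=300):
    """Pixel dimensions needed to print at the given size and resolution."""
    return round(width_in * dpi), round(height_in * dpi)


# Hypothetical single-column figure, 3.5 x 2.5 inches at 300 dpi.
print(pixels_for_print(3.5, 2.5))
# The same figure at 220 dpi needs fewer pixels.
print(pixels_for_print(3.5, 2.5, dpi=220))
```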

07 May 2011

An ideal speech recording setup

Having moved to a new lab, I was faced with a few decisions re: setting up a new speech recording space. This article catalogues each piece of equipment and why it was chosen.

Equipment: Shure SM10A headset cardioid microphone, Roland UA-25EX USB audio interface, laptop computer, recording software (Windows: Cool Edit, OSX: SoundStudio).


I should point out that these decisions reflect my personal choice. My recommendations are based on the following assumptions. The average participant in a speech recording experiment:

  • has a poor understanding of what they are doing with their voice
  • does not know how sensitive a microphone is
  • tends to speak either too softly or too loudly
  • starts doing strange things when you give them too many instructions
  • has to be tricked to produce speech in the way that the experimenter desires
  • will get tired as the session goes on

Having spelled out these assumptions, my aim in setting up a speech recording space is to get the participant in and out with as little fuss as possible, get recordings that are clean and unaltered, and minimise the chances of a recording going bad due to equipment or setup.

Why a headset mic?
A headset mic allows the talker the freedom to move their head without changing the distance between the microphone and their mouth. Depending on the model, the mic can be positioned as close as possible to the talker's mouth so that very subtle phenomena, such as prevoicing, may be captured. The downside is that other noises will also be captured, such as spit bubbles, tongue smacks, the participant scratching their nose and so forth. This can be somewhat alleviated by selecting a headset mic with a cardioid pattern of sensitivity and a restricted receptive field. The Shure SM10A fits the bill nicely, making it an ideal mic for recording both in the soundbooth and in the field. It is also relatively inexpensive at around $150.

Why an external audio interface?

In this case, the decision was easy: we needed a way to interface the microphone, which has an XLR connector, with the computer, which does not. But, apart from this practical issue, there is another important reason to use an external audio interface or soundcard, and that is to improve quality and reduce interference.

The internal soundcard that is connected to the computer's motherboard is surrounded by many noisy and busy circuits that are working all the time. By using an external sound card, the analogue-to-digital conversion is performed outside of the computer, and what is fed into the computer is a series of 0s and 1s, meaning that the signal cannot degrade. In addition, external audio interfaces are designed for this sole purpose, their drivers are constantly being improved (so upgrade frequently), and this all results in getting the best-sounding recordings into your computer.


So, which audio interface to use then? Well, for speech recording, it does not really matter whether the interface has USB 1.1, USB 2.0, or Firewire connections. You will be fine with any of the above. What does matter is the quality of the driver (the software that tells the sound card what to do and when to do it). In my particular case, I chose the Roland UA-25EX, which is one of the most popular (if not the most popular) USB audio interfaces of all time. I chose it because I have used it extensively in the past, and have been very pleased with the results. In addition, it is inexpensive (around US$200) and works faultlessly on both Windows and OSX. Note that this particular model has now been discontinued and replaced by the Roland Quad Capture.

What software to use?

When it comes to recording software, I just want it to stay out of my way and create sound files that are as untouched as possible—as if the speech is going straight from the microphone onto the hard disk. No effects, no plugins, no nothing. If I choose to manipulate the recordings in any way, I will do so later, by making conscious choices about what is manipulated, and all manipulations will be documented in detail.

These are my criteria for speech recording software:

  • crystal clear audio recording
  • real-time monitoring of the input level
  • a clipping indicator (usually a red light of some sort)
  • can output to wav and aiff formats
  • can record for two hours without becoming sluggish or slow
  • allows me to copy and paste large chunks of audio into new files without a long delay

For simple voice recording that stays true to the original source, without using any effects or plugins, you cannot beat Cool Edit Pro. Now discontinued, Cool Edit Pro was the king of audio editors in the early 2000s. Its makers were bought out by Adobe, and Cool Edit evolved into Audition, which is both too expensive and requires far too many resources for our purposes, namely simple voice recordings.

---------------------------------

Begin rant
---------------------------------
Adobe Audition now occupies the space between audio editor and studio level sound production, making it a poor man's Pro Tools or a fat version of Cool Edit. I am not a fan. To address this issue of bloat, Adobe released a simpler program called SoundBooth, which I initially thought to be an updated version of Cool Edit, but after having tried the demo, all I can say is that it is a poor piece of software that fails to differentiate itself from Audition in any meaningful way, and I cannot imagine who would want to use it.
---------------------------------
End rant
---------------------------------

Having moved to a MacBook Pro, I sought to find an OSX app that had the function and feel of Cool Edit. SoundStudio 4 fits the bill nicely. Priced at a reasonable $50, SoundStudio has a simple interface that is both intuitive and efficient. For OSX recording, I recommend it.