Temi: audio to text

Cool factor 5/5
Usability 3/5
Value for money 4/5

Machine transcription, or auto-transcription, has long been a bit of a unicorn for anyone who has to deal with long voice recordings, including journalists, board secretaries and lawyers.

Transcription is time-consuming, painstaking and (for me, at least) straight-up painful. Anecdotally, people talk about transcription taking at least three times as long as the recorded conversation, and that's if you're a snappy typist.

There are, of course, professional services that will do transcription, but they cost a pretty penny, starting at about R10 per audio minute for bulk orders.

Attempts to produce tech solutions for this issue have had mixed success. Dragon Dictation software reportedly works quite well, but you must train it to your voice, so it’s no good if you want to transcribe a conversation, meeting or interview.

There are others that offer machine transcription (based on voice recognition) as a cloud service, and I’ve tried several, but none has come closer to the dream of fully automated transcription than Temi.

A Facebook post from journalist Gus Silber first put Temi on my radar a few weeks ago, and I’ve been putting it through its paces since then.

Temi costs just US10c/audio minute, so 100 minutes of audio costs $10, or about R120. Simply upload the files, and Temi will produce a quote based on file minutes. A quick credit card payment later, your files will be in the works, and you’ll get an e-mail when your order is complete.
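To put those numbers side by side, here is a rough back-of-the-envelope sketch in Python. It uses only the rates quoted in this review; the exchange rate of R12 to the dollar is an assumption implied by "$10, or about R120" and will drift, and the function names are made up for illustration rather than being part of any Temi tool.

```python
# Rates quoted in the review. The exchange rate is an assumption implied by
# "$10, or about R120" and will change over time.
TEMI_USD_PER_MIN = 0.10   # Temi: US10c per audio minute
ZAR_PER_USD = 12.0        # assumed rand/dollar rate
PRO_ZAR_PER_MIN = 10.0    # professional bulk rate: about R10 per audio minute


def temi_cost_zar(audio_minutes: float) -> float:
    """Estimated Temi cost in rand for a given number of audio minutes."""
    return audio_minutes * TEMI_USD_PER_MIN * ZAR_PER_USD


def professional_cost_zar(audio_minutes: float) -> float:
    """Estimated professional transcription cost in rand at the bulk rate."""
    return audio_minutes * PRO_ZAR_PER_MIN


if __name__ == "__main__":
    minutes = 100
    print(f"Temi:         R{temi_cost_zar(minutes):.2f}")
    print(f"Professional: R{professional_cost_zar(minutes):.2f}")
```

On these assumptions, 100 minutes of audio comes to roughly R120 with Temi against about R1,000 at the professional bulk rate, before counting the time you spend correcting the machine transcript.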

You can upload multiple files at once, and most of the common audio and video file formats work (including MP3, MP4, WAV, M4A, WMA, MOV and AVI).

Then, in the edit screen (accessible via the dashboard), you can make corrections to the transcript while listening to the audio.

As with all such tools, Temi is only as good as what you put into it. Transcripts of interviews recorded in person, with people who have clear, fairly neutral-accented speaking voices, have been about 80% accurate, though Temi isn't great with names and South Africanisms. Complications such as poor-quality recordings, a poor telephone signal, more than two speakers or strong accents reduce the accuracy.

Temi currently supports recordings in English only.