


After Video Capture - Audio to Text


3 replies to this topic

#1 Don Sommers

Don Sommers

    New

  • Basic Members
  • Pip
  • 9 posts
  • Other
  • Canada

Posted 17 May 2009 - 11:46 AM

I routinely capture a LOT of corporate "talking head" or interview based video and then have to splice it together.

Often the video I shoot runs to an hour or more of different people, and I have to condense it down to a 5 minute video.

Over the years, I have either watched all the snippets and hand-written what people have said .. or ... watched the snippets and typed out what people have said .. or ... watched the snippets and then dragged and dropped them into the timeline and edited.

Using the text method allows me to quickly go through and highlight "what I want" and then pull the related snippets into the edit. I can also "search" for phrases and words after I've done this which is also helpful. Lastly, I can fire off the typed text to other individuals involved to get their input as well.
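The "search" step described above can be sketched in a few lines of Python. This is only an illustration, assuming the typed transcript is kept as one line per snippet; the clip names, timecodes, and the `find_phrase` helper are made up for the example and are not from any particular tool.

```python
import re

def find_phrase(transcript_lines, phrase):
    """Return (line_number, text) pairs whose text contains the phrase, ignoring case."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return [
        (number, text)
        for number, text in enumerate(transcript_lines, start=1)
        if pattern.search(text)
    ]

# Hypothetical transcript: clip name, timecode, then what was said.
transcript = [
    "clip01 00:00:12 We launched the product in March.",
    "clip02 00:03:40 Customer feedback has been very positive.",
    "clip03 00:07:05 The product team doubled this year.",
]

for line_no, text in find_phrase(transcript, "product"):
    print(line_no, text)
```

Keeping the clip name and timecode at the start of each line means a hit in the search points straight back to the snippet to pull into the edit.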

Dragging clips into the timeline is great, but it presupposes that the individual in charge of the project is there with you .. and it can eat up a lot of the day. These people are not available for this (in my case) so most times, it's not an option. That's why the text version, rough edits and .wmv or .flv output files for final approval are what I normally (90%) end up using.

I was hoping that when I took Premiere Pro CS4 for a run, its new "extract audio" feature would be revolutionary for me. Instead it has been a great disappointment: it has been nothing short of lame at "deciphering" audio to text.

My Question:
Other than Dragon NaturallySpeaking ... does anyone have any tips, suggestions, ideas or magic potions that would "decently" extract text from audio files? Project by project, typing out over 1 hour's worth of talking is killing my productivity.

P.S. - due to the economy ... assistants (to extract the audio and do the typing) are not an option .. :(

Thanks in advance,...

#2 Adam Garner

Adam Garner
  • Basic Members
  • PipPip
  • 95 posts
  • Other
  • Austin, TX

Posted 17 May 2009 - 01:08 PM

I ran into the same problem. Friends of mine routinely hire college students to transcribe for beer money, basically. Though, on many projects I don't trust anyone but myself with transcription since I have to edit it, you know?

Here's a trick I use:

If you have Pages (Mac), you can import a QuickTime file.

I clean up the raw interview footage and export the audio with no breaks/pauses etc, and import the AIFF file into a Pages doc. This allows me to hit play/stop/rewind within the doc while I type. It means I don't have to click more than once to start typing or playing. When you click play, the AIFF plays... but when you click on the text to transcribe, the AIFF stops. Then, you click play to start up again. It's 1 click. Nice.

Another option you could try (it just occurred to me) is to play the audio in QuickTime and use command+k to bring up the playback controls. Play it at 1/2 speed and I'll bet you could keep up typing. It keeps the pitch the same so it doesn't sound like Andre the Giant.

Edited by adam garner, 17 May 2009 - 01:08 PM.


#3 Don Sommers

Don Sommers

    New

  • Basic Members
  • Pip
  • 9 posts
  • Other
  • Canada

Posted 17 May 2009 - 02:14 PM

"Another option you could try (it just occurred to me) is to play the audio in QuickTime and use command+k to bring up the playback controls. Play it at 1/2 speed and I'll bet you could keep up typing. It keeps the pitch the same so it doesn't sound like Andre the Giant."


That's indeed an interesting idea. I am PC/Linux (but also have access to a Mac ...but don't use it for video editing at the moment).

I do have audio software that can slow down audio files (without changing the pitch). It's still kinda brutal because the time involved will STILL take more than the total capture time of the video(s) ... but it will help with the typing.

As you mentioned about beer money... in the situations that I have, I can't use that route ... :(

Getting this working so that you COULD transcribe to text would be a HUGE plus. Like I mentioned, I was extremely disappointed in the Premiere Pro CS4 audio transcription ... I don't know HOW they got it to work in the demo ;) . My sound is crystal clear and the transcript still looks like something of a cross between English and a foreigner's first attempt at English (no offense to anyone)....

#4 Karel Bata

Karel Bata
  • Basic Members
  • PipPipPipPip
  • 487 posts
  • Director
  • London - a rather posh bit

Posted 18 May 2009 - 06:43 AM

It's strange that Apple had limited voice recognition capability in their OSs in Macs nearly 20 years ago, yet there's not been a hell of a lot of improvement since. But it's coming. Amazing things are now being done with audio - spectral editing is virtually standard, and the latest Melodyne can break down a piano or guitar recording into its constituent elements to allow for tweaking individual notes. Anyway...

You have to 'train' audio to text software. Which shouldn't be surprising since humans can be thrown by a simple change in accent. So my suggestion is:

Set up a separate recording device in your edit suite.
Play back the audio in snippets of, say, 10 seconds.
Each time record yourself saying what was just said.
'Train' your software to recognize your voice.
Put your fresh recording of your voice through.
Voila!
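The "snippets of, say, 10 seconds" step above could also be prepared ahead of time by cutting the exported audio into short files. Here is a minimal sketch using only Python's standard `wave` module, assuming the interview audio has been exported as an uncompressed WAV file; the `split_wav` helper and the file names are made up for the example.

```python
import wave

def split_wav(path, out_prefix, seconds=10):
    """Split a WAV file into consecutive chunks of `seconds` length.

    Returns the list of chunk file paths in playback order.
    The last chunk holds whatever audio is left over.
    """
    outputs = []
    with wave.open(path, "rb") as src:
        params = src.getparams()
        frames_per_chunk = src.getframerate() * seconds
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break
            out_path = f"{out_prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy channel count, sample width and rate from the source;
                # the frame count in the header is corrected when the file closes.
                dst.setparams(params)
                dst.writeframes(frames)
            outputs.append(out_path)
            index += 1
    return outputs
```

For example, `split_wav("interview.wav", "snippet")` (hypothetical file name) would write `snippet_000.wav`, `snippet_001.wav` and so on, which can then be played back one at a time while you re-speak each one.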

Or if you ever want subtitles:

Do as above, but...
Set up to have your fresh recording going on to an additional soundtrack in the video (and enable recording on that audio track only).
Put your playback head where you want to start.
Play back a section. Stop.
Solo the new track. Hit record. Speak that section yourself.
Move playback head to next section and repeat the above.
(This is how I'd do it in ProTools - it may be slightly different in Premiere)

You'll end up with a recording of your voice on one track repeating the audio in time with the video, which will save you effort later.
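Once each re-spoken section has a start time, an end time and its text, turning that into a subtitle file is mostly formatting. The sketch below assumes the SubRip (.srt) format, which is my choice for illustration and not something named above; the segment times, the text, and the `to_srt` helper are made up.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def to_srt(segments):
    """Build SRT text from (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)

# Hypothetical sections, timed to match the video.
segments = [
    (0.0, 10.0, "We launched the product in March."),
    (10.0, 18.5, "Customer feedback has been very positive."),
]
print(to_srt(segments))
```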

My girlfriend uses an online transcription service. She uploads her audio on to their site, and they send her a doc file and a bill. Legibility qualifies for a reduced rate. Eventually someone in China will be doing this for the price of a bacon sandwich... :(

p.s. I use Premiere a lot, but now the audio tools are such rubbish. And there's no way to get a soft wipe. :angry: I love soft wipes. And yes, I know how that sounds...

