YouTube has introduced voice-to-text technology to enhance its search function and to provide rich metadata for select videos. The technology makes video titles less important for search, now that a keyword can be found anywhere in a video's spoken audio rather than only in its title.
It works by using automated spiders that scan the audio track of each video and transcribe the spoken words into text. That text is then attached to the video as rich metadata, which can be searched like any other text.
According to Beet.tv, the technology has been applied to campaign videos from McCain and Obama, and now allows users to search for topics such as “Iraq” or “gas prices” and be taken to a video from each candidate where the search term occurs. It will even jump to the exact point in the video where the keyword is spoken. It’s a perfect situation to showcase the new technology, as campaign videos are extremely popular these days. As an added benefit, you can move the timeline cursor at the bottom of the video to see the text displayed along the way. Pretty cool.
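To make the idea concrete, here is a minimal sketch of how timestamped transcript text could be indexed and searched so that a keyword leads straight to the moment it is spoken. This is not YouTube's implementation, which hasn't been published. The video IDs, the transcript lines, and the function names (`build_index`, `search`) are all made up for illustration.

```python
import re
from collections import defaultdict

# Hypothetical transcripts: each video maps to (start_seconds, spoken_text)
# segments. The IDs and sentences are invented sample data.
TRANSCRIPTS = {
    "mccain_speech": [
        (0.0, "Thank you all for coming today"),
        (12.5, "We must bring down gas prices for families"),
        (47.0, "Our troops in Iraq deserve our support"),
    ],
    "obama_speech": [
        (0.0, "Good evening everyone"),
        (20.0, "The war in Iraq has cost us dearly"),
        (63.5, "High gas prices are hurting working people"),
    ],
}


def normalize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(transcripts):
    """Map each word to the (video_id, start_seconds) segments containing it."""
    index = defaultdict(list)
    for video_id, segments in transcripts.items():
        for start, text in segments:
            for word in set(normalize(text)):
                index[word].append((video_id, start))
    return index


def search(index, query):
    """For each video, return the earliest segment that contains every query word."""
    words = normalize(query)
    if not words:
        return {}
    # Keep only the segments that appear in the hit list of every query word.
    common = set.intersection(*(set(index.get(w, [])) for w in words))
    results = {}
    for video_id, start in sorted(common):
        results.setdefault(video_id, start)  # sorted order: first hit is earliest
    return results


if __name__ == "__main__":
    index = build_index(TRANSCRIPTS)
    for video_id, start in search(index, "gas prices").items():
        print(f"{video_id}: jump to {start} seconds")
```

The returned offset is what a player would seek to. Matching whole segments rather than single words keeps a two-word query like "gas prices" from matching videos where the words are spoken minutes apart.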
A change in the way videos are indexed and searched has been needed for a long time. With the vast number of videos being uploaded these days, and the limited amount of metadata associated with each one, it’s always been a struggle to track down videos you want to see when the title doesn’t happen to include the keyword you searched for. Voice-to-text provides the perfect solution, if the transcription is accurate, that is. It may take a while to perfect the accuracy, but at least they’re on the right track.
Companies like Blinkx, EveryZing and Delve are already providing voice-to-text transcriptions in the form of metadata, so it’s really not a new idea. But since YouTube is the unofficial king of user-generated video content, it’s a welcome improvement.