How do you find the utterance of a single phrase in 4+ hours’ worth of video? I gave up, as trying to find that precise moment in the broadcast was a classic needle-in-a-haystack problem. What if I could just search for utterances of the phrase “camera man tackle” and leap to that point in the recording? That’d be cool.
Well, it’s not quite the Super Bowl, but you can do exactly that as of today with videos from the U.S. Department of Energy (DOE). Scientific videos highlighting the most interesting R&D sponsored by DOE are now searchable thanks to a Microsoft Research project known as MAVIS (Microsoft Research Audio Video Indexing System). A search for specific words returns direct links to the precise moment each word was uttered in a video. You can try out the DOE’s ScienceCinema today to see how this works against 1,000 hours of content.
There is some interesting tech at play in the background here. Rather than using phonetic indexing, which matches sequences of speech sounds, MAVIS uses large-vocabulary continuous speech recognition (LVCSR), which transcribes speech into words, together with automatic vocabulary adaptation and special indexing techniques to improve the search.
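To make the idea of word-level search concrete, here is a minimal sketch of how a time-stamped transcript can be indexed so that a phrase query jumps straight to the moment it was spoken. This is not how MAVIS itself is implemented. The transcript data, the `build_index` and `find_phrase` names, and the assumption that the recognizer emits one (word, start time) pair per word are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical recognizer output: (word, start_time_in_seconds) pairs.
# A real system would produce these from the audio track; here they are made up.
transcript = [
    ("the", 12.0), ("camera", 12.4), ("man", 12.9), ("tackle", 13.3),
    ("was", 14.0), ("a", 14.2), ("surprise", 14.5),
    ("camera", 95.0), ("crew", 95.6),
]


def build_index(words):
    """Map each lowercased word to the list of positions where it occurs."""
    index = defaultdict(list)
    for position, (word, _start) in enumerate(words):
        index[word.lower()].append(position)
    return index


def find_phrase(words, index, phrase):
    """Return the start time of every exact occurrence of `phrase`."""
    terms = phrase.lower().split()
    if not terms:
        return []
    hits = []
    # Only positions where the first term occurs can start a match.
    for pos in index.get(terms[0], []):
        window = words[pos:pos + len(terms)]
        if [w.lower() for w, _ in window] == terms:
            hits.append(words[pos][1])  # start time of the first word
    return hits


index = build_index(transcript)
print(find_phrase(transcript, index, "camera man tackle"))  # [12.4]
```

Each returned value is an offset in seconds that a player could seek to. The sketch works from a single best-guess transcript, so a misrecognized word simply cannot be found; that limitation is one reason a production system would need more elaborate indexing than this.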
"Microsoft Research’s Video Search Hits the DOE"
Filed on February 9, 2011