From Computer Business Review (CBR):
Enterprises with a large media archive often struggle with the challenge of transforming existing video archives into business value, particularly given the challenges of content discovery at scale: content categorisation is often flawed and manual tagging is expensive, error-prone and scales badly.
Microsoft thinks its upgraded product is the solution – and it’s a poster child for AI.
“Multi-modal topic inferencing” in Microsoft’s Video Indexer tool takes a tripartite approach to automating media categorisation: transcription (spoken words), OCR content (visual text), and facial recognition; operating under an innovative supervised deep learning-based model. It can even recognise moods
Oron Nir, a senior data scientist in Microsoft’s Media AI division said: “[This tool] orchestrates multiple AI models in a building block fashion to infer higher level concepts using robust and independent input signals from different sources.”
The technique is a step-change from Video Indexer’s previous keyword extraction model, which pulls out and categories only according to explicitly mentioned terms.
Read the Complete Article