Azure Media Services Introduces Speech Recognition Service for Indexing Video Content

Azure Media Services provides a high-performance, cloud-scalable video encoding, indexing and streaming platform for video producers. It lets producers encode their video so that it is optimized for a wide variety of targets, such as tablets, mobile phones and television stations.

Azure Media Services was used to broadcast all of NBC’s Olympics feeds online, which is a pretty good scalability test given the size of the audience for live event feeds.

Microsoft has announced the availability of a new speech recognition service that indexes all the speech found in a video and stores it as searchable metadata. The indexed text can then be used for closed captioning, transcription and search.

“Lots of banks are interested not only in storing data in the cloud but in how you recall it. You could say ‘tell me when I was talking to this customer about the price of gold’ and it will know where that part of the conversation was. Now we can analyze that data and make it searchable. The Financial Conduct Authority are quite interested in that for compliance; are the Chinese walls inside the bank working? And internal compliance departments are interested too; they’re looking at data mining audio calls and conversations.”

He suggests it will be even more useful if you connect it to other data sources and machine learning systems. “There are already automated trading systems that monitor Twitter,” he points out. “Now you could do monitoring inside the bank for sentiment too.”

The engine behind the indexing service is called MAVIS (Microsoft Research Audio Video Indexing System) and has been in development for several years.

Note that at this time MAVIS only supports indexing of English content.