Facebook Introduces DeepText, a “Deep Learning-Based Text Understanding Engine”
From Facebook Code:
Text is a prevalent form of communication on Facebook. Understanding the various ways text is used on Facebook can help us improve people’s experiences with our products, whether we’re surfacing more of the content that people want to see or filtering out undesirable content like spam.
With this goal in mind, we built DeepText, a deep learning-based text understanding engine that can understand with near-human accuracy the textual content of several thousand posts per second, spanning more than 20 languages.
In traditional NLP [natural language processing] approaches, words are converted into a format that a computer algorithm can learn. The word “brother” might be assigned an integer ID such as 4598, while the word “bro” becomes another integer, like 986665. This representation requires each word to be seen with exact spellings in the training data to be understood.
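To make the limitation concrete, here is a minimal Python sketch of an integer-ID lookup. It is an illustration only, not Facebook's code: the IDs for "brother" and "bro" come from the passage above, while the "sister" entry, the `UNKNOWN_ID` placeholder, and the `encode` helper are assumptions added for the example.

```python
# Toy vocabulary: each known spelling maps to an arbitrary integer ID.
# The IDs for "brother" and "bro" are the ones quoted in the article;
# "sister" and its ID are made up for illustration.
vocab = {"brother": 4598, "bro": 986665, "sister": 1203}

# Hypothetical placeholder ID for any spelling never seen in training.
UNKNOWN_ID = 0


def encode(tokens):
    """Map each token to its integer ID, or UNKNOWN_ID if it is not in the vocabulary."""
    return [vocab.get(token.lower(), UNKNOWN_ID) for token in tokens]


ids = encode(["brother", "bro", "broo"])
print(ids)  # [4598, 986665, 0]
```

Two things stand out in this sketch. The IDs 4598 and 986665 say nothing about how related "brother" and "bro" are, and the unseen spelling "broo" collapses to the unknown placeholder, so whatever it meant is lost.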
With deep learning, we can instead use “word embeddings,” a mathematical concept that preserves the semantic relationship among words. So, when calculated properly, we can see that the word embeddings of “brother” and “bro” are close in space. This type of representation allows us to capture the deeper semantic meaning of words.
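One common way to measure whether two embeddings are "close in space" is cosine similarity. The Python sketch below uses hand-picked three-dimensional vectors purely as an assumption for illustration. Real embeddings are learned from data and typically have far more dimensions, and the article does not say which distance measure DeepText uses.

```python
import math

# Hypothetical, hand-picked vectors for illustration only.
# Real word embeddings are learned from text, not written by hand.
embeddings = {
    "brother": [0.90, 0.80, 0.10],
    "bro": [0.85, 0.75, 0.20],
    "invoice": [0.10, 0.05, 0.95],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


sim_close = cosine_similarity(embeddings["brother"], embeddings["bro"])
sim_far = cosine_similarity(embeddings["brother"], embeddings["invoice"])

print(f"brother vs. bro:     {sim_close:.3f}")
print(f"brother vs. invoice: {sim_far:.3f}")
```

With these toy vectors, "brother" and "bro" score close to 1, while "brother" and the unrelated word "invoice" score much lower. That is the property the passage describes: related words end up near each other even though their spellings differ.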
Often people post images or videos and also describe them using some related text. In many of those cases, understanding intent requires understanding both textual and visual content together. As an example, a friend may post a photo of his or her new baby with the text “Day 25.” The combination of the image and text makes it clear that the intent here is to share family news. We are working with Facebook’s visual content understanding teams to build new deep learning architectures that learn intent jointly from textual and visual inputs.
About Gary Price
Gary Price (email@example.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009, he was Director of Online Information Services at Ask.com. Gary is also the co-founder of infoDJ, an innovation research consultancy supporting corporate product and business model teams with just-in-time fact and insight finding.