From Facebook Code:
Text is a prevalent form of communication on Facebook. Understanding the various ways text is used on Facebook can help us improve people’s experiences with our products, whether we’re surfacing more of the content that people want to see or filtering out undesirable content like spam.
With this goal in mind, we built DeepText, a deep learning-based text understanding engine that can understand with near-human accuracy the textual content of several thousands posts per second, spanning more than 20 languages.
In traditional NLP [natural language processing] approaches, words are converted into a format that a computer algorithm can learn. The word “brother” might be assigned an integer ID such as 4598, while the word “bro” becomes another integer, like 986665. This representation requires each word to be seen with exact spellings in the training data to be understood.
With deep learning, we can instead use “word embeddings,” a mathematical concept that preserves the semantic relationship among words. So, when calculated properly, we can see that the word embeddings of “brother” and “bro” are close in space. This type of representation allows us to capture the deeper semantic meaning of words.
Often people post images or videos and also describe them using some related text. In many of those cases, understanding intent requires understanding both textual and visual content together. As an example, a friend may post a photo of his or her new baby with the text “Day 25.” The combination of the image and text makes it clear that the intent here is to share family news. We are working with Facebook’s visual content understanding teams to build new deep learning architectures that learn intent jointly from textual and visual inputs.
Read the Complete Post, View Demo Video
Hat Tip: SocialTimes/Adweek