A New Technical Report From Microsoft: "Toward Topic Search on the Web"
This new technical report was written by Yue Wang, Hongsong Li, Haixun Wang, and Kenny Zhu from Microsoft Research Asia.
From the Abstract:
Traditional web search engines treat queries as sequences of keywords and return web pages that contain those keywords as results. Such a mechanism is effective when the user knows exactly the right words that web pages use to describe the content they are looking for. However, it is less than satisfactory or even downright hopeless if the user asks for a concept or topic that has broader and sometimes ambiguous meanings. This is because keyword-based search engines index web pages by keywords and not by concepts or topics. In fact they do not understand the content of the web pages. In this paper, we present a framework that improves web search experiences through the use of a probabilistic knowledge base. The framework classifies web queries into different patterns according to the concepts and entities in addition to keywords contained in these queries. Then it produces answers by interpreting the queries with the help of the knowledge base. Our preliminary results showed that the new framework is capable of answering various types of topic-like queries with much higher user satisfaction, and is therefore a valuable addition to the traditional web search.
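To make the idea in the abstract more concrete, here is a minimal sketch of classifying a query by the kinds of terms it contains (concepts, entities, plain keywords) and then rewriting a concept into its most probable entities. This is not the authors' method: the abstract does not give the actual query patterns, the interpretation algorithm, or the knowledge base contents. The names TOY_KB, classify_query, and expand_query, along with every probability below, are invented for illustration.

```python
# Hypothetical toy knowledge base: concept -> {entity: P(entity | concept)}.
# All concepts, entities, and probabilities here are made up for illustration.
TOY_KB = {
    "database conference": {"SIGMOD": 0.4, "VLDB": 0.35, "ICDE": 0.25},
    "asian city": {"Tokyo": 0.5, "Beijing": 0.3, "Seoul": 0.2},
}

# Reverse lookup from lowercased entity name to the concept it belongs to.
ENTITY_TO_CONCEPT = {
    entity.lower(): concept
    for concept, members in TOY_KB.items()
    for entity in members
}


def classify_query(query):
    """Return (pattern, concepts, entities, keywords) for a query string."""
    q = query.lower()
    # A concept counts as present if its name appears as a substring.
    concepts = [c for c in TOY_KB if c in q]
    rest = q
    for c in concepts:
        rest = rest.replace(c, " ")
    tokens = rest.split()
    entities = [t for t in tokens if t in ENTITY_TO_CONCEPT]
    keywords = [t for t in tokens if t not in ENTITY_TO_CONCEPT]
    kinds = [
        name
        for name, found in (
            ("concept", concepts),
            ("entity", entities),
            ("keyword", keywords),
        )
        if found
    ]
    return "+".join(kinds), concepts, entities, keywords


def expand_query(query, top_k=2):
    """Rewrite the first concept in the query into its top_k likeliest entities."""
    q = query.lower()
    _, concepts, _, _ = classify_query(q)
    if not concepts:
        # Nothing to interpret: fall back to the original keyword query.
        return [(query, 1.0)]
    concept = concepts[0]
    ranked = sorted(TOY_KB[concept].items(), key=lambda kv: kv[1], reverse=True)
    return [(q.replace(concept, entity), p) for entity, p in ranked[:top_k]]


if __name__ == "__main__":
    print(classify_query("database conference deadlines"))
    print(expand_query("database conference deadlines"))
```

In this sketch, a topic-like query such as "database conference deadlines" is turned into ordinary keyword queries about specific conferences, which a traditional engine could then answer. A query with no known concept is passed through unchanged.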
Probase is an ongoing project that focuses on knowledge acquisition and knowledge serving. Our primary goal is to enable machines to understand human behavior and human communication. We do this by injecting certain general knowledge or certain common sense into computing.
About Gary Price
Gary Price (firstname.lastname@example.org) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009, he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.