August 15, 2018

Enhancing Multilingual Content in Wikipedia: Learn About the WikiBhasha Tool From Microsoft Research

From a Microsoft Research News Article:

Making Wikipedia more multilingual inspired a Microsoft Research India team to develop a tool called WikiBhasha, which was launched Oct. 18. WikiBhasha—“Wiki,” signifying its community-oriented approach; “Bhasha,” a Sanskrit word meaning “language”—features a content-creation platform that combines linguistic services, such as machine translation, with a Wikipedia-friendly content editor. Everyday users in countries around the world, as well as language enthusiasts, can use WikiBhasha to adapt English-language Wikipedia articles for local languages. Along the way, they can create new local content to expand the article they have translated.

WikiBhasha users also can create new articles from scratch. And in time, the tool could help convert articles in languages other than English into local languages.

[Clip]

WikiBhasha also offers a way to sharpen the abilities of current machine translators. That, in fact, was one of the driving forces for the idea behind WikiBhasha. It takes about 4 million sentence pairs, matched between two languages, to develop a machine translator robust enough to create effective translations. In many languages, collecting that much data is a nearly insurmountable task. But if a machine translator can at least start a translation using a smaller data set, then it’s possible for a wiki-style community to build on that and correct the machine translator, literally “teaching” the translator to be more effective.

Read the Complete Article

About Gary Price

Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009 he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.
