Confederation of Open Access Repositories (COAR) Announces First Recommendation for Supporting Multilingual and Non-English Content in Repositories
From a COAR Announcement:
While the dominant position of a lingua franca – English – is useful for the widespread dissemination of ideas across the world, it also impedes the use of research results at the local level. And after decades of policies that have directed researchers to publish in English, we are beginning to see a reversal of this trend. The UNESCO Recommendation on Open Science, for example, calls on member states to encourage “multilingualism in the practice of science, in scientific publications and in academic communications”. In China, Europe, and other jurisdictions, policy makers are introducing new measures that encourage researchers to publish in local languages.
In August 2022, COAR launched the COAR Task Force on Supporting Multilingualism and non-English Content in Repositories to develop and promote good practices for repositories in managing multilingual and non-English content. The task force is focusing on identifying good practices for metadata, multilingual keywords, user interfaces, translations, formats, licenses, and indexing that will improve the visibility of multilingual and non-English content across the world.
The COAR Task Force is pleased to announce its initial recommendation towards improving the discovery of repository content in a variety of languages.
All records in the repository should include a tag in the language metadata field that identifies the language of the resource, and a tag that identifies the language of the metadata (even if the resources are in English).
Why? This is a very simple, but extremely powerful recommendation. When the language of the metadata and the language of the resource are correctly attributed, this allows discovery and indexing services to properly process and parse the text. Indexing involves text analysis practices such as stemming, lemmatization (grouping together the inflected forms of a word so they can be analysed as a single item), and the appropriate treatment of stop-words, all of which are language specific. Including the language tag enables information seekers, aggregators, and other discovery services to correctly identify the language of the metadata and full text and treat items accordingly.
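As a rough illustration of why the tag matters for indexing, the minimal sketch below filters stop-words from a record title according to its language tag. The field names (`metadata_language`, `resource_language`), the `index_terms` helper, and the tiny stop-word lists are assumptions made for this example; they are not taken from the COAR guidance or from any particular repository platform.

```python
# Illustrative stop-word lists only; real indexing services use far larger ones.
STOP_WORDS = {
    "en": {"the", "of", "and", "in", "a"},
    "es": {"el", "la", "de", "y", "en", "los"},
}


def index_terms(text, lang):
    """Lowercase the text, split on whitespace, and drop the stop-words
    registered for the given language tag (hypothetical helper)."""
    stop = STOP_WORDS.get(lang, set())  # unknown tag: nothing is filtered
    return [word for word in text.lower().split() if word not in stop]


# Hypothetical record carrying both tags named in the recommendation.
record = {
    "title": "La historia de los repositorios en Chile",
    "metadata_language": "es",   # language of the metadata (the title)
    "resource_language": "es",   # language of the resource itself
}

# With the correct tag, Spanish function words are removed before indexing.
correct = index_terms(record["title"], record["metadata_language"])

# If the title is wrongly treated as English, the Spanish stop-words remain.
wrong = index_terms(record["title"], "en")

print(correct)
print(wrong)
```

With the correct tag only the content-bearing words are kept; treating the same title as English leaves every Spanish function word in the index. Stemming and lemmatization depend on the language tag in the same way.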
See more information and our implementation guidance on the Multilingual and Non-English Content webpage.
About Gary Price
Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com. Gary is also the co-founder of infoDJ, an innovation research consultancy supporting corporate product and business model teams with just-in-time fact and insight finding.