Large quantities of data are flowing into archives each day: newspapers and books are being digitised, while video material is being supplied directly in digital format. Search engine technology is therefore growing in importance. All of this digitised material provides a wealth of information for researchers in the humanities and social sciences, but can they also find what they are looking for amongst these so-called ‘big data’? According to Marc Bron, PhD student at the Intelligent Systems Lab Amsterdam (ISLA) at the University of Amsterdam, that depends on various factors. For certain material, researchers know that it is in the archive and which search terms they should use to retrieve it. In the majority of cases, however, researchers come to the archive with a research question and must first explore the content of the archive to find suitable material.
One important difficulty in finding relevant material lies in formulating the search query that is entered into the search engine. The search terms used by researchers can differ from the terminology archivists use to describe the material, even though both mean more or less the same thing. For example, a researcher might enter the term ‘migrant’, whereas an archivist has used the term ‘foreigner’. A second problem arises once material has been found: researchers cannot establish whether they have collected all of the relevant material, or whether other interesting material remains that they are not yet aware of.
Smarter Searching in Archives Using Newly Developed Interface
Filed on September 3, 2012