New Beta Release Allows Users to Keyword Search Some Material Found in The Wayback Machine
UPDATE October 26, 2016: It has been quite a week (so far) at the Internet Archive for new and improved search options.
On Monday, The Wayback Machine launched a keyword search beta. Our post about this new option is below this update. We also shared news of new and enhanced search options from The Open Book Project, an Internet Archive initiative.
Today brings news of MORE new search capabilities now available when searching the Internet Archive, including using facets to focus search results and full text search (beta) of books. Details here.
Some VERY exciting news! Something we’ve all wanted for a long time.
The Internet Archive has just launched (beta release) the ability to keyword search a limited amount of material found in The Wayback Machine.
At this point you CANNOT keyword search specific words/phrases on specific pages.
This topic is discussed in a set of FAQs available here. We hope complete keyword search comes soon. Today’s launch is a terrific start to making Wayback even more useful.
So, what is available today? What can you search?
1. Keyword search a limited amount of Wayback Machine content: the homepages of more than 350 million sites. FYI, the complete Wayback Machine contains over 510 billion pages.
2. Keyword search using word(s) that describe a site. For example, “Toronto Government” or websites related to “air traffic control.”
3. You can limit your search to a specific domain or site by using the site: filter. This filter can be combined with keywords, e.g., to find pages from MIT related to economics.
4. Results appear as you type. Impressive, with no hiccups or stutter when I ran several searches.
5. Clicking on any result will take you to a traditional Wayback Machine results page with links to archived copies of the page, PDF, etc.
6. The beta index is multilingual.
7. Cool! You can search the index using Unicode characters!
Much more in the complete blog post, FAQs, and this discussion about how the Internet Archive defines web pages, web sites, and web captures.
Direct to Wayback Machine’s NEW Keyword Search Beta: web-beta.archive.org
Direct to Wayback Machine (Complete Index, Not Keyword Searchable): web.archive.org
Note: Archived Pages Made Available in Archive-It Public Collections Have Always Been Keyword Searchable
Note: How to Instantly Archive Web Pages and PDFs Using Wayback
About Gary Price
Gary Price (email@example.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne St. University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com. Gary is also the co-founder of infoDJ, an innovation research consultancy supporting corporate product and business model teams with just-in-time fact and insight finding.