May 17, 2022

JSTOR “Early Journal Content” Articles Now Accessible via Internet Archive

In September 2011, JSTOR announced its Early Journal Content (EJC) program, which makes nearly 500,000 articles from more than 200 journals (about 6% of the content on JSTOR) available for free. All articles were published prior to 1923 in the United States and prior to 1870 elsewhere.

Today, a quick note to point out that 450,000 EJC articles are fully accessible via the Internet Archive. No registration is required and there are multiple tools available to read the material.

Brewster Kahle writes:

All 2 terabytes of the Early Journal Collection are available for bulk harvesting from the Internet Archive. Web search engines have been indexing the full-text contents of these materials already and, so far, people and robots have downloaded the articles over 400,000 times even before it has been announced. A data bundle including OCR text and metadata is also available from JSTOR’s Data for Research service for free downloading.
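For readers who want to script the bulk harvesting Kahle mentions, here is a minimal sketch that builds a request URL for the Internet Archive's advanced search endpoint, asking for a JSON list of the items in one collection. The collection identifier `jstor_ejc` is an assumption (confirm it on the collection page before relying on it), and the function name and page size are only illustrative.

```python
from urllib.parse import urlencode

# Assumption: "jstor_ejc" is the identifier of the EJC collection.
# Check the collection page on the Internet Archive to confirm it.
COLLECTION = "jstor_ejc"
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_listing_url(collection=COLLECTION, rows=100, page=1):
    """Return a URL that asks for item identifiers and titles as JSON."""
    params = [
        ("q", f"collection:{collection}"),  # restrict to one collection
        ("fl[]", "identifier"),             # fields to return
        ("fl[]", "title"),
        ("rows", rows),                     # results per page
        ("page", page),                     # 1-based page number
        ("output", "json"),
    ]
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    # Print the first page URL; fetch it with any HTTP client.
    print(build_listing_url())
```

Fetching that URL and increasing `page` until no more results come back should yield the identifiers needed to download items one at a time. Please be considerate with request rates when harvesting in bulk.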

How to Access Early Journal Content via Internet Archive

Along with the bulk harvesting option that Kahle points out, users can also browse and search the collection online. Here are a few options.

1. Visit the JSTOR Early Journal Content collection page on the Internet Archive.
From there you can browse subcollections.

2. Use this advanced search interface to search the metadata of the entire collection.
You can also run this type of search on individual subcollections. Here’s an example.

3. Use Google. Here’s one of many possible full-text search strategies for limiting results to EJC material.

Begin your search with this string: inurl:jstor intext:"Early Journal" [your keywords]

When you review results you’ll be viewing a text version of the article. In the left column, click the “see other formats” link to view online, download, etc.

4. Another option is to use the JSTOR advanced search interface and limit the search to “only content I can access”. You can also apply this limit after searching with the beta JSTOR interface. Limiting by date (before 1923 for U.S. content, before 1870 for content published elsewhere) can also help.
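If you run the Google strategy from option 3 often, it can be wrapped in a couple of small helpers. This is only a sketch: the function names and the sample keywords are made up for illustration, and the prefix is simply the search string given above.

```python
from urllib.parse import quote_plus

# The EJC-limiting prefix from option 3, using straight double quotes.
EJC_PREFIX = 'inurl:jstor intext:"Early Journal"'


def ejc_google_query(keywords):
    """Put the EJC prefix in front of the user's own keywords."""
    return f"{EJC_PREFIX} {keywords.strip()}"


def ejc_google_url(keywords):
    """Return a Google search URL for the prefixed query."""
    return "https://www.google.com/search?q=" + quote_plus(ejc_google_query(keywords))


if __name__ == "__main__":
    # Hypothetical keywords, used only to show the output shape.
    print(ejc_google_query("glacier survey"))
    print(ejc_google_url("glacier survey"))
```

Pasting the first printed line into the Google search box is equivalent to opening the second printed URL in a browser.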

About Gary Price

Gary Price is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006-2009 he was Director of Online Information Services at, and is currently a contributing editor at Search Engine Land.