Via a JSTOR Tweet
From JSTOR’s Data for Research website:
We are happy to also make a data bundle for the Early Journal Content freely available to those who would like to conduct data mining or other research across the content.
The data bundle for EJC includes full-text OCR and article and title-level metadata. The Read Me file explains the data in more detail. The currently available data bundle includes all the EJC as of September 7, 2011.
Please note that use of the Early Journal Content bundle is subject to the Early Journal Content Specific Terms and Conditions of Use.
To access the data bundle, please create an account using the very brief registration form, or log in if you already have a Data for Research account. We plan to update the bundle on a semi-regular basis and to alert registrants when the bundle has been updated.
The format of the data bundle is a .tar.gz archive containing a readme file explaining the format of the data files, and an XML file for each article in the Early Journal Content bundle.
Once logged in, you can download the Early Journal Content bundle here
The size of the bundle is approx. 2.3 GB compressed, and 7.2 GB inflated.
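Since the bundle is a .tar.gz archive with one XML file per article and inflates to roughly 7.2 GB, it may be practical to read it as a stream rather than unpack it first. The following is a minimal sketch under stated assumptions, not JSTOR's own tooling: the function name `summarize_bundle` and the file name `ejc_bundle.tar.gz` are hypothetical, and because the XML layout is documented only in the bundle's readme, the code merely counts root element names instead of assuming any particular fields.

```python
import tarfile
import xml.etree.ElementTree as ET
from collections import Counter


def summarize_bundle(path):
    """Stream a .tar.gz bundle and tally the root tag of every XML member.

    Returns (root_tags, other_files): a Counter of root element names and
    a list of non-XML member names (for example, the readme file).
    """
    root_tags = Counter()
    other_files = []
    # "r|gz" reads the archive as a forward-only stream, so nothing is
    # written to disk and members are handled one at a time.
    with tarfile.open(path, mode="r|gz") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories and special entries
            if member.name.lower().endswith(".xml"):
                handle = archive.extractfile(member)
                try:
                    root = ET.parse(handle).getroot()
                    root_tags[root.tag] += 1
                except ET.ParseError:
                    # Keep going if a single file is malformed.
                    root_tags["<unparseable>"] += 1
            else:
                other_files.append(member.name)
    return root_tags, other_files


if __name__ == "__main__":
    # Hypothetical local file name; use whatever name the download has.
    tags, others = summarize_bundle("ejc_bundle.tar.gz")
    print(tags.most_common(5))
    print(others)
```

Counting root tags is only a first look; once the readme has been consulted for the actual element names, the same loop could pull out titles, dates, or the OCR text for each article.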