Data Visualization: “How to Turn 175 Years of Words in Scientific American into an Image” + Interactive Resource
From Scientific American:
Summarizing the history of a 175-year-old magazine—that’s 5,107 editions with 199,694 pages containing 110,292,327 words!—into a series of graphics was a daunting assignment. When the hard drive with 64 gigabytes of .pdf files arrived at my home in Germany, I was curious to dig in but also a bit scared: as a data-visualization consultant with a background in cognitive science, I am well aware that the nuance of language and its semantic contents can only be approximated with computational methods.
[Clip]
A central question in any data-science project is how wide a net one casts on the data set. If the net is too coarse, all the interesting little fish might escape. Yet if it is too fine, one can end up with a lot of debris, and too much detail can obscure the big picture. Can we find a simple but interesting and truthful way to distill a wealth of data into a digestible form? The editors and I explored many concept ideas: looking at sentence lengths, the first occurrences of specific words, changes in interpunctuation styles (would there be a rise of question marks?), and mentions of persons and places. Would any of these approaches be supported by the available data?
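To make those candidate measures concrete, here is a minimal Python sketch of how sentence length, question-mark usage, and first word occurrences might be computed per year. It is an illustration only, not the method used for the Scientific American graphics. The function name `summarize_by_year`, the `texts_by_year` input (a mapping from year to already-extracted plain text), and the naive regex-based sentence and word splitting are all assumptions made for this example.

```python
import re


def summarize_by_year(texts_by_year, words_of_interest):
    """Hypothetical helper: per-year sentence stats plus first year each word appears.

    texts_by_year: dict mapping year (int) -> plain text for that year (assumed input).
    words_of_interest: iterable of lowercase words to track.
    """
    stats = {}
    first_seen = {}
    for year in sorted(texts_by_year):
        text = texts_by_year[year]
        # Naive sentence split: break after ., ! or ? followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
        # Naive tokenization: runs of lowercase letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        questions = sum(1 for s in sentences if s.endswith("?"))
        count = len(sentences)
        stats[year] = {
            "mean_sentence_length": len(tokens) / count if count else 0.0,
            "question_share": questions / count if count else 0.0,
        }
        # Record the earliest year in which each tracked word shows up.
        vocabulary = set(tokens)
        for word in words_of_interest:
            if word in vocabulary and word not in first_seen:
                first_seen[word] = year
    return stats, first_seen


if __name__ == "__main__":
    sample = {
        1845: "The engine runs. Steam is power!",
        1950: "Is the computer fast? The computer is new.",
    }
    stats, first_seen = summarize_by_year(sample, ["steam", "computer", "laser"])
    print(stats)
    print(first_seen)
```

A real corpus of scanned .pdf files would also need text extraction and cleanup before any of these counts could be trusted, which is part of the "wide net versus fine net" trade-off the excerpt describes.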
Learn More, Read the Complete Article
Direct to Interactive Data Visualization
Search a 4,000-word database to see how language in the magazine evolved over time.
About Gary Price
Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C., metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.