With the scheduled April 1, 2022, release of 1950 Census records a little more than three months away, the National Archives is completing efforts to digitize those records and using technology to make them more accessible than ever.
[Clip]
The new website will include a name search function powered by an Artificial Intelligence/Machine Learning (AI/ML) and Optical Character Recognition (OCR) technology tool. This is important for genealogists and other researchers who rely on census records for new information about the nation’s past.
“The OCR being used to transcribe the handwritten names from the census rolls is about as good as the human eye,” said Project Management Director Rodney Payne. “Some of the pages are legible, and others are difficult to decipher. So, the National Archives developed a transcription tool to enable users to submit name updates. This will allow other users to find specific names more easily, and it provides an opportunity for the public to help the agency share these records with the world.”
National Archives officials are encouraging interested members of the public to use the transcription tool and help the agency make the records as accurate as possible.
[Clip]
The National Archives is also working to provide bulk download access of the full 1950 Census dataset on launch day. This will be of interest to digital humanists, web developers, social scientists, and anyone wanting to explore aggregations of the records. Other organizations and companies will be able to use this functionality to provide 1950 Census data on their own websites.
When made available on the Amazon Web Services Registry of Open Data, the 1950 Census dataset—over 165 terabytes of data—will include the metadata index, the population schedules, the enumeration district maps, and the enumeration district descriptions for the 1950 Census records. This is approximately 10 times the size of the 1940 Census dataset.
Included in the dataset are approximately:
6.5 million digital TIFF images and corresponding JPEG derivative images of the microfilmed “1950 Census of Population and Housing” forms for U.S. states and territories
33,215 TIFF images and corresponding JPEG derivative images of the original paper “1950 Census of Population and Housing: Indian Reservation Schedule” forms
9,600 digitized images of the 1950 Census Enumeration District Maps, which are annotated maps of counties, cities, and other minor civil divisions that show enumeration districts, census tracts, and related boundaries and numbers used for each census
63,000 digitized images of the 1950 Census Enumeration District Descriptions, which are written descriptions of geographic areas included within enumeration districts
232,000 1950 Census Enumeration District Descriptions, which were produced by generating OCR output of the Enumeration District Description images. More than 25 NARA staff reviewed and cleaned up the OCR output.
Gary Price (gprice@mediasourceinc.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C. metro area. Before launching INFOdocket, Price and Shirl Kennedy were the founders and senior editors at ResourceShelf and DocuTicker for 10 years. From 2006 to 2009, he was Director of Online Information Services at Ask.com, and he is currently a contributing editor at Search Engine Land.