With the scheduled April 1, 2022, release of 1950 Census records a little more than three months away, the National Archives is completing efforts to digitize those records and using technology to make them more accessible than ever.
[Clip]
The new website will include a name search function powered by an Artificial Intelligence/Machine Learning (AI/ML) and Optical Character Recognition (OCR) technology tool. This is important for genealogists and other researchers who rely on census records for new information about the nation’s past.
“The OCR being used to transcribe the handwritten names from the census rolls is about as good as the human eye,” said Project Management Director Rodney Payne. “Some of the pages are legible, and others are difficult to decipher. So, the National Archives developed a transcription tool to enable users to submit name updates. This will allow other users to find specific names more easily, and it provides an opportunity for the public to help the agency share these records with the world.”
National Archives officials are encouraging interested members of the public to use the transcription tool and help the agency make the records as accurate as possible.
[Clip]
The National Archives is also working to provide bulk download access of the full 1950 Census dataset on launch day. This will be of interest to digital humanists, web developers, social scientists, and anyone wanting to explore aggregations of the records. Other organizations and companies will be able to use this functionality to provide 1950 Census data on their own websites.
When made available on the Amazon Web Services Registry of Open Data, the 1950 Census dataset—over 165 terabytes of data—will include the metadata index, the population schedules, the enumeration district maps, and the enumeration district descriptions for the 1950 Census records. This is approximately 10 times the size of the 1940 Census dataset.
Included in the dataset are approximately:
6.5 million digital TIFF images and corresponding JPEG derivative images of the microfilmed “1950 Census of Population and Housing” forms for U.S. states and territories
33,215 TIFF images and corresponding JPEG derivative images of the original paper “1950 Census of Population and Housing: Indian Reservation Schedule” forms
9,600 digitized images of the 1950 Census Enumeration District Maps, which are annotated maps of counties, cities, and other minor civil divisions that show enumeration districts, census tracts, and related boundaries and numbers used for each census
63,000 digitized images of the 1950 Census Enumeration District Descriptions, which are written descriptions of geographic areas included within enumeration districts
232,000 1950 Census Enumeration District Descriptions, which were produced by generating OCR output of the Enumeration District Description images. More than 25 NARA staff reviewed and cleaned up the OCR output.
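For readers weighing a bulk download, the figures above can be turned into rough planning numbers. The sketch below is only illustrative: it assumes decimal terabytes (10^12 bytes) and a steady link speed chosen by the reader, neither of which comes from the National Archives, and the function names are hypothetical.

```python
# Figures quoted in the article.
TOTAL_TERABYTES = 165   # "over 165 terabytes of data"
RATIO_TO_1940 = 10      # "approximately 10 times the size of the 1940 Census dataset"


def implied_1940_size_tb(total_tb=TOTAL_TERABYTES, ratio=RATIO_TO_1940):
    """Size of the 1940 dataset implied by the article's ratio, in terabytes."""
    return total_tb / ratio


def transfer_days(total_tb, megabits_per_second):
    """Days needed to move total_tb at a constant rate.

    Assumes decimal units (1 TB = 10**12 bytes, 1 Mbit = 10**6 bits)
    and ignores protocol overhead, retries, and throttling.
    """
    total_bits = total_tb * 10**12 * 8
    seconds = total_bits / (megabits_per_second * 10**6)
    return seconds / 86_400


if __name__ == "__main__":
    print(f"Implied 1940 dataset size: {implied_1940_size_tb():.1f} TB")
    # 1000 Mbit/s is an assumed link speed, used only for illustration.
    print(f"Days at 1 Gbit/s: {transfer_days(TOTAL_TERABYTES, 1000):.1f}")
```

Because the published figures are approximate ("over 165 terabytes," "approximately 10 times"), the results should be read as order-of-magnitude estimates rather than exact values.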
Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington, D.C., metro area.
He earned his MLIS degree from Wayne State University in Detroit.
Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com. Gary is also the co-founder of infoDJ, an innovation research consultancy supporting corporate product and business model teams with just-in-time fact and insight finding.