The Bigger Picture: Visual Archives and the Smithsonian
Posts tagged with: Web/Tech
In just a handful of decades, our society has gone from hearing about the impending miracles of the digital age to daily lives permeated with digital culture. As a result, digital objects have become part of the Smithsonian’s historical record with its digital archives managed and preserved by the Smithsonian Institution Archives (SIA). Rarely do born digital holdings arrive carefully set to the side with documentation about what is on the storage media and with a backup or copy. At the Archives today, one out of three accessions will contain born digital material, most commonly found mixed in with the paper files.
Similarly, other archives at the Institution have been steadily acquiring born digital holdings over the past several decades. Four years ago, the Smithsonian Institution Archives and archives within the National Museum of Natural History (National Anthropological Archives, Human Studies Film Archive), the National Air and Space Museum, the Archives Center at the National Museum of American History, the Archives of American Art, and the National Museum of African American History and Culture gathered to frame out a collaborative survey of their born digital holdings. Key goals of this effort were to uncover hidden holdings, establish physical and intellectual control of born digital material, and perform a baseline preservation assessment, thereby strengthening the collections care provided. An integral part of the survey’s design is its shared methodology and metrics, which can then serve as a foundation for future joint preservation initiatives and stewardship planning.
Supported by a first grant in 2012, the survey work focused initially on building an inventory of removable storage media present in each archive while completing questionnaires that evaluated the preparedness of the archives to manage these types of collections. A second grant was received in 2014 to complete the survey work, perform risk analysis at the individual file level, and provide essential interventions to stabilize these fragile materials. Completed in April 2015, the resulting qualitative and quantitative insights are being incorporated into the collections stewardship planning of the participating archives and museums.
Leveraging familiar waters
Established eleven years ago, the Archives’ Electronic Records Program (ERP) conducted its first born digital holdings survey in 2004-2005. As a result, changes were made to the acquisition, processing, and preservation workflows to achieve best practices for holdings that can vary dramatically in format, age, and quantity. What started initially as documents, spreadsheets, and simple databases from the late 1990s has now grown to include images, audio, video, mobile apps, websites and social media, construction drawings, GIS data, email accounts, scientific data sets, and even custom-built software programs, with an estimated half a terabyte of new born digital holdings acquired each year.
Electronic Records Archivist Lynda Schmitz Fuhrig and ERP volunteer Peter Finkel assisted regularly throughout the survey. Along with the shared workflows and software tools, they continue to serve as mentors and a common resource for the survey’s participating archives.
In many ways, the survey implemented the principles laid out in Ricky Erway’s white paper, "You've Got to Walk Before You Can Run".
Determining levels of risk
Preservation risk for content on media that could be read was determined on the basis of format and age, creating a simple mechanism to rank individual files:
- Severe (1) indicated files older than ten years in formats that the participating archive was unable to access.
- High (2) indicated files younger than ten years in formats that the participating archive was unable to access.
- Medium (3) indicated files older than ten years in formats that the participating archive was able to access.
- Low (4) indicated files younger than ten years in formats that the participating archive was able to access.
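The ranking above can be sketched in a few lines of Python. This is only an illustration of the two-factor rule, not the survey's actual tooling: the `FileRecord` type, its field names, and the function names are hypothetical, and treating a file that is exactly ten years old as "younger" is an assumption, since the scale does not say which side that boundary falls on.

```python
from collections import Counter
from dataclasses import dataclass

AGE_THRESHOLD_YEARS = 10  # age cut-off used by the four risk levels
RISK_LABELS = {1: "Severe", 2: "High", 3: "Medium", 4: "Low"}


@dataclass
class FileRecord:
    """Hypothetical record for one surveyed file."""
    name: str
    age_years: float
    format_accessible: bool  # can the archive open this format?


def risk_level(record: FileRecord) -> int:
    """Return 1 (Severe) to 4 (Low) from file age and format accessibility."""
    # Assumption: a file exactly ten years old counts as "younger".
    is_old = record.age_years > AGE_THRESHOLD_YEARS
    if not record.format_accessible:
        return 1 if is_old else 2
    return 3 if is_old else 4


def risk_distribution(records) -> dict:
    """Count how many records fall into each named risk level."""
    counts = Counter(risk_level(r) for r in records)
    return {label: counts.get(level, 0) for level, label in RISK_LABELS.items()}
```

Because accessibility is checked first, an inaccessible format always ranks as a higher risk than an accessible one, regardless of age, which matches the ordering of the four levels.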
Taken as a whole, risk was distributed 14% Severe, 5% High, 43% Medium and 38% Low according to the image below:
Over 470 accessions were inspected, 6,613 pieces of removable media inventoried, and 651,629 born digital files assessed for preservation risks. Concurrently, the assessed files were stabilized. That is to say, they were scanned for viruses, their fixity values determined, backups made into secure storage environments, and metadata generated such that a minimum of bit-level preservation of well-defined holdings is now in effect. Combined with the portion of SIA holdings that had already been assessed and preserved prior to the survey, close to 1.5 million born digital holdings across six archives are now under proper archival control. Placed in the context of the recently published [POWRR framework], the progress made by this survey is striking.
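For readers unfamiliar with fixity values, the sketch below shows one common way to compute and later re-check them. It is an illustration only: the post does not say which checksum algorithm or tools the participating archives used, so SHA-256 and the function names `fixity_value`, `build_manifest`, and `changed_files` are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def fixity_value(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (by relative path) to its fixity value."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): fixity_value(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root, manifest):
    """List files in the stored manifest whose current digest differs or is missing."""
    current = build_manifest(root)
    return sorted(name for name, value in manifest.items() if current.get(name) != value)
```

A manifest built at the time of transfer can be stored alongside the backup copies. Running `changed_files` against it later reports any file whose bits have changed or that has gone missing, which is the essence of bit-level preservation.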
State of born digital holdings preservation among survey participants in 2012:
State of born digital holdings preservation among survey participants at the survey conclusion:
We are excited about the enduring effect this survey will have on the born digital holdings within Smithsonian collections and their stakeholders, as well as on the stewardship community and the born digital advocacy it empowers.
- Erway, Ricky. "You’ve Got to Walk Before You Can Run: First Steps For Managing Born Digital Content Received on Physical Media." OCLC, 2012
- Schumacher, Jaime et al. "From Theory to Action: Good Enough Digital Preservation for Under-Resourced Cultural Heritage Institutions." Northern Illinois University. Captured March 31, 2015
A little under a year ago, we rolled out a new search for our site, powered by the Google Search Appliance. The goal of implementing this new search was to make our content and collections more accessible, to make discovery easier, and to generally improve the user experience.
Work towards that goal didn't end a year ago.
Over the summer of 2014, work by our staff began on making PDFs of the Smithsonian staff newsletter, The Torch, text-searchable. Because these PDFs can be read by our Google Search Appliance's bots, their content can be indexed. This means that our site search will return any Torch issue that matches your search string.
Let's say you're doing some research on Smokey the Bear. So you head over to our website, and search for "Smokey." You'll be presented with a familiar search results screen (one of the results is actually a link to a Torch PDF). But let's say you didn't want to see finding aids or collection items, just the PDFs. Don't worry, you can do that too.
You may have noticed there's a new link at the top of the content type filters, labeled "PDFs." In this example, the site would return only PDFs that match the search string "Smokey," such as an article about whether Smokey should be retired and the original Smokey's obituary.
- You Asked, We Listened: Introducing the Archives New Site Search, The Bigger Picture blog, Smithsonian Institution Archives
- Smithsonian Institution Archives Moves to Drupal 7, The Bigger Picture blog, Smithsonian Institution Archives
- And now you see it - The oldest microscope at the National Museum of American History. [via O Say Can You See? blog, NMAH]
- A previously misfiled fossil leads to the revelation that the prehistoric reptile known as a mosasaur gave birth in the open ocean rather than laying eggs. [via Smithsonian Science News]
- From the Archives of American Art - A look at how the life and work of artist Miné Okubo was affected by being detained in the Topaz internment camp in Utah during World War II. [via Archives of American Art Blog]
- It's official - It was announced this week that President Barack Obama's Presidential Library will be located in Chicago. [via InfoDocket]
- Happy 50th Anniversary to the Smithsonian Environmental Research Center! This Saturday, May 16, they are hosting an open house at their location in Edgewater, Maryland. [via SERC]
- There's an app for that - Yale University released an app that builds on the Map of Life’s integrated global database of everything from bumblebees to trees and tells users which species are likely to be found in their vicinity. [via Yale News]
- Congratulations to the University of Pennsylvania, which recently acquired a copy of Jacques Barbeu-Dubourg’s Petit Code de la raison humaine, a book printed in France by Benjamin Franklin in 1782. One of only four known surviving copies, its acquisition by Penn adds to its collection of more than 330 works printed by Franklin. [via InfoDocket]
- New from the National Museum of African Art - Its first graphic novel, The Song of Lionogo, which is based on a Swahili mythological figure from East Africa and was inspired by the cultural connections between the Arabian Peninsula and the Indian Ocean. [via NMAfA]
- My how far we've come - A new website allows you to see what your digital images would look like rendered on an old Commodore 64 computer. [via PetaPixel]
- In their own words, oral histories at the Archives of American Art shed light on the artist Yasuo Kuniyoshi in the exhibition Artist Teacher Organizer: Yasuo Kuniyoshi in the Archives of American Art. [via Archives of American Art Blog]
- Watch out manuscripts, the next step: Handwritten Text Recognition! [via InfoDocket]
- This week the Cooper Hewitt, Smithsonian Design Museum announced the winners of the 16th Annual Design Awards. [via Cooper Hewitt, Smithsonian Design Museum]
- Four basic steps - Archiving the Arthur C. Clarke Collection. [via AirSpace Blog, NASM]
- Lonnie Bunch, the founding director of the National Museum of African American History and Culture, spoke with Smithsonian Magazine about the Baltimore protests, the role of museums during times of upheaval, and the National Museum of African American History and Culture’s plans for the future. [via Smithsonian Magazine]
"Smithsonian Enters Cyberspace with Information-Packed World-Wide Web Home Page" announced the press release.
Tomorrow marks the 20th anniversary of the launch of the Smithsonian's first "internet 'web' site" on May 8, 1995. The web site included more than 1,500 pages, and overviews of the site were available in Spanish, German, and French. In addition to text and graphics, the pages also included images, audio, and video. Peter House, the National Science Foundation staff member who was detailed to the Smithsonian for the technical development of the website, considered the site to be very large at the time.
The Smithsonian Home Page was designed to allow users to visit the Smithsonian in much the same way as they would in person. Users could begin by viewing general information pages, just as many visitors begin with the information center in the Castle, or they could go directly to the page for an individual museum. Many of the Smithsonian's museums and other facilities established home pages at the same time.
A sneak preview of "Ocean Planet On-Line" was available several weeks ahead of the Smithsonian Home Page. It was demonstrated during the press preview of the "Ocean Planet" exhibition at the National Museum of Natural History, held April 20, 1995. The website was a joint project of the Smithsonian's Environmental Awareness Program and the National Aeronautics and Space Administration (NASA). Gene Feldman, an oceanographer at the Goddard Space Flight Center and creator of the online exhibition, described the site as "one of the most comprehensive and advanced exhibitions available through the Internet via the World Wide Web." He believed it had "capabilities that will amaze even the tekkies." Although hosted on a NASA server, "Ocean Planet On-Line" was considered to be a component of the larger Smithsonian website. It still exists today in close to its original form.
The Smithsonian Home Page included multimedia messages from the Secretary, general information, frequently asked questions (known as "Encyclopedia Smithsonian"), press releases, museum highlights, online exhibitions, virtual museum tours, a staff directory, and the "electronic Shopping Mall." The "Perspectives" section of the site allowed users to search for specific topics across the entire website. Many of these features still exist, in an updated form, in the current Smithsonian website.
In the first 24 hours after the home page was launched, it received approximately 100,000 hits, some from as far away as Japan. By May 17, nine days after the launch, there had been over 600,000 hits.
Secretary Heyman noted that "James Smithson's goal of the 'increase and diffusion of knowledge' has been reborn for a new century."
According to House, "The Smithsonian has been waiting 150 years for the Internet. What we do here is perfect for it."
- Tracking Down the Elusive 'Treasure House of Learning', The Bigger Picture blog, Smithsonian Institution Archives
- Accession 98-094 - Office of the Secretary, Smithsonian Website Records, 1995, Smithsonian Institution Archives
- Accession 01-081 - Smithsonian Institution, Office of Public Affairs, The Torch, 1994-1999, Smithsonian Institution Archives
- Accession 12-545 - National Museum of Natural History, Office of Public Affairs, Press Releases, 1992-2002, Smithsonian Institution Archives
- Historic Smithsonian Home Pages on the Internet Archive Wayback Machine and on Archive-It