Smithsonian Institution Archives

The Bigger Picture: Visual Archives and the Smithsonian

To Preserve or Not to Preserve: Social Media

by Jennifer Wright on June 13, 2012

The Smithsonian Institution maintains over five hundred social media, social networking, and other “web 2.0” accounts. Many of them are listed at http://www.si.edu/connect.

The Smithsonian Institution currently has over five hundred social media, social networking, and other "web 2.0" accounts (many of them are listed on si.edu’s "Connect" page). These accounts include approximately 143 Facebook accounts, one hundred Twitter accounts, seventy-four blogs, sixty-six Flickr accounts, and sixty-one YouTube accounts. These accounts are used for public outreach and to bring attention to the Smithsonian’s objects, exhibitions, research, programs, projects, events, activities, staff, and educational resources. Each of these accounts focuses on a different audience and specializes in a unique topic.

These vehicles for engaging audiences may be new, but the Smithsonian has always performed this sort of outreach using a variety of means including articles in scholarly journals, Smithsonian-published magazines, newsletters, press releases, teacher packets, email lists, and websites. Since all of these have been considered historically valuable materials, it is only logical that the Smithsonian’s social media accounts should also be preserved.

Preservation of any type of digital record is more complicated than paper preservation and requires more resources over time. Keeping this in mind, we look closely at the records to determine whether we need to preserve them in their entirety. Our goal is to preserve enough data to satisfy the needs of future researchers while minimizing the amount of duplicate, extraneous, and less historically valuable data. We attempt to find this "happy medium" as part of our appraisal process, the process by which we determine what will become part of the Archives’ collections.

When we appraise social media accounts, we look at each account individually because they are all used differently. Some accounts contain mostly original content or other information that is not quickly and easily available elsewhere. Other accounts consist primarily of links to the Smithsonian’s own websites or to news articles or the websites of other organizations. Many social media accounts fall somewhere in the middle. A major factor in how we appraise a social media account is the amount of significant original content it includes. This is more of an art than a science and we attempt to err on the side of caution.

The Archives of American Art Facebook page was the first we preserved in the new “Timeline” format. Screenshots were taken of the account on March 5, 2012, and then saved as a PDF/A document (a file format that is the archival standard for the digital preservation of documents).

Social media accounts with significant original content are captured in full or at least back to the last time they were captured. Social media accounts with little original content are also captured and preserved to document their existence and how they were used, but we will generally only capture a sample of the account, such as two or three months of a Facebook timeline.

There are other ways to minimize the amount of data we are preserving from the social media accounts. Some accounts are structured in such a way that the content and metadata can be exported as a spreadsheet or XML document. Twitter is a good example. The size of these documents is often much smaller than the data collected by crawling the account. We will also preserve a screenshot of the account to document its look. For accounts with more complicated structures, we will often look at the entire account and determine if there are pieces that are not necessary to preserve. Oftentimes photographs, videos, or calendars of events uploaded to the account are also available on a Smithsonian website or in a publication that is also being preserved. In some cases, these duplicate items can be excluded when we capture the account.
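As a rough illustration of why a structured export can be so compact, the sketch below writes a handful of posts to a single XML document in which each piece of metadata has its own tag. It is only a sketch under stated assumptions: the element names (`account`, `post`, `date`, `text`), the function `posts_to_xml`, and the sample posts are all hypothetical and do not reflect the format produced by any particular export tool.

```python
import xml.etree.ElementTree as ET


def posts_to_xml(account, posts):
    """Serialize a list of post dicts into one XML string.

    The element names used here are hypothetical, chosen only
    for illustration.
    """
    root = ET.Element("account", name=account)
    for post in posts:
        item = ET.SubElement(root, "post")
        # Each piece of metadata is stored under its own tag.
        ET.SubElement(item, "date").text = post["date"]
        ET.SubElement(item, "text").text = post["text"]
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data, not taken from a real account.
sample_posts = [
    {"date": "2012-05-17T14:30:00", "text": "Hypothetical first post"},
    {"date": "2012-05-18T09:05:00", "text": "Hypothetical second post"},
]

xml_doc = posts_to_xml("example_account", sample_posts)
print(xml_doc)
```

A document like this stores only the text and metadata of each post, which is why it tends to be far smaller than a crawl that also saves page layout, scripts, and images.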

The National Museum of African Art’s Twitter account, exported as an XML document on May 18, 2012 using a tool called Grabeeter. All of the account’s Tweets, along with associated metadata such as date and time, are separately tagged. Moving forward, we will likely develop a stylesheet to display the information in a more legible format.
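To give a sense of what displaying a tagged export "in a more legible format" might involve, here is a minimal sketch that reads a small XML document and prints one readable line per post. It is a plain Python stand-in for the idea, not an XSLT stylesheet and not the Archives' actual approach. The XML layout, the element names, and the function `render_legible` are hypothetical and are not Grabeeter's real schema.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical export; the real tool's tag names may differ.
SAMPLE_EXPORT = """<account name="example_account">
  <post><date>2012-05-17T14:30:00</date><text>Hypothetical first post</text></post>
  <post><date>2012-05-18T09:05:00</date><text>Hypothetical second post</text></post>
</account>"""


def render_legible(xml_text):
    """Return one human-readable line per post in the export."""
    root = ET.fromstring(xml_text)
    lines = []
    for post in root.findall("post"):
        # Parse the machine-readable timestamp, then reformat it.
        when = datetime.strptime(post.findtext("date"), "%Y-%m-%dT%H:%M:%S")
        lines.append(f"{when:%Y-%m-%d %H:%M}  {post.findtext('text')}")
    return lines


for line in render_legible(SAMPLE_EXPORT):
    print(line)
```

Because the date and the text are tagged separately, the display layer can change later without touching the preserved XML itself.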

Another major concern when appraising social media is privacy. Personal information is everywhere in social media applications and we do our best to minimize the amount of that information that we capture and preserve. We avoid capturing content outside of the scope of the Smithsonian-administered account, meaning that we do not capture the profiles or accounts of the individuals who like, follow, or connect with Smithsonian accounts. That does not mean that we do not capture any personal information. For instance, if you comment on a blog or a Facebook post, the text of your comment as well as your name, profile picture, and any other publicly displayed information will likely be captured. However, if we feel that too much personal information would be disclosed by capturing the account, we do not capture it.

While the popularity of individual social media providers will likely fade over time and become just a blip in web history, they exemplify current and future trends in communication. By capturing and preserving the Smithsonian’s social media presence, we are continuing to document the evolution of the Institution’s methods of sharing information and engaging new audiences.

Related Resources

  • The Smithsonian: Using and Archiving Facebook, The Bigger Picture Blog, Smithsonian Institution Archives
  • Smithsonian Institution Archives' Appraisal Methodology, PDF
Categories: What Gets Saved
Tags: Web/Tech, Archive, Behind the Scenes

Comments (9)

Jessica Smith

Preserving is a good practice. I admire the Smithsonian for continuing to document the evolution of the Institution’s methods of sharing information and engaging new audiences, especially since a large number of social media users are young people. It's a way to reach more minds and spread valuable information to a new generation.

Jessica Smith June 14, 2012 at 3:07 am
Brad

Popularity will fade…just a blip…interesting perspective. The topic of social media preservation is important to preserve and document our time in history. Not just the Smithsonian accounts, but the preservation of social media in a wider arena.
Social media captures communication that during other time periods in history may never have been documented. Some of the social media is conversational, information that would not have been historically preserved. Just think if we had a few tweets from our predecessors as they were hunting a wooly mammoth.
Communication contained in social media is frequently original content that gets duplicated and shared. With the interconnectivity of social media it would seem to be very difficult to capture only the original content.
Social media communication may not be documented anywhere else. There’s not going to be a box of handwritten letters found describing the last flight of the space shuttle.
It’s real. Social media is real time feelings, thoughts and conversations. Not years of editing and formal re-writes. Raw comments, thoughts and feelings. Seems as valuable as any printed newspaper.

Brad June 14, 2012 at 8:47 am
Chris Nosal

I think digital media is really no different than traditional media; it's creative art that appeals to our 5 senses... I like the idea of them saving things that are considered to have cultural and relevant significance.

Chris Nosal June 14, 2012 at 9:39 am
Kathleen Williams

Informative piece, Jennifer and SIA. Thanks for sharing SIA's approach to the appraisal and capture of SI's social media!

Kathleen Williams June 14, 2012 at 10:12 am
Lloydy912

I was amazed by what I learned from your article. I thought I could only preserve paper memories like journals and scrapbooks. Yet it is nice to know that I can also preserve a social media profile.

Lloydy912 June 15, 2012 at 5:51 am
Colin Rosenthal

Generally speaking - and simplifying drastically - I think that harvesting social media sites is essentially not particularly hard. If a browser can render a page then it ought to be possible to automate that process in heritrix or some other headless-browser and save the result to disk. Having an API makes it even easier.

The real issue is how to replay the results to the "end user". The essential problem is that these sites (twitter even more so than facebook) are not "websites" at all. Fundamentally twitter _is_ its API, and there is no uniform way to preserve the experience of its users who are using 100s of different clients to access it. In addition, twitter is highly linked - click on a tweet, a hashtag, a username and some highly complex javascript magic suddenly produces a whole new view of your data. Just try preserving that behaviour :-) It's not too hard to preserve and replay a static view of a single twitter search/listing or facebook page, but to what extent does that actually preserve anything of what's really interesting about these media?

I don't claim to have the right answer but I think any serious attempt to preserve them has to be based on some kind of mixed strategy - perhaps using an API for discovery (using keywords, location tagging etc.), a web-crawler for harvesting, and ideally a tailor-made client for archival-browsing. Now who has the resources to build all that? And to maintain it every time twitter/facebook change their API or when the next big thing (pinterest?) comes along?

Colin Rosenthal June 18, 2012 at 5:32 am
George Lungu

After the Facebook debut on the stock market I believe the answer to this one became a no-brainer: Preserve.

George Lungu June 20, 2012 at 7:07 pm
Kaushik Biswas

Most of us, as individuals, never preserve our social media activity because there is no easy way known to us to do this. But preserving is important, especially for the Smithsonian, when it contains original media. There is also a cost factor involved, because if you need to preserve images & videos, it will require some gigabytes of disk space. I don't know how you do it; maybe another detailed explanation would be a helpful clue for us.

Kaushik Biswas August 6, 2012 at 10:58 pm
Nikhil

It's nice to see your institute being more than an active member in the field of social media. Also we don't get to see too many organisations or institutions revealing their social media count which are preserved for marketing purpose.

Nikhil November 5, 2012 at 8:03 am


Produced by the Smithsonian Institution Archives. For copyright questions, please see the Terms of Use.
