Smithsonian Institution Archives

The Bigger Picture: Visual Archives and the Smithsonian

Making Sense of Data That’s Linked and Open

by Effie Kapsalis on June 23, 2011

The author’s father working with a Liquid Scintillation Spectrometer, which apparently measured chemical compounds, University of Athens, c. 1950s, courtesy of Effie Kapsalis.

If you are a regular reader, or someone who works for a museum, library, or archive, you understand intimately how difficult it is to manage big collections. Even if you’re not in this world, you know how hard it is to manage family photographs, a collection of email love letters, or the folder of old college papers tucked in the bottom of your closet. Multiply this by, oh say, six thousand people (the current number of employees at the Smithsonian, which was founded in 1846), and you’ll get the level of complexity we’re dealing with at the Smithsonian Institution Archives.

Something I think about often (and also happen to be responsible for at the Archives) is making these big collections more accessible and engaging online. Because of that, I was happy to be accepted to attend the recent Linked Open Data in Libraries, Archives, and Museums Summit to get a better grasp on how we can share our information and resources not only on our websites, but with other cultural heritage institutions.

I had a vague notion of what Linked Open Data meant, mostly in that it relates to this thing called the semantic web, or the concept of having a web of linked data that can be easily accessed and processed by machines. Still, I was looking for something more tangible, so I went back to a notable TED talk (see bottom of this post) by the semantic web guru, Tim Berners-Lee. In 1989, Berners-Lee defined the basic building blocks for the web: HTML and URLs. He sent the simple and brilliant idea in a memo to his boss at the European Particle Physics Laboratory in Geneva, Switzerland, and was given permission to work on it… which he did, and here we are today.

Recognize this cloud? It’s the “Linking Open Data Cloud Diagram,” by Richard Cyganiak and Anja Jentzsch.

In his fascinating talk, he tells us how hard it was to get people to understand the concept of the World Wide Web in the 1980s. In fact, his first demonstration (a page, with a hyperlink, linked to another page) was not exactly an attention-grabber. But as we increasingly go online to make phone calls, read books, and collaborate, we feel the power of his idea daily.

Now, Berners-Lee wants us to put not just our documents online, but also our data, so it can be accessed, re-purposed, and understood. He explains it much better than I can, but he starts to reveal the possibilities as he defines three rules for putting data on the web:

  1. HTTP names should refer to people, places, events, products, etc., and not just documents.
  2. When people access those HTTP names, they should get important information back in a standardized format so it’s comprehensible and shareable.
  3. The data people get back should also define relationships to other people, events, places, and things, each of which has its own HTTP name.
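
As a rough illustration of the three rules, here is a minimal Python sketch. Everything in it is hypothetical: the example.org addresses, the property names, and the lookup function stand in for real web requests and are not taken from the talk or from any actual dataset.

```python
# Hypothetical linked-data records, keyed by HTTP names (rule 1).
# The example.org addresses and property names are invented for illustration.
DATA = {
    "http://example.org/person/tim-berners-lee": {
        "type": "Person",
        "name": "Tim Berners-Lee",
        "workedAt": "http://example.org/place/geneva-lab",
    },
    "http://example.org/place/geneva-lab": {
        "type": "Place",
        "name": "European Particle Physics Laboratory",
        "locatedIn": "http://example.org/place/geneva",
    },
    "http://example.org/place/geneva": {
        "type": "Place",
        "name": "Geneva, Switzerland",
    },
}


def look_up(http_name):
    """Rule 2: asking for an HTTP name returns data in one predictable shape."""
    return DATA[http_name]


def linked_things(http_name):
    """Rule 3: values that are themselves HTTP names point to other things."""
    record = look_up(http_name)
    return {
        prop: value
        for prop, value in record.items()
        if isinstance(value, str) and value.startswith("http://")
    }


def follow(http_name, prop):
    """Hop from one thing to a related thing by following a link."""
    return look_up(linked_things(http_name)[prop])


if __name__ == "__main__":
    lab = follow("http://example.org/person/tim-berners-lee", "workedAt")
    print(lab["name"])
```

In this toy version a person, a lab, and a city each get their own name, and a program can hop from one to the next without a human reading a page in between.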

Sounds as simple as his notion for URLs and HTML? We can only surmise how little of his suggestions we understand so far.

At LOD-LAM, we were trying to figure out what this all means for cultural heritage organizations. We do have tons of data, often spread across several institutions. However, I found myself wondering what a large aggregation of data would do for our visitors. Sometimes too much is just too much.

There are some initial examples out there to wrap your head around. Check out the Civil War Data 150 project. Several institutions have records on the Civil War, but those records are seldom brought together in one place. This project will aggregate data from various institutions and use that data to define a common language for things like battles, regiments, and officers. Basically, it will take all the work these individual institutions have done and create a standard vocabulary. Then, the project members are going to enlist classrooms to help tag collection objects and records with these names, which will enable these different collections to play nicely together. And since the data is open, other people and organizations will be able to understand it, ingest it, and put it out in different formats: maps, online publications, and other ways we haven’t even imagined.
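
To make the idea of a standard vocabulary a little more concrete, here is a small Python sketch. It is not how the Civil War Data 150 project works; the records, identifiers, and example.org address are all invented, and the only real-world fact it leans on is that the same battle is known both as First Bull Run and as First Manassas.

```python
# Hypothetical records from two institutions (invented for illustration).
# Each uses its own label for the same battle.
ARCHIVE_A = [{"id": "A-1", "battle": "First Bull Run"}]
ARCHIVE_B = [{"id": "B-7", "battle": "First Manassas"}]

# A shared vocabulary maps every local label to one agreed-upon name.
# The HTTP name is a made-up placeholder.
VOCABULARY = {
    "first bull run": "http://example.org/battle/first-bull-run",
    "first manassas": "http://example.org/battle/first-bull-run",
}


def tag(record):
    """Return a copy of the record with the standard name attached."""
    standard = VOCABULARY.get(record["battle"].strip().lower())
    return {**record, "standard_name": standard}


def aggregate(*collections):
    """Group tagged records from all collections by their standard name."""
    grouped = {}
    for collection in collections:
        for record in collection:
            tagged = tag(record)
            grouped.setdefault(tagged["standard_name"], []).append(tagged["id"])
    return grouped


if __name__ == "__main__":
    print(aggregate(ARCHIVE_A, ARCHIVE_B))
```

Once both records carry the same standard name, they land in the same group even though neither institution changed its own label.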

If you think about all the data available in the world, the possibilities are endless. As Berners-Lee summarizes, you don’t even have to be a big player to contribute. It’s about people doing their bit to create a bigger resource (see OpenStreetMap for one example).

Categories: Behind the Scenes
Tags: Web/Tech, Archive

Comments (6)

Jennifer Gal

What fascinates me about this interconnected online researcher's paradise based on LOD relationships, as Ms. Gonzalez describes it in her blog article, is not so much its use as a logistical tool to manage large collections, but its implications for machine learning. The EU project, Linking Open Data 2, is working on algorithms based on machine learning for automatically interlinking and fusing data from the Web. Maybe we can only surmise how little we are understanding of Berners-Lee's suggestions, but what I find so exciting about them is what they may promise for making the Web exponentially smarter, all by itself. Thanks for your fine article.

Jennifer Gal June 27, 2011 at 11:10 pm

Maureen

Excellent post. As a person who does a lot of research online, I am familiar with the issues of aggregated data. When I was doing a series on each state's poet laureate, I found, as expected, there was no one "best" place for information and often uncovered surprisingly good information 5 or more pages into search results. I vetted every link and then created a post for each poet pulling together the best links I found, together with my own commentary about the poet. A number of the state PLs wrote to thank me, to my delight. I've been impressed by some of the work museums are doing to pull together art information, for example. Great stuff!

Maureen June 23, 2011 at 10:01 am

Effie Kapsalis

Maureen, It sounds like you were fulfilling a role many cultural heritage institutions have neither the time nor the resources to accomplish. I also really appreciate when researchers help us see our resources in new ways. And, I completely agree that when our resources are brought together, they are so much richer. We'll keep working at it! Best, Effie

Effie Kapsalis June 24, 2011 at 8:01 am

Gloria Gonzalez

Thanks for such a great post! I'm a Junior Fellow for NDIIPP this summer, and I've been researching and writing about LOD. I've found it to be extremely interesting. From what I can tell, the LOD-LAM summit was extremely productive. I'm so glad that initiatives are being taken to improve the state of the internet. If you're interested, here's a link to what I wrote about LOD for The Signal: http://blogs.loc.gov/digitalpreservation/2011/06/linked-open-data-a-beck...

Gloria Gonzalez June 23, 2011 at 9:57 am

Effie Kapsalis

Gloria - Thank you for getting in touch! Your article raises some important points about a big obstacle to making LOD a reality - legal issues around data ownership. It's going to be a big barrier to getting data sets out there, but luckily there are some brave organizations taking the first steps. I follow the NDIIPP projects and look forward to hearing more! Effie

Effie Kapsalis June 24, 2011 at 7:58 am

Effie Kapsalis

Jennifer, Agreed that machine learning could be incredible. People will be able to write scripts that point out patterns in the data that we couldn't have seen if we looked at these collections in isolation. Throw visualization on top of that, and it gets pretty powerful. See a related post about a visualization tool released by the Library of Congress that allows you to take data sets and put them up as maps, graphs, etc. /2011/06/27/shape-shifting-and-sharing-data/ Effie

Effie Kapsalis June 28, 2011 at 12:09 pm

Produced by the Smithsonian Institution Archives. For copyright questions, please see the Terms of Use.
