The goal and vision of the Semantic Web is to create a Web of connected and interlinked data (items) which can be shared and reused by all. Sharing and opening up “raw data” is great, but the Semantic Web isn’t just about sharing data. To create a Web of data, one needs links between data. In 2006, Sir Tim Berners-Lee introduced the notion of linked data, outlining best practices for creating and sharing data on the Web. To encourage people and governments to share data, he recently developed a five-star rating system.
The highest rating is for data that links to other people’s data to provide context. While the Semantic Web has been growing steadily, there is a lot of data that is still in raw format. A study by Google researchers shows that there are 154 million tables with high quality relational data on the World Wide Web. The US government, along with 7 other nations, has started sharing data publicly. Not all of this data is RDF or conforms to the best practices for publishing and sharing linked data.
Here at the Ebiquity Research Lab, we have been focusing on converting data in tables and spreadsheets into RDF. Our focus is not on generating just any RDF, but on generating high quality linked data (or, as Berners-Lee now calls it, “5 star data”). Our goal is to build a completely automated framework for interpreting tables and generating linked data from them.
As part of our preliminary research, we have already developed a baseline framework that does three things: it links table column headers to classes from ontologies in the linked data cloud datasets, links table cells to entities in the linked data cloud, and identifies relations between table columns and maps them to properties in the linked data cloud. You can read papers related to our preliminary research at [1]. We will use this blog as a medium to publish updates in our pursuit of creating “5-star” data for the Semantic Web.
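To make those three kinds of links concrete, here is a minimal Python sketch of what the output could look like for a tiny table. It is not our framework: the mappings are written by hand rather than inferred, the table, the function names and the choice of DBpedia URIs are all assumptions made for illustration, and the result is printed as N-Triples.

```python
# Toy table: a header row plus data rows (illustrative data only).
header = ["City", "Country"]
rows = [["Baltimore", "United States"], ["Paris", "France"]]

# Namespaces used in this sketch (DBpedia is an assumed target dataset).
DBO = "http://dbpedia.org/ontology/"
DBR = "http://dbpedia.org/resource/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical, hand-written mappings standing in for what an automated
# system would have to infer from the table.
column_class = {"City": DBO + "City", "Country": DBO + "Country"}
cell_entity = {
    "Baltimore": DBR + "Baltimore",
    "United States": DBR + "United_States",
    "Paris": DBR + "Paris",
    "France": DBR + "France",
}
column_relation = {("City", "Country"): DBO + "country"}


def table_to_triples(header, rows):
    """Turn each row into (subject, predicate, object) URI triples."""
    triples = []
    for row in rows:
        linked = {}  # column name -> entity URI for this row
        for col, cell in zip(header, row):
            uri = cell_entity.get(cell)
            if uri is None:
                continue  # cell could not be linked; skip it
            linked[col] = uri
            if col in column_class:
                # The column header's class becomes the entity's type.
                triples.append((uri, RDF_TYPE, column_class[col]))
        # A relation between two columns becomes a property between
        # the entities found in those columns.
        for (subj_col, obj_col), prop in column_relation.items():
            if subj_col in linked and obj_col in linked:
                triples.append((linked[subj_col], prop, linked[obj_col]))
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples in N-Triples syntax."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = table_to_triples(header, rows)
print(to_ntriples(triples))
```

Each row yields two type statements and one relation statement, so the two-row table above produces six triples. The hard part, and the subject of our research, is filling in those three mapping dictionaries automatically.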
If you are a data publisher, go grab some Linked Data star badges at [2]. You can show your support for the open data movement by getting t-shirts, mugs and bumper stickers from [3]! (All profits go to the W3C.)
Happy Holidays! Let 2011 be yet another step forward in the open data movement!
[1] – http://ebiquity.umbc.edu/person/html/Varish/Mulwad/?pub=on#pub