Semantic Technologies

Overview

According to a PricewaterhouseCoopers report, “Semantic Web technologies could revolutionize enterprise decision making and information sharing”. By connecting more flexible, standardized ways to model and share data with best practices for identifying the meaning (or, at the very least, the source) of descriptive terms, Semantic Web technologies open up new possibilities for developing applications that work across the web or behind your firewall.

In this course, we’ll learn about the building blocks of the Semantic Web, such as the RDF data model, the RDFa syntax that lets you embed machine-readable facts (or “triples”) into web pages, the SPARQL query language, and the Web Ontology Language (OWL) for defining vocabularies and term relationships. We’ll also learn about some of the open source and commercial software that lets you assemble these building blocks into applications that help you get more out of both your own data and the increasing amount of publicly available linked data, and we’ll see some examples of these technologies put into practice.

The classes are taught by faculty members Dr. Andy Seaborne, Leigh Dodds, Dr. Jeni Tennison, Alistair Miles, and Graham Klyne, along with Faculty Board member Bob DuCharme.

Classes for 2012

The Semantic Web: an Overview

Taught by Bob DuCharme.

The Semantic Web is a set of standards and best practices for sharing data and the semantics of that data over the web for use by applications. What are the standards? What are the best practices? What does it mean to share semantics along with data, and how can that make the data more useful? How do applications use data from across the web?

In this class, we’ll look at the high-level answers to these questions, take a tour of the technology and the acronyms, and see how they all fit together before the day’s remaining speakers dig deeper into the practical use of these technologies.

Introduction to Linked Data

Taught by Bob DuCharme.

The infrastructure of the World Wide Web can do more than deliver documents for people to read on their screens: it can also deliver data for applications to use. The principles of Linked Data have laid a foundation that has made it possible for governments, media, and e-commerce retailers to publish data on the web without depending on custom-built APIs. This class will show you how to take advantage of these principles to consume available data and to publish it yourself.

Among other things, we’ll learn about popular sets of linked data that you can use, how to create links between datasets, how to mint good URIs, how to publish data from relational databases, HTTP issues, and how to take gradual steps toward good linked data publishing.

Lunch break, day one

RDF Modeling: Getting started with RDFS, OWL, and SKOS

Taught by Leigh Dodds.

You’re starting to publish Linked Data or assemble another RDF-based application, and you’ve initially taken a pick and mix approach to selecting from existing vocabularies, but what happens when you need to go beyond popular vocabularies like FOAF, Dublin Core, or GoodRelations? How do you actually go about modelling data using RDF technologies? And how do semantic web schema languages differ from, say, XML schema?

This class will provide an overview of RDF modelling, including the use of RDFS and OWL to create custom vocabularies. The class will review how to make use of both technologies and provide guidance for going deeper. It will also look at how vocabularies like SKOS fit into the picture, and how they can be used to help model a particular domain.
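To give a flavour of what a schema language like RDFS adds to plain data, here is a minimal Python sketch, not taken from the class materials. It treats triples as plain tuples and applies one RDFS rule: if a resource has a type, and that type is a subclass of another class, the resource also has the broader type. The `ex:` names and the `infer_types` helper are hypothetical, invented for illustration.

```python
# Triples are (subject, predicate, object) tuples; the ex: names are hypothetical.
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"


def infer_types(triples):
    """Return the input triples plus the rdf:type triples implied by rdfs:subClassOf."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple is produced
        changed = False
        subclass_pairs = [(s, o) for s, p, o in inferred if p == RDFS_SUBCLASS_OF]
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, sup in subclass_pairs:
                if o == sub and (s, RDF_TYPE, sup) not in inferred:
                    inferred.add((s, RDF_TYPE, sup))
                    changed = True
    return inferred


data = {
    ("ex:Vase", RDFS_SUBCLASS_OF, "ex:Pottery"),
    ("ex:Pottery", RDFS_SUBCLASS_OF, "ex:Artefact"),
    ("ex:item42", RDF_TYPE, "ex:Vase"),
}
closure = infer_types(data)
```

The data only says that `ex:item42` is a vase; the two extra type statements come from the vocabulary. Real RDFS and OWL reasoners cover many more rules than this single one.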

Linked Data Patterns

Taught by Leigh Dodds.

Design Patterns are a powerful approach for capturing and sharing knowledge among practitioners. Design patterns capture best practices in an accessible way that can help developers solve problems and learn new techniques faster. Looking at the current web of data it’s possible to identify a number of existing patterns that relate to the publishing and modelling of RDF and Linked Data.

This class will review a number of different design patterns that relate to: identifying resources, modelling data and publishing data to the web. Each of these patterns will address common challenges or questions faced by developers as they adopt Semantic Web technologies. Through discussion of some specific use cases the class will aim to show developers how the various technologies reviewed throughout the course can be used in practice.

End of day one

Introduction to SPARQL and SPARQL Update

Taught by Dr. Andy Seaborne.

SPARQL is the standard W3C query language for semantic web applications. It brings together the features of a number of RDF query languages into one method for extracting information from data represented in RDF, whether small datasets or large.

The next wave of SPARQL standardization is currently underway to add features that are useful for publishing data and also to add mechanisms to update and manage RDF data over the web.

This session will provide a solid grounding in SPARQL. After demonstrating how powerful some very simple SPARQL queries can be, we will take a practical approach to looking at the key features of SPARQL 1.0 and 1.1, and then explore the principles underpinning the SPARQL query language.
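As a rough illustration of the idea behind a simple SPARQL query, the Python sketch below matches triple patterns against a small set of triples and joins the results on shared variables. It is a toy under stated assumptions, not how a SPARQL engine is implemented: the `match` and `select` helpers and the `ex:` resources are hypothetical, and prefixed names are treated as opaque strings.

```python
def match(triples, pattern):
    """Yield variable bindings for one triple pattern; terms starting with '?' are variables."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in one pattern must bind to the same value
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding


def select(triples, patterns):
    """Join several triple patterns on their shared variables."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for solution in solutions:
            # Substitute variables that earlier patterns have already bound
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(triples, bound):
                next_solutions.append({**solution, **binding})
        solutions = next_solutions
    return solutions


data = [
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:name", "Bob"),
]

# Roughly equivalent to:
#   SELECT ?name WHERE { ex:alice foaf:knows ?friend . ?friend foaf:name ?name }
results = select(data, [
    ("ex:alice", "foaf:knows", "?friend"),
    ("?friend", "foaf:name", "?name"),
])
names = [row["?name"] for row in results]
```

Here `names` holds the names of everyone `ex:alice` knows. The shared variable `?friend` is what links the two patterns, which is the core of how even short SPARQL queries traverse a graph.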

Following this, we will introduce the new standard features of SPARQL for update and management of data using web protocols. SPARQL Update is a language for modifying RDF data and SPARQL HTTP Update provides for RESTful update of a collection of RDF graphs.
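The difference between the two update styles can be sketched with a toy in-memory store in Python. This is only an illustration under assumptions: the `GraphStore` class and its method names are hypothetical, the graph and resource names are invented, and the comments linking methods to SPARQL Update operations and HTTP verbs describe the general idea rather than the exact behaviour defined by the specifications.

```python
class GraphStore:
    """Toy store mapping a graph name to a set of (subject, predicate, object) triples."""

    def __init__(self):
        self.graphs = {}

    # Language-style updates: add or remove individual triples,
    # loosely in the spirit of INSERT DATA and DELETE DATA.
    def insert_data(self, graph, triples):
        self.graphs.setdefault(graph, set()).update(triples)

    def delete_data(self, graph, triples):
        self.graphs.get(graph, set()).difference_update(triples)

    # Protocol-style updates: manage a whole graph as one resource,
    # loosely in the spirit of HTTP PUT and DELETE on a graph URI.
    def put(self, graph, triples):
        self.graphs[graph] = set(triples)

    def delete(self, graph):
        self.graphs.pop(graph, None)


store = GraphStore()
store.insert_data("ex:people", {("ex:alice", "foaf:name", "Alice")})
store.insert_data("ex:people", {("ex:bob", "foaf:name", "Bob")})
store.delete_data("ex:people", {("ex:alice", "foaf:name", "Alice")})
after_language_updates = set(store.graphs["ex:people"])

store.put("ex:people", {("ex:carol", "foaf:name", "Carol")})
after_put = set(store.graphs["ex:people"])
store.delete("ex:people")
```

The language-style calls change individual triples inside a graph, while the protocol-style calls replace or remove the graph as a whole.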

Lunch break, day two

RDFa

Taught by Dr. Jeni Tennison.

This class will focus on the use of RDF that is embedded in XML, and especially in HTML web pages, with the W3C’s RDFa standard. We’ll look at how to create RDFa, how to add it to documents manually, and some strategies for automating the process. We’ll also see how to extract RDF triples from these documents, and we’ll tour some of the large-scale web sites that are currently making RDF data available on the public linked data web using RDFa. Along the way, we’ll address these common questions about RDFa: how is it similar to microformats, and how is it different? What kind of data is suitable for exposing to other applications as RDFa? Is it only good for public web pages, or can it be useful behind a firewall? Can RDFS and OWL ontologies play a role in the use of RDFa?
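To make the idea of extracting triples from a web page concrete, here is a deliberately simplified Python sketch using only the standard library. It is not a conformant RDFa processor: the hypothetical `TinyRDFaExtractor` class only understands the `about`, `property`, and `content` attributes, treats the most recent `about` value as the subject without any scoping, and leaves prefixed names such as `dc:title` unexpanded. The sample page and its URL are invented for illustration.

```python
from html.parser import HTMLParser


class TinyRDFaExtractor(HTMLParser):
    """Collect (subject, property, value) triples from a small subset of RDFa."""

    def __init__(self):
        super().__init__()
        self.subject = None
        self.triples = []
        self._pending = None  # (property, tag, text parts) while reading element text

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        prop = attrs.get("property")
        if prop is None or self.subject is None:
            return
        if "content" in attrs:
            # The value is given explicitly in the content attribute
            self.triples.append((self.subject, prop, attrs["content"]))
        else:
            # The value is the element's text, collected until its end tag
            self._pending = (prop, tag, [])

    def handle_data(self, data):
        if self._pending is not None:
            self._pending[2].append(data)

    def handle_endtag(self, tag):
        if self._pending is not None and self._pending[1] == tag:
            prop, _, parts = self._pending
            self.triples.append((self.subject, prop, "".join(parts).strip()))
            self._pending = None


page = """
<div prefix="dc: http://purl.org/dc/terms/" about="http://example.org/book/1">
  <span property="dc:title">An Example Book</span>
  <meta property="dc:creator" content="A. Author">
</div>
"""

extractor = TinyRDFaExtractor()
extractor.feed(page)
triples = extractor.triples
```

The page still renders as ordinary HTML for human readers, while the same markup yields two triples about `http://example.org/book/1` for applications.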

Millennia of Metadata: Building the CLAROS website of classical art using semantic web technologies

This session will introduce the CLAROS system (http://explore.clarosnet.org/), its original goal of integrating classical art data from four major European research centres, and how we harnessed semantic web ideas to help realize that goal. The session will describe our initial technology choices and our experiences in deploying those technologies for a user-facing application, and will discuss some lessons about using Semantic Web technologies that have been drawn from those experiences.

Using SPARQL for Biological Data Integration – Reflections on openflydata.org and the FlyWeb Project

This talk will review the technology and work underpinning openflydata.org, a proof-of-concept site using RDF and SPARQL to query across data from several biological databases. There will be plenty of gory details, covering JavaScript-SPARQL mashups, mapping big relational data to RDF, and dealing with hundreds of millions of triples.

A primer for the talk is the blog post at http://alimanfoo.wordpress.com/2011/01/17/using-sparql-for-biological-data-integration-reflections-on-openflydata-org-and-the-flyweb-project/.