Semantic Technologies

 

Over­view

Accord­ing to a Price­Wa­ter­house­Coopers report, “Semantic Web tech­no­lo­gies could revo­lu­tion­ize enter­prise decision mak­ing and inform­a­tion shar­ing”. By con­nect­ing more flex­ible, stand­ard­ized ways to model and share data with best prac­tices for identi­fy­ing the mean­ing (or, at the very least, the source) of descript­ive terms, Semantic Web tech­no­lo­gies open up new pos­sib­il­it­ies for devel­op­ing applic­a­tions that work across the web or behind your firewall.

In this course, we’ll learn about the building blocks of the Semantic Web such as the RDF data model, the RDFa syntax that lets you embed machine-readable facts (or “triples”) into web pages, the SPARQL query language, and the Web Ontology Language (OWL) for defining vocabularies and term relationships. We’ll also learn about some of the open source and commercial software that lets you assemble these building blocks into applications that help you get more out of both your own data and the increasing amount of publicly available linked data, and see some examples of these technologies put into practice.

The classes are taught by faculty members Dr. Andy Seaborne, Leigh Dodds, Dr. Jeni Tennison, Alistair Miles, and Graham Klyne, as well as Faculty Board member Bob DuCharme.

Classes for 2012

The Semantic Web: an Overview

Taught by Bob DuCh­arme.

The Semantic Web is a set of stand­ards and best prac­tices for shar­ing data and the semantics of that data over the web for use by applic­a­tions. What are the stand­ards? What are the best prac­tices? What does it mean to share semantics along with data, and how can that make the data more use­ful? How do applic­a­tions use data from across the web?

In this class, we’ll look at the high-level answers to these ques­tions, take a tour of the tech­no­logy and the acronyms, and see how they all fit together before the day’s remain­ing speak­ers dig deeper into the prac­tical use of these technologies.
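As a rough illustration of what sharing data as triples can look like, here is a minimal sketch in plain Python. It assumes that a graph can be modelled as a set of (subject, predicate, object) tuples; the `example.org` URIs, the two "sites", and the `objects` helper are hypothetical and are not part of the class materials.

```python
# All example.org URIs below are hypothetical; the FOAF namespace is the real one.
EX = "http://example.org/"
FOAF = "http://xmlns.com/foaf/0.1/"

# A triple is (subject, predicate, object); a graph is modelled here as a set of triples.
site_a = {
    (EX + "alice", FOAF + "name", "Alice"),
    (EX + "alice", FOAF + "knows", EX + "bob"),
}
site_b = {
    (EX + "bob", FOAF + "name", "Bob"),
    (EX + "alice", FOAF + "name", "Alice"),  # the same fact, published a second time
}

# Because both sources use the same URIs, merging them is a plain set union.
merged = site_a | site_b


def objects(graph, subject, predicate):
    """Return the sorted objects of every triple with this subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


# Follow the "knows" link from one source to a name that only the other source has.
friend_names = [
    name
    for friend in objects(merged, EX + "alice", FOAF + "knows")
    for name in objects(merged, friend, FOAF + "name")
]
print(friend_names)
```

The duplicate fact collapses in the union, and the query crosses from one source to the other only because both use the same identifier for Bob.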

Intro­duc­tion to Linked Data

Taught by Bob DuCh­arme.

The infra­struc­ture of the world wide web can do more than deliver doc­u­ments for people to read off of their screens: it can also deliver data for applic­a­tions to use. The prin­ciples of Linked Data have laid a found­a­tion that has made it pos­sible for gov­ern­ments, media, and e-commerce retail­ers to pub­lish data on the web without depend­ing on custom-built APIs. This class will show you how to take advant­age of these prin­ciples to con­sume avail­able data and to pub­lish it yourself.

Among other things, we’ll learn about pop­u­lar sets of linked data that you can use, how to cre­ate links between data­sets, how to mint good URIs, how to pub­lish data from rela­tional data­bases, HTTP issues, and how to take gradual steps toward good linked data publishing.
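One of the HTTP issues mentioned above is content negotiation: a client asks for a machine-readable representation of a resource rather than an HTML page. The sketch below only builds such a request with Python's standard library and does not send it. The URI and the `linked_data_request` helper are hypothetical, and whether a given server honours the `Accept` header is an assumption.

```python
from urllib.request import Request


def linked_data_request(uri, media_type="text/turtle"):
    """Build (but do not send) a GET request asking for an RDF serialization."""
    return Request(uri, headers={"Accept": media_type}, method="GET")


# Hypothetical resource URI; a real publisher would mint its own.
req = linked_data_request("http://example.org/id/alice")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing the request to `urllib.request.urlopen` would perform the actual fetch against a server that exists.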

Lunch break, day one

 

RDF Mod­el­ing: Get­ting star­ted with RDFS, OWL, and SKOS

Taught by Leigh Dodds.

You’re starting to publish Linked Data or assemble another RDF-based application, and you’ve initially taken a pick-and-mix approach to selecting from existing vocabularies, but what happens when you need to go beyond popular vocabularies like FOAF, Dublin Core, or GoodRelations? How do you actually go about modelling data using RDF technologies? And how do semantic web schema languages differ from, say, XML Schema?

This class will provide an overview of RDF modelling, including the use of RDFS and OWL to create custom vocabularies. The class will review how to make use of both technologies and provide guidance for going deeper. It will also look at how vocabularies like SKOS fit into the picture, and how they can be used to help model a particular domain.
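To give a flavour of how an RDFS vocabulary differs from a purely structural schema, here is a minimal sketch of one RDFS entailment rule: if a resource has type C and C is a subclass of D, the resource also has type D. The `ex:` class names and the `infer_types` helper are hypothetical, and prefixed names stand in for full URIs.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# A tiny hypothetical vocabulary plus one instance.
graph = {
    ("ex:Novel", SUBCLASS, "ex:Book"),
    ("ex:Book", SUBCLASS, "ex:Publication"),
    ("ex:moby_dick", RDF_TYPE, "ex:Novel"),
}


def infer_types(triples):
    """Add (x type D) whenever (x type C) and (C subClassOf D) hold, until nothing changes."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, c in list(inferred):
            if p != RDF_TYPE:
                continue
            for c2, p2, d in list(inferred):
                if p2 == SUBCLASS and c2 == c and (s, RDF_TYPE, d) not in inferred:
                    inferred.add((s, RDF_TYPE, d))
                    changed = True
    return inferred


types = sorted(
    o for s, p, o in infer_types(graph) if s == "ex:moby_dick" and p == RDF_TYPE
)
print(types)  # ['ex:Book', 'ex:Novel', 'ex:Publication']
```

The schema here does not validate the data; it lets an application derive additional facts from it.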

Linked Data Patterns

Taught by Leigh Dodds.

Design Pat­terns are a power­ful approach for cap­tur­ing and shar­ing know­ledge among prac­ti­tion­ers. Design pat­terns cap­ture best prac­tices in an access­ible way that can help developers solve prob­lems and learn new tech­niques faster. Look­ing at the cur­rent web of data it’s pos­sible to identify a num­ber of exist­ing pat­terns that relate to the pub­lish­ing and mod­el­ling of RDF and Linked Data.

This class will review a num­ber of dif­fer­ent design pat­terns that relate to: identi­fy­ing resources, mod­el­ling data and pub­lish­ing data to the web. Each of these pat­terns will address com­mon chal­lenges or ques­tions faced by developers as they adopt Semantic Web tech­no­lo­gies. Through dis­cus­sion of some spe­cific use cases the class will aim to show developers how the vari­ous tech­no­lo­gies reviewed through­out the course can be used in practice.

End of day one

 

Intro­duc­tion to SPARQL and SPARQL Update

Taught by Dr. Andy Seaborne.

SPARQL is the stand­ard W3C query lan­guage for semantic web applic­a­tions. It brings together the fea­tures of a num­ber of RDF query lan­guages into one method for extract­ing inform­a­tion from data rep­res­en­ted in RDF, whether small data­sets or large.

The next wave of SPARQL stand­ard­iz­a­tion is cur­rently under­way to add fea­tures that are use­ful for pub­lish­ing data and also to add mech­an­isms to update and man­age RDF data over the web.

This ses­sion will provide a solid ground­ing in SPARQL. After demon­strat­ing how power­ful some very simple SPARQL quer­ies can be, we will take a prac­tical approach to look­ing at the key fea­tures of SPARQL 1.0 and 1.1, and then explore the prin­ciples under­pin­ning the SPARQL query language.
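The core idea behind a simple SPARQL query is matching triple patterns that contain variables against the data. The following sketch imitates that idea in plain Python; it is not a SPARQL engine, and the `match` and `select` helpers, the sample data, and the use of prefixed names as plain strings are all assumptions made for illustration.

```python
# Roughly the query being imitated:
#   SELECT ?name WHERE { ?person foaf:knows ex:bob . ?person foaf:name ?name }

graph = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:carol", "foaf:knows", "ex:bob"),
    ("ex:carol", "foaf:name", "Carol"),
    ("ex:bob", "foaf:name", "Bob"),
]


def match(pattern, triple, binding):
    """Extend a variable binding so that pattern equals triple, or return None."""
    binding = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if binding.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return binding


def select(data, patterns):
    """Evaluate a list of triple patterns, joining them on shared variables."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in data:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


patterns = [("?person", "foaf:knows", "ex:bob"), ("?person", "foaf:name", "?name")]
names = sorted(solution["?name"] for solution in select(graph, patterns))
print(names)  # ['Alice', 'Carol']
```

Each pattern narrows the set of candidate bindings, and the shared `?person` variable is what joins the two patterns.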

Fol­low­ing this, we will intro­duce the new stand­ard fea­tures of SPARQL for update and man­age­ment of data using web pro­to­cols. SPARQL Update is a lan­guage for modi­fy­ing RDF data and SPARQL HTTP Update provides for REST­ful update of a col­lec­tion of RDF graphs.
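As a hedged sketch of what an update sent over a web protocol might look like, the code below builds (but does not send) an HTTP POST that carries a SPARQL Update request. It assumes the `application/sparql-update` media type from the SPARQL 1.1 Protocol; the endpoint URL, the resource URI, and the `sparql_update_request` helper are hypothetical.

```python
from urllib.request import Request

# A small SPARQL Update request; the subject URI is a hypothetical example.
update = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
INSERT DATA { <http://example.org/id/alice> foaf:name "Alice" . }
"""


def sparql_update_request(endpoint, update_text):
    """Build (but do not send) a POST carrying the update text as the request body."""
    return Request(
        endpoint,
        data=update_text.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )


# Hypothetical endpoint; a real store documents its own update URL.
req = sparql_update_request("http://example.org/sparql/update", update)
# urllib stores header names capitalized, hence "Content-type" in the lookup.
print(req.get_method(), req.get_header("Content-type"))
```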

Lunch break, day two

 

RDFa

Taught by Dr. Jeni Ten­nison.

This class will focus on the use of RDF that is embedded in XML, and especially in HTML web pages, with the W3C’s RDFa standard. We’ll look at how to create RDFa, how to add it to documents manually, and some strategies for automating the process. We’ll also see how to extract RDF triples from these documents, and we’ll tour some of the large-scale web sites that are currently making RDF data available on the public linked data web using RDFa. Along the way, we’ll address these common questions about RDFa: how is it similar to microformats, and how is it different? What kind of data is suitable for exposing to other applications as RDFa? Is it only good for public web pages, or can it be useful behind a firewall? Can RDFS and OWL ontologies play a role in the use of RDFa?
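To make the idea of extracting triples from a web page concrete, here is a deliberately simplified sketch using Python's standard HTML parser. It handles only the `about`, `property`, and `content` attributes and ignores prefix resolution, nested subjects, and most of the RDFa processing rules, so it should not be read as a conforming RDFa processor. The `TinyRDFaExtractor` class and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TinyRDFaExtractor(HTMLParser):
    """Collect (subject, property, value) tuples from a small subset of RDFa attributes."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.subject = None  # value of the most recent `about` attribute
        self.pending = None  # property still waiting for its text content

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "about" in attrs:
            self.subject = attrs["about"]
        if "property" in attrs:
            if "content" in attrs:
                # The value is given explicitly in the `content` attribute.
                self.triples.append((self.subject, attrs["property"], attrs["content"]))
            else:
                # The value is the element's text, which arrives in handle_data.
                self.pending = attrs["property"]

    def handle_data(self, data):
        if self.pending and data.strip():
            self.triples.append((self.subject, self.pending, data.strip()))
            self.pending = None


# Hypothetical page fragment with embedded facts about one resource.
page = """
<div about="http://example.org/id/alice">
  <span property="foaf:name">Alice</span>
  <meta property="foaf:nick" content="ali">
</div>
"""

extractor = TinyRDFaExtractor()
extractor.feed(page)
for triple in extractor.triples:
    print(triple)
```

The same markup still renders as an ordinary page for human readers, which is the main appeal of embedding the data rather than publishing it separately.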

Millennia of Metadata: Building the CLAROS website of classical art using semantic web technologies

This session will introduce the CLAROS system (http://explore.clarosnet.org/), its original goal of integrating classical art data from four major European research centres, and how we harnessed semantic web ideas to help realize that goal. The session will describe our initial technology choices and our experiences in deploying those technologies for a user-facing application, and will discuss some lessons about using Semantic Web technologies that have been drawn from those experiences.

Using SPARQL for Bio­lo­gical Data Integ­ra­tion – Reflec­tions on openflydata.org and the Fly­Web Project

This talk will review the technology and work underpinning openflydata.org, a proof-of-concept site using RDF and SPARQL to query across data from several biological databases. There will be plenty of gory details, covering JavaScript-SPARQL mashups, mapping big relational data to RDF, and dealing with hundreds of millions of triples.

A primer for the talk is the blog post at http://alimanfoo.wordpress.com/2011/01/17/using-sparql-for-biological-data-integration-reflections-on-openflydata-org-and-the-flyweb-project/.