Trends and Transients 2017



Each year there are more new technologies to keep track of, more ways to organise your life and your company's information, more ways to communicate. This session will introduce you to new and potentially over-hyped technologies, discuss older, overlooked technologies, and entertain you at the same time. Our expert speakers will debate the current issues, giving you the benefit of their wide experience and differing points of view, so you can decide for yourself which technologies will meet your needs and which are a waste of your time.

This course is chaired by Lauren Wood and taught by Ann Wrightson, Irina Bolychevsky, Peter Krautzberger, and Steve Neale.

Classes for 2017

The Trends and Transients course runs on .

Mathematics on the web

Taught by Peter Krautzberger.

We will discuss current trends for handling mathematical content in XML workflows, with a primary focus on web production.

In particular, we will cover

  • an overview of common formats for authoring/storing math content
  • MathML and the role it plays in today's web
  • tools and techniques for rendering mathematics in a web context
  • enhancing accessibility of math content for the web
  • future directions for math on the web

Prior experience with MathML or other formats is helpful but not necessary.

Interoperability in healthcare standards

Taught by Ann Wrightson.

Over the last decade or so there has been strong and often noisy competition between standards for system-to-system communications in the health sector. “The wonderful thing about standards is that there are so many of them” (variously attributed to Andrew Tanenbaum, Patricia Seybold, Grace Hopper and others) has become a byword, and until quite recently the usual response from standards pundits, at least in public, was some variant of “Obviously, everyone should use this standard!”. Now a different approach is emerging out of eHealth collaboration in Europe, recognizing that competition between standards is unwinnable and wasteful, and takes attention away from the key shared problem of safe interoperability.

In this session, you will learn about the latest thinking on interoperability at scale for clinical communications, how and why many standards end up competing in the same space, and how all that plays out with XML, JSON and FHIR (the latest-generation HL7 standard).

XML, Related Formats, and Linked Open Data in Corpus Linguistics and Natural Language Processing

Taught by Steve Neale.

XML has a long association with the representation of linguistic data, being well-suited to both the structure and the unpredictability of language. It allows data — whether at the paragraph, sentence or token (word) level — to be organised in a well-formed, structured manner; at the same time, a range of syntactic and semantic features can be represented as attributes, seamlessly and flexibly used to describe structured elements where appropriate. In fields such as corpus linguistics and natural language processing (NLP), XML has been used to represent a range of corpora from large-scale endeavours such as the British National Corpus (BNC) to more focused contributions such as the semantically-annotated SemCor.
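As a rough illustration of the point above, the following minimal sketch shows a sentence whose tokens carry linguistic features as attributes, and how those attributes can be read back. The element names (`s`, `w`), the attribute names (`pos`, `lemma`) and the sample sentence are assumptions made for this example; they are not the actual BNC or SemCor markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated sentence. The element and attribute names are
# illustrative only and do not follow any particular corpus schema.
SAMPLE = """
<s n="1">
  <w pos="DET" lemma="the">The</w>
  <w pos="NOUN" lemma="cat">cats</w>
  <w pos="VERB" lemma="sleep">slept</w>
</s>
"""


def tokens(xml_text):
    """Return (surface form, part of speech, lemma) for each token element."""
    root = ET.fromstring(xml_text)
    return [(w.text, w.get("pos"), w.get("lemma")) for w in root.iter("w")]


def lemmas_with_pos(xml_text, pos):
    """Return the lemmas of all tokens tagged with the given part of speech."""
    return [lemma for _, tag, lemma in tokens(xml_text) if tag == pos]


if __name__ == "__main__":
    for surface, pos, lemma in tokens(SAMPLE):
        print(surface, pos, lemma)
    print(lemmas_with_pos(SAMPLE, "NOUN"))
```

The structure (sentence, then tokens) lives in the element hierarchy, while the features that vary from token to token live in attributes, so new features can be added without changing the shape of the document.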

Corpora are generally built with linguistic principles in mind — designed according to appropriate and balanced reflections on demographics, contexts and genres — and with XML being well-suited to query and information retrieval, it is no surprise that it remains a popular format for corpus design. However, ontology-centric formats such as RDF and OWL — both of which can be represented as XML — are now widely used to deliver huge knowledge bases such as DBpedia as linked open data (LOD). These ontology-based formats — better suited to representing the specific relationships between entities than traditional XML schema — have driven fresh approaches for NLP that allow more semantically-oriented queries to be made on large-scale and often unstructured textual data.
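To make the contrast concrete, here is a minimal sketch of the triple model that underlies RDF, written with plain Python tuples rather than a real RDF library. The `ex:` identifiers and the facts themselves are invented for illustration and are not taken from DBpedia; the `match` helper is likewise a hypothetical stand-in for a proper query language such as SPARQL.

```python
# Each fact is a (subject, predicate, object) triple. The "ex:" names are
# hypothetical identifiers invented for this example, not real DBpedia data.
TRIPLES = [
    ("ex:Cardiff", "rdf:type", "ex:City"),
    ("ex:Cardiff", "ex:capitalOf", "ex:Wales"),
    ("ex:Welsh", "ex:spokenIn", "ex:Wales"),
    ("ex:Swansea", "rdf:type", "ex:City"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


if __name__ == "__main__":
    # Which entities are cities?
    print([t[0] for t in match(TRIPLES, p="rdf:type", o="ex:City")])
    # Which entities are related to Wales in any way?
    print([t[0] for t in match(TRIPLES, o="ex:Wales")])
```

Because every relationship is a named link between two entities, a query can follow relationships directly instead of depending on where an element happens to sit in a document tree.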

The talk will begin with an overview of XML in the context of linguistic data, focusing on the qualities that make it ideal for representing structured information about language and how it is being used in a current project, CorCenCC: The National Corpus of Contemporary Welsh (Corpws Cenedlaethol Cymraeg Cyfoes). Next, the focus will shift to RDF and OWL, and how they better represent the kinds of semantic relationships that are difficult to cater for with traditional XML schema. Finally, current trends in NLP will be explored, with an emphasis on how and where XML is being used in relation to linked open data.

Decentralising data silos and monopolies

Taught by Irina Bolychevsky.

The information age is transforming how we work, live and interact. We are all increasingly dependent on digital services to travel around, decide where to eat, learn, share info and maintain our public identity.

And yet, digital services are increasingly provided by a diminishing group of super monopolies, whose real customers are advertisers. Or else, services spring up, only to disappear or change once acquired, our data lost.

This session will be about the motivations behind the redecentralise movement, some projects leading the charge and what role openness (in data, code and standards) can play. We’ll talk about what models of decentralisation exist and which are relevant here and explore what parallels we can draw with the open data movement.