Trends and Transients 2019

Overview

Each year there are more new technologies to keep track of, more ways to organise your life and your company’s information, more ways to communicate. This session will introduce you to new and potentially over-hyped technologies, discuss older, overlooked technologies, and entertain you at the same time. Our expert speakers will present and debate current issues, giving you the benefit of their wide experience and differing points of view, so you can decide for yourself which technologies will meet your needs and which are a waste of your time.

This course is chaired by Dr Peter Flynn and taught by Dr Peter Murray-Rust, Dr Cigdem Sengul, Dr David Shotton, and Graham Klyne.

Classes for 2019

The Trends and Transients course runs on .

Open Scholarship and Open Citations – present problems and future feasibilities

Taught by David Shotton.

In this presentation, I will shift the focus from the detailed implementation of XML to consider the benefits that flow from making bibliographic metadata available in machine-readable form. Specifically, I will discuss this in relation to scholarly citations, and the work of OpenCitations, a small independent scholarly infrastructure organization dedicated to open scholarship and the publication of open bibliographic and citation data by the use of Semantic Web (Linked Data) technologies.

Following a discussion of the stages and benefits of Open Scholarship, I will discuss the present transitional state of academic publishing. I will then compare the semantics of XML and RDF, the ‘language’ of the Semantic Web, which I will introduce using a simple example, and will then describe the SPAR (Semantic Publishing and Referencing) Ontologies that can be used to describe all aspects of the scholarly publishing domain.
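As a minimal sketch of the XML/RDF comparison (my own example, not taken from the talk; the DOIs below are placeholders), the same citation statement can be written in both notations. The property cito:cites is taken from the SPAR CiTO ontology:

```python
# One bibliographic fact -- "article A cites article B" -- in two notations.
# The DOIs are placeholders, not real articles.

citing = "https://doi.org/10.0000/citing-article"
cited = "https://doi.org/10.0000/cited-article"

# In XML, the meaning of <citation> lives in an external schema:
xml_form = f'<citation citing="{citing}" cited="{cited}"/>'

# In RDF (Turtle syntax), the statement is a subject-predicate-object
# triple, and the predicate cito:cites (from the SPAR CiTO ontology)
# itself carries the semantics:
turtle_form = (
    "@prefix cito: <http://purl.org/spar/cito/> .\n"
    f"<{citing}> cito:cites <{cited}> ."
)
```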

I will then discuss the advantages of treating bibliographic citations as first-class data entities, the Open Citation Identifiers that can be used to identify open citations uniquely, and the OpenCitations Indexes we are building to enable them to be searched and downloaded.
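As a hedged illustration of the identifier scheme (the OCI value below is invented): an Open Citation Identifier takes the form oci:&lt;citing&gt;-&lt;cited&gt;, two numeric strings joined by a hyphen, each identifying one end of the citation. A trivial parser might look like:

```python
# Sketch of splitting an Open Citation Identifier (OCI) into its two
# parts. An OCI has the form "oci:<citing>-<cited>"; the example value
# in the test below is invented for illustration.

def parse_oci(oci: str) -> tuple[str, str]:
    """Return the (citing, cited) numeric identifiers from an OCI."""
    if oci.startswith("oci:"):
        oci = oci[len("oci:"):]
    citing, cited = oci.split("-", 1)
    return citing, cited
```

Resolving those numeric parts back to DOIs depends on supplier prefix conventions, which are set out in the Open Citation Identifier definition listed in the key references.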

Having mentioned the current collaborators and users of OpenCitations data and services, I will outline our plans for the radical expansion of OpenCitations required to enable us to provide a genuine open alternative to the current monopolistic position of the two main commercial citation indexes, Web of Science and Scopus.

I will conclude with a discussion of the requirements for sustainability of open infrastructure organizations such as OpenCitations, and the various financial models that might provide the funds to enable such sustainability.

Key references:

Silvio Peroni, David Shotton (2019) OpenCitations. https://arxiv.org/abs/1906.11964

Silvio Peroni and David Shotton (2018) Open Citation: Definition. Figshare. https://figshare.com/articles/Open_Citation_Definition/6683855

Silvio Peroni and David Shotton (2018) Open Citation Identifier: Definition. Figshare. https://figshare.com/articles/Open_Citation_Identifier_Definition/7127816

David Shotton (2018). Funders should mandate open citations. Nature 553: 129. https://doi.org/10.1038/d41586-018-00104-7

Silvio Peroni and David Shotton (2018). The SPAR Ontologies. In: Vrandečić D. et al. (eds) The Semantic Web – ISWC 2018. ISWC 2018. Lecture Notes in Computer Science, vol 11137. Springer, Cham. https://doi.org/10.1007/978-3-030-00668-6_8

Silvio Peroni, David Shotton, Fabio Vitali (2017). One Year of the OpenCitations Corpus: Releasing RDF-based scholarly citation data into the Public Domain. In The Semantic Web – ISWC 2017 (Lecture Notes in Computer Science Vol. 10588, pp. 184–192). Springer, Cham. https://doi.org/10.1007/978-3-319-68204-4_19

Silvio Peroni, Alexander Dutton, Tanya Gray, David Shotton (2015). Setting our bibliographic references free: towards open citation data. Journal of Documentation, 71 (2): 253–277. https://doi.org/10.1108/JD-12-2013-0166; OA at http://speroni.web.cs.unibo.it/publications/peroni-2015-setting-bibliographic-references.pdf

David Shotton (2013). Open citations. Nature, 502 (7471): 295-297. https://doi.org/10.1038/502295a

Copyright, XML, and the value of markup

Taught by Peter Murray-Rust.

In 1997–8, while helping to develop XML, I saw it as an opportunity to liberate thought and communication. In this spirit, Henry Rzepa and I developed Chemical Markup Language (CML), which has evolved into a fluid natural language of objects rather than a centrally-controlled DTD or schema.

However, XML in science publishing has become centralist and arcane. JATS does not support authors — it removes creativity. Publishers who used to expose XML now hide it, so the 20-year-old dream of reusable XML reinterpreted in the browser is currently on hold.

But XML is a symptom, not the cause.

“Publishing” dominates science and constrains the research that people do. The big publishers want to control how science is communicated, with a sacred “version of record” in an unalterable PDF. But that’s not how creative scientists think — Perelman communicated his Poincaré proof solely through arXiv. It is the content, not the container, that matters.

Big science publishers are also expensive and unregulated. The cost of a preprint on arXiv and other preprint repositories is about 10 USD: costs are minimal because authors use Word or LaTeX to create semantic documents before submission.

XML was meant to remove the friction of rekeying and typesetting, but it has probably made it worse: the cost of a processed manuscript should be no more than 250 USD but the current prices to publish chemistry articles in “high impact” journals average 2,500 USD. The process disenfranchises authors and readers (should blind readers have to read two-column PDF text and bitmaps?) and actively prevents them from downloading sufficient material (10,000+ articles) to do systematic reviews.

Many scientists can no longer publish in that way: it’s limited to the rich west (universities) and has become an instrument of neo-colonialism. But publishing *can* be inclusive, as the LatAm countries have demonstrated, most recently through the AmeliCA initiative, which is JATS-XML for Latin America and the Global South, defeating the subordination of the global conversation of science.

Can XML once again become an instrument for innovation and democratisation? I hope to be informed by talking with delegates, and I shall give interactive demonstrations of what XML can do if the political will is there. I shall present from my own machine and make slides available immediately.

Authorisation in the Internet of Things

Taught by Cigdem Sengul.

While authentication and authorisation are basic security requirements, implementing them in IoT (Internet of Things) environments can be a challenge. OAuth 2.0 is a standardised authorisation framework that allows the user to participate in granting permissions to applications seeking user data, which enables meaningful privacy control. Several standardisation working groups are extending and innovating with OAuth 2.0 to address the diverse challenges of IoT environments.

The talk will be composed of four parts:

  1. A brief introduction to IoT and security challenges
  2. A brief overview of OAuth2
  3. A review of ongoing standardisation work on OAuth2-based IoT standards
    • OAuth2 Device Authorisation Grant: Designed for internet-connected devices that either lack a browser or are input-constrained to the extent that requiring users to input text to authenticate is impractical.
    • UMA (User-Managed Access): Designed to enable a resource owner to control protected resource access by requesting parties in an asynchronous fashion.
    • ACE (Authentication and Authorisation for Constrained Environments): Designed for IoT environments, based on a set of building blocks including OAuth 2.0 and CoAP, thus making a well-known and widely-used authorisation solution suitable for IoT devices.
  4. Open challenges
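As a rough sketch of the Device Authorisation Grant mentioned above (RFC 8628), the flow reduces to two request payloads. The endpoint URLs and client_id would be deployment-specific, so this sketch only builds the parameter sets and makes no network calls:

```python
# Sketch of the two HTTP request bodies used in the OAuth 2.0 Device
# Authorization Grant (RFC 8628). Values are placeholders.

DEVICE_GRANT_TYPE = "urn:ietf:params:oauth:grant-type:device_code"

def device_authorization_request(client_id: str, scope: str) -> dict:
    # Step 1: the input-constrained device asks the authorisation server
    # for a device_code, plus a short user_code that the user enters on
    # a separate browser-equipped device.
    return {"client_id": client_id, "scope": scope}

def token_request(client_id: str, device_code: str) -> dict:
    # Step 2: the device polls the token endpoint with the device_code
    # until the user has approved (or denied) the request.
    return {
        "grant_type": DEVICE_GRANT_TYPE,
        "device_code": device_code,
        "client_id": client_id,
    }
```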

Linked Data in Digital Humanities

Taught by Graham Klyne.

The field of Digital Humanities explores the ways in which digital technologies can be used to support and enhance humanities research. This session will be about experiences of using linked data (RDF) and associated Semantic Web technologies in Digital Humanities applications.

I shall start by discussing some ways in which the application of Semantic Web technologies in the humanities may differ from their use in scientific and other common applications. I shall then discuss applications from projects I’ve been working on recently, showing how the affordances of linked data can support some particular requirements of humanities data:

1. MELD: Music Encoding and Linked Data (Fusing Audio and Semantic technology (FAST) project). The FAST project is exploring the intersection of semantic and music-related technologies. I shall talk about the MELD framework, which uses linked data in applications that establish and act upon connections between musical structure, music-related media and other data.

2. EM Places: Early Modern place information (Cultures of Knowledge (CofK) project). The EM Places project is building an online reference resource of place information from Early Modern Letters Online (EMLO). I shall discuss the development of a data model that must both capture historical contextual information and incorporate up-to-date information from other sources of place data.
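To illustrate the kind of historically-qualified modelling this involves (a sketch of my own, not the EM Places data model), consider place names that are only attested for certain periods, so that queries must be qualified by date:

```python
# Sketch: a place whose name changed over time, with each name carrying
# the period for which it is attested. The field names are invented.

from dataclasses import dataclass

@dataclass
class NameAttestation:
    name: str
    start_year: int   # first year the name is attested
    end_year: int     # last year (inclusive)

# Oslo was known as Christiania from 1624 until it was renamed in 1925.
place_names = [
    NameAttestation("Christiania", 1624, 1924),
    NameAttestation("Oslo", 1925, 2019),
]

def names_in_year(attestations, year):
    """Return the names attested for a place in a given year."""
    return [a.name for a in attestations
            if a.start_year <= year <= a.end_year]
```

A contemporary gazetteer needs only the current name; a historical resource must answer “what was this place called when the letter was written?”, which is what the period qualification supports.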

The session will focus on digital humanities applications, but some of the experiences may be more broadly relevant. Technically, the session will focus on data modelling issues rather than specific details of linked data formats or storage systems.