XML in Publishing 2017



XML can help publishers tackle managerial as well as technical challenges. It provides ways to manage the workflow, the interaction between content and people, and the publishing processes, as well as the documents themselves. The features of XML ensure that information and its structure can be controlled and managed.

This course presents a range of XML techniques and applications in workflow, change management, QA, linked data, and document structure control to help publishers manage their content effectively.

The Hands-on Digital Publishing course provides hands-on material that complements this course.

This course is chaired by Peter Flynn and taught by Norm Walsh, Nic Gibson, Tony Graham, and Tomos Hillman.

Classes for 2017

The XML in Publishing course runs on and .

Introduction to XML in publishing

Taught by Tomos Hillman.

This session addresses the impact of technology on publishing, exploring trends of abstraction, separation of concerns, and profitability. We go on to discuss the strengths and weaknesses of XML in publishing, and explore what this should mean for how we plan our content and workflows.

Capturing XML Content

Taught by Tomos Hillman.

Starting from the general principles established in the introduction, this session compares approaches to capturing XML content. We’ll take some time to look at quality control, discussing the benefits of technologies like schemas and Schematron, and considering documentation needs. As well as discussing the challenges of working with external typesetters and capturers, we’ll look at some of the possibilities and pitfalls of authoring directly in XML.

Underlying Technologies

Taught by Norm Walsh.

In this section of the course, we’ll turn our attention to the technology choices available: schema languages, validation technologies, and processing tools. We’ll consider vocabulary concepts: What makes a good schema? Should you build your own or use an existing standard? How do JATS, DocBook, DITA, etc. compare? How can you tell what’s right for your organization? What processing tools are available and how can you leverage them? Should your workflow include Markdown or other non-XML structured markup languages? How can you leverage linked data in your publishing workflow? We’ll leave time for questions and discussion of the particular challenges facing our delegates.

This session starts after lunch and continues after the break.

Testing Times

Taught by Tony Graham.

While every document is different, the point of having a workflow is being able to apply the same processes to all of them, with automated validation, testing, and processing. In this part, we will look at some technologies and tools for testing (almost) all aspects of your XML in publishing. Topics covered include: XML validation technologies, including Schematron; testing XSLT, XQuery, and Schematron using XSpec (presentation developed by Sandro Cirulli, XSpec maintainer, and presented by Tony); regression testing paged media; and automating your testing processes.

Making 'Pages'

Taught by Tony Graham.

Anyone in publishing is eventually going to have to make pages; XSL-FO and CSS are among the obvious means. This part gives a review of some of the options available to you for producing traditional paged media or book-like output, and how it can be managed alongside non-paged editions of the documents. The presentation includes a review of the current state of XSL-FO, CSS, EPUB, and PWP. Together, we'll look at your pain points and apply the collective 100+ years of publishing and XML experience in the room.

Document models: structure and semantics

Taught by Nic Gibson.

Differing XML models provide differing semantic models. Publishers' content will match some models better than others. We will examine the semantic depth of common models such as JATS, DocBook, and (X)HTML and look at how differing content can be modelled with XML. We will look at lessons we have learned as XML-based publishing has become part of the mainstream of the publishing industry. We will look at successful XML implementations and consider mistakes that have been made (and how we can avoid them). We’ll particularly consider the idea that every publisher needs their own schema and why this is almost always a mistake. We will consider how metadata can be used for bibliographic and marketing purposes, and how metadata standards can be used to improve the quality of content when we are publishing to multiple output channels.

This session starts after lunch and continues after the break.