Hybrid Publishing Tool: The Sausage Machine

Following up on the Hybrid Publishing Workflow, Gottfried Haider put together a small (but significant) piece of software for the PublishingLab, affectionately named The Sausage Machine.

You can try it here: http://hpt.publishinglab.nl/

The source is available on GitHub, including a README file.

Just what is the Sausage Machine?

The Sausage Machine is an experimental system meant to facilitate hybrid text production. You upload your text files and choose one or more output formats: EPUB (for reading on an e-reader or mobile device), ICML (for InDesign print workflows) or HTML (for the web). It builds upon the Hybrid Publishing Toolkit, an effort by a number of researchers and practitioners engaged in various forms of contemporary cross-media publishing.

The code is available under an Open Source license, and can be made to run on most web servers. If you run into any problems, feel free to open an issue on GitHub.

The Hybrid Publishing Toolkit made use of Markdown (as a markup language), Pandoc (as converter software), Makefiles (for specifying transformation rules) and Git (for distributed version control).

The Sausage Machine continues to use those tools, but they no longer have to be installed and invoked on every user’s machine. The most common tasks in an editorial workflow can now be accomplished with this web interface, plus a Git client and a Markdown editor.

Invoking the Makefiles and creating the output files with Pandoc now mostly happens on the server, which gives consistent results and a more accessible way to start experimenting.
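As a rough illustration of what such a server-side conversion step might look like, the sketch below builds a Pandoc command line for each of the three output formats mentioned above. The actual rules live in the Makefile of each template, so the function name `pandoc_command` and the `TARGETS` table are assumptions made for this example; only the Pandoc options `--standalone`, `--to` and `--output` are taken from Pandoc itself.

```python
from pathlib import Path
from typing import List

# Pandoc writer name and file extension for each output format.
# This table is an assumption for illustration; the real templates
# define their own rules in a Makefile.
TARGETS = {
    "epub": ("epub", ".epub"),  # e-readers and mobile devices
    "icml": ("icml", ".icml"),  # InDesign print workflows
    "html": ("html", ".html"),  # the web
}


def pandoc_command(source: str, target: str) -> List[str]:
    """Build (but do not run) the Pandoc call that converts `source`."""
    writer, extension = TARGETS[target]
    output = str(Path(source).with_suffix(extension))
    return ["pandoc", source, "--standalone", "--to", writer, "--output", output]


if __name__ == "__main__":
    for name in TARGETS:
        print(" ".join(pandoc_command("book.md", name)))
```

Returning the command as a list keeps the sketch testable without Pandoc installed; on a server the list could be handed to `subprocess.run`.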

How does it work?

In the Import File tab you can drop existing files (.docx Word documents, images) to include in your book project. In the following tab, Start a book, you can choose between multiple base templates to start from. There, you can also edit the various (text) files making up your repository, and export to various output formats on the spot. Which output formats are provided, and exactly how the transformation is done, is governed by the Makefile that comes with the respective base template.

The button Continue on GitHub then creates your personal copy of the selected template as a repository on GitHub, and also commits the initial changes you made in the web interface. Future work, by you as well as by other people working on the same project, can be done using a conventional Git workflow.

There is one difference: after each commit pushed to GitHub, the Sausage Machine automatically goes to work and regenerates all of the repository’s output files. Those output files are in turn committed and pushed to GitHub, so that the user who pushed a change (e.g. a change to a Markdown file, or to the CSS stylesheet) can fetch an updated EPUB (and many other formats) just a few seconds later.
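The text does not say how the Sausage Machine learns about a push. One plausible mechanism is a GitHub push webhook, and the sketch below assumes exactly that. It reads the JSON payload of a push event and returns the clone URL of the repository to rebuild, skipping pushes made by the machine’s own account so that regenerated files do not trigger yet another rebuild. The account name `sausage-machine` and the function name are hypothetical; `pusher.name` and `repository.clone_url` are fields of GitHub’s push event payload.

```python
import json
from typing import Optional

# Hypothetical name of the account the server pushes with.
BOT_NAME = "sausage-machine"


def repository_to_rebuild(payload_text: str, bot_name: str = BOT_NAME) -> Optional[str]:
    """Return the clone URL to rebuild, or None if the push should be ignored."""
    event = json.loads(payload_text)
    pusher = event.get("pusher", {}).get("name")
    if pusher == bot_name:
        # The push contains our own regenerated output files: do nothing.
        return None
    return event.get("repository", {}).get("clone_url")
```

Filtering on the pusher is only one way to avoid a rebuild loop; checking the commit message or the list of changed files would work as well.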

Aim

The aim was to prototype a way of working with publications in which text, code and design (rules) stand more on a par with each other than in conventional workflows. None of this is new: think of the electronic and experimental literature of the past, or of Alan Kay’s Dynabook, where the software was also meant to be malleable matter rather than a hard-wired construct, thanks to Smalltalk and the concept of “late binding”.

Precursor

The more immediate precursor is the Hybrid Publishing Toolkit, which uses Makefiles (those archaic text files generally used to describe how software is compiled) to transform text into a range of output formats, such as EPUB or PDF files.

To make this accessible to a wider group of people, who are perhaps not (yet) fans of the terminal, Gottfried came up with the following “Continuous Integration” system for text: whenever a change is made to the Git repository, software on a server fetches the repository, executes the Makefile (and other custom scripts) that it contains, and pushes all modified output files right back into the repository.
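The cycle described above (fetch, run the Makefile, push the results back) can be sketched in a few lines. This is a minimal sketch and not the Sausage Machine’s actual code: the function names, the commit message and the use of a fresh clone per rebuild are assumptions. It also assumes that `git` and `make` are installed and that the server account is allowed to push.

```python
import subprocess
from typing import Callable, List


def run_command(cmd: List[str]) -> str:
    """Run a command, fail loudly on errors, and return its standard output."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def rebuild(repo_url: str, workdir: str,
            run: Callable[[List[str]], str] = run_command) -> bool:
    """Regenerate a repository's output files; return True if anything was pushed."""
    run(["git", "clone", repo_url, workdir])      # fetch the repository
    run(["make", "-C", workdir])                  # execute the Makefile it contains
    run(["git", "-C", workdir, "add", "--all"])   # stage the regenerated files
    changed = run(["git", "-C", workdir, "status", "--porcelain"]).strip()
    if not changed:
        return False                              # outputs are already up to date
    run(["git", "-C", workdir, "commit", "-m", "Regenerate output files"])
    run(["git", "-C", workdir, "push"])           # push them back to the repository
    return True
```

The `run` parameter exists so that the sequence of commands can be checked without touching a real repository; in normal use it is left at its default.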

Such a change could be a modification to the (literal) text, a tweak to style and graphics, or a change to the code itself; all of these can now happen more easily, in intertwined and concurrent ways.

The initial motivation is certainly to translate text into various output formats for screen and print (and this appears to be hard enough), but the system could just as well be used to break down the barriers between the different parts: e.g. for co-writing a book together with an algorithm, or for self-modifying code that manifests itself in text, etc. [a nice arc to Matthias Doerfelt’s ongoing itsdoing.it exhibition!]

What next?

The skeleton of the project is there, but there is still work to be done. Gottfried has listed the caveats.

Credits for Hybrid Publishing Toolkit Research

Marc de Bruijn, Liz Castro, Florian Cramer, Joost Kircz, Silvio Lorusso, Michael Murtaugh, Pia Pol, Miriam Rasch and Margreet Riphagen; as well as later work by Andre Castro.