The Wien[n]erisches Diarium Digital – Digitarium is an ongoing project that aims to make one of the oldest newspapers in the world available as a high-quality full text. The Wiennerisches Diarium, now Wiener Zeitung, first appeared on 8 August 1703, initially with two issues per week of usually 8–12 pages each, but regularly reaching over 40 pages by the second half of the 18th century. From October 1813, it appeared daily (including Sundays).
The project uses the Transkribus software and Handwritten Text Recognition (HTR) models trained specifically on the newspaper’s issues to achieve a reasonably high-quality full text from automated processing, with an average character error rate (CER) below 1.5%. During the first two years, the project implemented this automated workflow and improved the HTR models by making 420 issues from 1703–1799 available (five per year where images were already available). Ultimately, the team wants to include all issues from 1703 until the 1940s in an extensive corpus with a versatile frontend that can cater to different research needs.
While a recent grant application (the outline of which was presented in Tokyo) was not successful, the project team remains intent on developing an interface together with researchers and the interested public. Several questions arise from the serial nature of the source, the amount of text involved and the linguistic changes over more than 200 years. These have to be addressed and combined with a user-centred design approach so that the texts can be presented, read, searched and otherwise reused easily.
This one-day workshop aims to involve the TEI community in this development. The first part will introduce participants to the project, its workflow and the current web interface. A short survey will collect participants’ initial reactions to the interface.
The second part will focus on research questions that can be answered with periodical texts, and on how both the framework in which they are presented and the encoding of the texts and their metadata can support a wide variety of research disciplines. Participants will be asked to try to answer a research question from a field of their choosing and to record what steps they take, what functionality they would like to see in the web frontend to help them in their research, and whether (and if so, how) they would like to contribute to improving the quality of the source material.
Both parts will be connected by the presentation of (and comparison to) the results of a two-day conference and several “annotate-a-thons” held between 2017 and 2019.
The results of the workshop will be discussed in an article for the JTEI and will also serve as an important basis for the further development of the framework used to present the Diarium’s texts (which is, of course, available as an open source project).
Participants should have a laptop with a working internet connection so that they can use the project’s website. The room needs a means of projection.
While the Diarium texts are in German, knowledge of German is not necessary. Participants who want to suggest a periodical in a different language for use in the project are highly welcome to do so, and it will be included for use during the workshop.
Dario Kampkaspar holds Magister degrees in history and English philology from the University of Heidelberg. During his years at the Institut für Personengeschichte he deepened his knowledge of old and rare books as well as prosopography. His next post was at the Herzog August Bibliothek in Wolfenbüttel, where he was responsible for several DH projects, among them DARIAH-DE and the critical edition of Andreas Bodenstein von Karlstadt.
At the Austrian Academy of Sciences, Austrian Centre for Digital Humanities, he is currently part of the Wien[n]erisches Diarium project team and of several smaller projects.
His research interests are medieval and early modern history (his Ph.D. thesis, currently in statu nascendi, is about a 1610 manuscript and centres on prosopography and the history of science), English and Scottish history, and historical linguistics. Among other projects, he is part of the DLiNA group researching networks in dramatic texts.