The iDAI.publications from open digital publishing to text mining
Posted on 05 February 2018
Talk: Francesco Mambrini (DAI), The iDAI.publications from open digital publishing to text mining
Date: Monday, 5 February 2018
Time: starting at 17:00 c.t. (i.e. 17:15)
Venue: DAI, Wiegandhaus, Podbielskiallee 69-71, D-14195 Berlin
The iDAI.publications is the new Open Access publication platform of the Deutsches Archäologisches Institut (DAI), which aims to make the unique collection of scholarship published by the institute accessible to the public. At the same time, we intend to leverage this collection of scholarly articles and monographs to extract meaningful information, such as persons, places, canonical citations of ancient texts and mentions of archaeological artifacts. In this seminar we will present the two components of the iDAI.publications: the publication environment, based on Open Journal Systems and Open Monograph Press, and a series of Natural Language Processing (NLP) tools that we developed to extract and structure the relevant information from the texts. We will discuss the chief problems that Open Access publishing presents, and we will present the current NLP pipeline, the evaluation of its components and the strategies adopted to improve the accuracy of our results.