Linked Open Data dictionaries and palaeography

Active
Linked Open Data dictionaries and palaeography (Shirley Sidharta, CC BY-SA 4.0)

As part of the Haft Tappeh project, a digital edition of cuneiform texts is being created that is fully compatible with Linked Open Data. Interoperability with the (Linguistic) Linked Open Data Cloud requires semantic and linguistic annotations of text passages as central elements. These prerequisites were lacking for cuneiform characters and languages, so they first had to be developed cooperatively. The project is therefore concerned with digitising existing analogue dictionary resources and works of palaeography in Wikidata, using freely available script and character fonts. These resources are a prerequisite for the interoperable annotation of cuneiform texts, and for making digital cuneiform resources findable and connected beyond the project context.

Motivation

After the initial creation of around 4500 Sumerian lexemes from the first dictionary resources in 2022, around 5000 Akkadian lexemes were added to Wikidata in 2023. Over the same period, the number of Sumerian lexemes was increased to almost 6000. Based on a paper presented at the Graphematics Conference in Paris in 2022, a data model for palaeographic cuneiform data in Wikidata was developed and implemented using six different cuneiform fonts.
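Lexeme counts of this kind can in principle be checked against the public Wikidata Query Service. The following is a minimal sketch, not the project's own tooling: the function names are hypothetical, and the language item ID is left as a parameter because the correct Wikidata item for each language should be verified before use (Q36790 is assumed here to be the item for Sumerian).

```python
import json
import re
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_lexeme_count_query(language_qid: str) -> str:
    """Build a SPARQL query counting Wikidata lexemes in one language.

    language_qid is the Wikidata item ID of the language, e.g. "Q36790"
    (assumed to be Sumerian; verify the ID before relying on it).
    """
    # Reject anything that is not a plain item ID, so that arbitrary
    # text cannot be spliced into the query.
    if not re.fullmatch(r"Q[1-9]\d*", language_qid):
        raise ValueError(f"not a Wikidata item ID: {language_qid!r}")
    return (
        "PREFIX dct: <http://purl.org/dc/terms/>\n"
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "SELECT (COUNT(?lexeme) AS ?count) WHERE {\n"
        f"  ?lexeme dct:language wd:{language_qid} .\n"
        "}"
    )


def count_lexemes(language_qid: str,
                  user_agent: str = "lexeme-count-sketch/0.1") -> int:
    """Run the count query against the live endpoint (needs network access)."""
    params = urlencode({"query": build_lexeme_count_query(language_qid),
                        "format": "json"})
    request = Request(
        f"{WDQS_ENDPOINT}?{params}",
        headers={"User-Agent": user_agent,
                 "Accept": "application/sparql-results+json"},
    )
    with urlopen(request, timeout=30) as response:
        data = json.load(response)
    # An aggregate query returns exactly one result row.
    return int(data["results"]["bindings"][0]["count"]["value"])
```

Calling `count_lexemes("Q36790")` would return the current number of lexemes recorded for that language, which may differ from the figures above as Wikidata continues to be edited.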

The tool PaleOrdia, a static website that visualises the content of Wikidata lexemes, was developed for indexing the palaeography and dictionary data. As part of a cooperation in the DANES network, a semantic dictionary for the Elamite language has also been in development since autumn 2023. Adam Anderson and students at UC Berkeley are contributing further components as part of the Data Science Discovery Programme, such as grammatically annotated word forms with textual evidence, as well as the transfer of two cuneiform databases to the Linked Open Data repository FactGrid.

Activities

The dictionary projects for Sumerian, Akkadian and Elamite are due to be completed in 2024, and the Linked Open Data cloud generated in this way will be linked to the results of the Haft Tappeh edition. A further aim is to provide tools that make the Linked Open Data resources more accessible to the general public. The integration of the annotations into the Cuneiform Annotator and a browser extension for looking up word meanings are already in progress.