Intelligent data acquisition, storage and provision within the public administration

Finished
Linked Open Data as a basis for the automated, intelligent creation of maps (Vanessa Liebler for the i3mainz, CC BY SA 4.0)

The aim of the project is to establish a Linked Data infrastructure at the Federal Agency for Cartography and Geodesy on the basis of selected data sets and to integrate them. Ontologies for data standards are to be developed, and best practices for semantic integration are to be tested in practice.

Motivation

The Federal Agency for Cartography and Geodesy (BKG) is the central authority for the provision of geodata in the Federal Republic of Germany. Currently, geodata is provided in the typical OGC (Open Geospatial Consortium) formats such as GML (Geography Markup Language), or as shapefiles, either as direct downloads or as web services (WMS, WFS), and either as Open Data or as paid products. Furthermore, a process of standardizing geodata is being driven, for example, by the European Commission with the INSPIRE initiative and will be implemented within the next few years. This aims at a harmonization of European data formats on a syntactic level. The trend, not only in public administration but also in various other communities, is moving toward making geodata available as Linked Data under the principles of the 5-Star Open Data Model.

The goal of the project is therefore to support the BKG in setting up a Linked Data infrastructure and to develop the integration of diverse data with other Linked Data repositories. Based on use cases with specific data managed by the BKG, a semantic integration of geodata is to be carried out as an example. In doing so, the reconvertibility into the respective source formats should be guaranteed and the results of the conversion should be visualized on a map. Possible enrichments of the integrated geodata are to be checked and, marked accordingly, also made available to the end user. This should enable the BKG to develop an integration platform for the provision of Linked Data on the one hand and to demonstrate the advantages of semantic integration on the other hand by means of example maps/example services.

Activities

After suitable data for semantic integration had been selected by the state agencies and the federal agency in 2019, these data sets were analyzed in spring 2020, procedures for their integration were developed, and the integration was carried out. Ontologies that were derived from XML schemas for further use were submitted to the architecture working group for evaluation, with the goal of making them available in the GDI-DE as a standard for the various state authorities in the future.

The data was integrated into a triple store provided at the BKG, which will be used to publish all data converted into Linked Data at the BKG. Conventional software for the provision of geodata has so far offered no connections to triple stores. This will probably not change in the foreseeable future due to the differences between the technologies. The project therefore set out to develop appropriate software for this purpose.

After the aforementioned data sets had been integrated and made available, it was decided to extend the project with work on capturing temporal information for the geodata, managing their metadata, enriching and correcting the data sets, improving the user interface, and having external users test the new software infrastructure in a hackathon.

The second phase of the project began in September 2020 and will end in August 2021. The initial focus was on the integration and semantic management of metadata, as well as the management and storage of schemas used to export RDF data to geospatial formats. Metadata management was addressed by developing a web service based on the recommendations of the OGC API Records and a web interface that allows access to this service and manipulation of the metadata. The interface was then extended to store data schemas and semantically link them to the representation of the associated data. Finally, work began on integrating time in the context of geospatial data by extending the ontological model to include the concepts of time and versions. This work on spatio-temporal data will continue in 2021 with the development of a web interface for visualizing and comparing this data.
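The idea of attaching time and versions to geodata can be sketched as follows. This is only a minimal illustration, not the project's ontological model: the class `FeatureVersion`, the function `version_valid_at`, the half-open validity interval, and the example data are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class FeatureVersion:
    """One version of a feature with a validity period (hypothetical model)."""
    feature_id: str
    version: int
    valid_from: date
    valid_to: Optional[date]  # None means the version is still valid
    name: str


def version_valid_at(
    versions: List[FeatureVersion], feature_id: str, when: date
) -> Optional[FeatureVersion]:
    """Return the version of a feature that is valid on the given date.

    Validity is treated as the half-open interval [valid_from, valid_to).
    """
    candidates = [
        v for v in versions
        if v.feature_id == feature_id
        and v.valid_from <= when
        and (v.valid_to is None or when < v.valid_to)
    ]
    if not candidates:
        return None
    # If intervals overlap, prefer the highest version number.
    return max(candidates, key=lambda v: v.version)


# Invented example data: a feature whose name changed on 2019-07-01.
EXAMPLE_VERSIONS = [
    FeatureVersion("municipality/1", 1, date(2015, 1, 1), date(2019, 7, 1), "Old name"),
    FeatureVersion("municipality/1", 2, date(2019, 7, 1), None, "New name"),
]

if __name__ == "__main__":
    print(version_valid_at(EXAMPLE_VERSIONS, "municipality/1", date(2018, 5, 1)))
```

A lookup of this kind is what a comparison view would need in order to show the state of a data set at two different dates side by side.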

Results

Results of the first project phase (Fig. 2):

  • SemanticWFS: a Java web application that makes content from triple stores available as FeatureCollections. The application implements the OGC API Features interface and the OGC Web Feature Service (WFS) standard. This allows the BKG to define its own FeatureCollections on Linked Data and make them available so that they can be displayed and processed by traditional GIS software. Additionally, the web application provides a standardized web interface for the web browser.
  • GeoPubby: a Linked Data frontend for displaying instances in the BKG’s Linked Data graph. The frontend extends the earlier Pubby project and provides export capabilities for more than 15 geospatial data formats, as well as the ability to download instances in various coordinate reference systems.
  • SPARQLUnicorn QGIS Plugin: a QGIS plugin to run SPARQL queries on Linked Data graphs, enrich geodata layers and convert data to RDF. The plugin helps users create queries, shows them existing geo-related concepts, and can prepare data for integration into a Linked Data repository.
  • SemanticImporter: a tool for importing geospatial data into the BKG’s triple store. The tool, which is equipped with a rudimentary web interface, allows uploading geodata, defining mappings to Linked Data vocabularies, and saving these mappings. The mappings, if exported, can also be used in the SPARQLUnicorn QGIS plugin to prepare geodata for Linked Data. The importer also adds provenance information and terms from other vocabularies required for metadata description so that they can be taken into account by the previously mentioned tools.
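The common idea behind these tools, turning the results of a SPARQL query on a Linked Data graph into features that GIS software can read, can be sketched as follows. This is not the code of SemanticWFS or the SPARQLUnicorn plugin: the query, the example bindings, and the helper names `wkt_point_to_geometry` and `bindings_to_feature_collection` are assumptions made for illustration, and only WKT points are handled. The input follows the standard SPARQL 1.1 JSON results layout.

```python
import json
import re

# Hypothetical GeoSPARQL query of the kind such a tool might send to a triple store.
EXAMPLE_QUERY = """
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?item ?label ?wkt WHERE {
  ?item rdfs:label ?label ;
        geo:hasGeometry/geo:asWKT ?wkt .
} LIMIT 10
"""

# Matches "POINT(x y)", optionally preceded by a CRS IRI in angle brackets.
# The CRS IRI is ignored here; no reprojection or axis swapping is done.
POINT_PATTERN = re.compile(
    r"^\s*(?:<[^>]+>\s*)?POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)


def wkt_point_to_geometry(wkt):
    """Convert a WKT point literal into a GeoJSON Point geometry."""
    match = POINT_PATTERN.match(wkt)
    if match is None:
        raise ValueError("only WKT points are supported in this sketch: %r" % wkt)
    return {
        "type": "Point",
        "coordinates": [float(match.group(1)), float(match.group(2))],
    }


def bindings_to_feature_collection(results):
    """Turn SPARQL JSON results with ?item, ?label and ?wkt into GeoJSON."""
    features = []
    for row in results["results"]["bindings"]:
        features.append({
            "type": "Feature",
            "id": row["item"]["value"],
            "geometry": wkt_point_to_geometry(row["wkt"]["value"]),
            "properties": {"label": row["label"]["value"]},
        })
    return {"type": "FeatureCollection", "features": features}


# Invented example response in the SPARQL 1.1 JSON results format.
EXAMPLE_RESULTS = {
    "head": {"vars": ["item", "label", "wkt"]},
    "results": {"bindings": [{
        "item": {"type": "uri", "value": "http://example.org/feature/1"},
        "label": {"type": "literal", "value": "Example station"},
        "wkt": {
            "type": "literal",
            "datatype": "http://www.opengis.net/ont/geosparql#wktLiteral",
            "value": "POINT(8.27 50.0)",
        },
    }]},
}

if __name__ == "__main__":
    print(json.dumps(bindings_to_feature_collection(EXAMPLE_RESULTS), indent=2))
```

A real service would additionally need to handle other geometry types, coordinate reference systems, and paging, none of which are covered here.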

Results of the second phase of the project:

  • GeoTime Web Service: a Spring-based Java web application that provides access to and manipulation of information related to the spatio-temporal data contained in the triple stores. The application provides several services. The first follows the recommendations of “OGC API - Records - Part 1: Core”, which is currently being standardized to provide a more modern service than the catalog web services, based on current web architecture and best practices for geospatial data on the web. This allows the BKG to manage the metadata describing the FeatureCollections provided by SemanticWFS and make it available for viewing and editing by traditional GIS software. The second service enables the storage and use of schemas associated with FeatureCollections. The third service, currently under development, aims to manipulate spatio-temporal data.

In addition, the web application provides a standardized interface for the web browser:

  • GeoTime Frontend: a web interface that allows users to import, view, edit, and export metadata, as well as link it to FeatureCollections. The frontend also provides an interface to the OGC API Records implementation. It allows users to import and save schemas after identifying them in GitLab. Finally, it provides the ability to verify that an export of geodata respects the associated schema.
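To give an impression of what a metadata record describing a FeatureCollection might look like, the sketch below builds a record as a GeoJSON feature, loosely modeled on the record encoding of the OGC API Records draft. It is not taken from the GeoTime services: the function names, the selection of properties, the `describes` link relation, and the example URL are assumptions, and the fields actually required depend on the version of the draft.

```python
import json


def bbox_to_polygon(min_x, min_y, max_x, max_y):
    """Build a GeoJSON Polygon from a bounding box (counterclockwise ring)."""
    ring = [
        [min_x, min_y],
        [max_x, min_y],
        [max_x, max_y],
        [min_x, max_y],
        [min_x, min_y],  # a GeoJSON ring must be closed
    ]
    return {"type": "Polygon", "coordinates": [ring]}


def make_record(record_id, title, description, bbox, collection_url):
    """Create a metadata record for a FeatureCollection (hypothetical layout)."""
    return {
        "type": "Feature",
        "id": record_id,
        "geometry": bbox_to_polygon(*bbox),
        "properties": {
            "type": "dataset",
            "title": title,
            "description": description,
        },
        "links": [
            {
                # Assumption: the record points at the collection it describes.
                "rel": "describes",
                "type": "application/geo+json",
                "href": collection_url,
            }
        ],
    }


if __name__ == "__main__":
    record = make_record(
        "record-1",
        "Example collection",
        "Invented metadata record for illustration.",
        (5.9, 47.3, 15.0, 55.1),
        "https://example.org/collections/example/items",  # hypothetical URL
    )
    print(json.dumps(record, indent=2))
```

Keeping the record in GeoJSON form means the same tooling that reads the FeatureCollections can also display the spatial extent of their metadata.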