The activities in 2018 were motivated by the goal of providing a first version of the SemGIS prototype to support our partner Geomer in integrating data from different sources and enriching these data through Linked Open Data.
While the integration of such data is of great importance in itself, one aspect of integration that is often neglected is the quality of the integrated heterogeneous data. Data quality metrics can help to indicate whether the integrated semantic data is suitable for the tasks the end user plans to execute. In the scientific literature, quality-assured data is defined as data that fulfills the requirements set by the end user. However, the purposes of data are multidimensional, as data can be used in many different ways and contexts. It is therefore important to create a use-case-dependent data quality evaluation framework that, with the support of semantic technologies, indicates the situation-dependent usefulness of the integrated map data for the end user. In addition to a use-case-dependent evaluation of map data, a temporal perspective should be taken into account. If changes in map data could be predicted by a customized algorithm based on previous map behavior, an indicator of future map quality could be given, which would significantly increase the reliability of the quality score. All in all, the possibilities of data that is not only integrated but also quality assured are to be explored in the PhD thesis of Timo Homburg as a subpart of the SemGIS project.
In the disaster management application case, the study of existing systems and of the needs of disaster management has identified a limitation at the level of the action plans and guidelines prepared to respond to a disaster. These are assessed only after they have been applied to a disaster or after response actors have been trained on a disaster situation. However, some tests of the plans can be infeasible or costly. That is why we want to provide a tool that allows action plans to be tested at lower cost. The study of existing work has shown that such a tool can be provided by a multi-agent system that simulates the decision-making of responders, based on the prepared plans and guidelines, in a given disaster situation. The multi-agent simulation offers an overview of the consequences of this decision-making, which allows the prepared plans and guidelines to be assessed. However, existing multi-agent models are designed for a specific use case and for the observations expected from the simulation experiments. Existing models are thus hardly adaptable to the simulation of different plans or to different experiments. That is why the PhD thesis of Claire Prudhomme proposes a process for the automatic modeling of multi-agent systems from the disaster management information contained in the semantic information system, to support the disaster management community in the preparation of action plans.
The activities in 2018 can be summarized as developing application cases for several cooperation partners while developing essential tools for data integration and quality assurance. Further prototypes for integrating data with predefined (and extracted) ontologies were developed in the project from spring to summer. GMLImporter allows the automated conversion of relational geospatial data with a predefined ontology. During an internship at project partner Geomer, another prototype was developed that produces mapping schemas to map dataset columns and their respective values not only to predefined classes but also to values that can be retrieved via SPARQL queries from the Semantic Web. These and other features of the prototype offer more general means of converting geospatial data to RDF than previous approaches. In addition, data-format-specific ontologies were further developed and led to a cooperation with the Federal Agency for Cartography and Geodesy, which SemGIS supports in setting up a national linked data infrastructure. Furthermore, an exchange between GeoNet MRN and the company EFTAS led to the integration and support of the XErleben format for points of interest and to a possible new project proposal in the context of tourism.
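To illustrate the idea of a mapping schema, the following minimal Python sketch converts one dataset row into N-Triples. It is not the format used by the SemGIS prototypes: the schema layout, the `http://example.org/` namespace, the column names and the function `row_to_triples` are all assumptions made for illustration. The `values` lookup table stands in for values that would be retrieved from the Semantic Web via a SPARQL query, which is not executed here.

```python
# Hypothetical mapping schema; the layout and all URIs are illustrative only.
EX = "http://example.org/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

MAPPING = {
    "class": EX + "School",   # predefined class for every row
    "id_column": "id",        # column used to build the subject URI
    "columns": {
        # Column mapped to a plain literal value.
        "name": {"property": EX + "name", "kind": "literal"},
        # Column whose codes are mapped to URIs; in a real setting this
        # table could be filled from the result of a SPARQL query.
        "type": {
            "property": EX + "schoolType",
            "kind": "lookup",
            "values": {"GS": EX + "PrimarySchool"},
        },
    },
}


def row_to_triples(row, mapping):
    """Convert one row (dict of column -> string) into N-Triples lines."""
    subject = f"<{EX}{row[mapping['id_column']]}>"
    triples = [f"{subject} <{RDF_TYPE}> <{mapping['class']}> ."]
    for column, rule in mapping["columns"].items():
        value = row.get(column)
        if value is None:
            continue  # column missing in this row
        if rule["kind"] == "lookup":
            uri = rule["values"].get(value)
            if uri is None:
                continue  # no known URI for this code
            obj = f"<{uri}>"
        else:
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        triples.append(f"{subject} <{rule['property']}> {obj} .")
    return triples


if __name__ == "__main__":
    example_row = {"id": "s1", "name": "Goethe School", "type": "GS"}
    for line in row_to_triples(example_row, MAPPING):
        print(line)
```

Keeping the mapping as data rather than code means the same conversion routine can be reused for datasets with different columns.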
Data quality assessment:
From January to March of this year, the work of this thesis focused on the publication of a poster at the LBS conference in Zurich. The poster presents the idea of a classifier for map change prediction, with which areas of a map that are possibly subject to change can be identified. The classifier can serve as a means of measuring uncertainty in map data, which in turn might be useful for many application cases that rely on map data. In March 2018, the concept of data quality assessment was extended to include requirement profiles of application cases, which can be expressed using an ontology and deduced through reasoning. It is therefore possible to define an application case such as a fire brigade rescue mission semantically, link this case to map requirements, and check those requirements against precalculated data quality metric results. The impact of these data quality metrics on the feasibility of the given task can be assessed at the level of individual map objects as well as, through aggregation, at a higher level using semantic reasoning, giving end users the possibility to examine which parts of a map can be used to conduct a particular operation. This concept was presented in July in Berlin at the BIS Quality of Data Workshop and subsequently led to a publication in the conference proceedings. Meanwhile, work was underway on a journal publication extending the WebIST publication of 2017. The WebIST paper was extended by the idea of introducing data quality parameters into the automated extraction method and by the idea of thematic clusters, which lead to a prioritization of particular content in the dataset to be imported. This prioritization is again application-case specific and can therefore be used as a hint for a default data quality assessment proposal for particular kinds of geodata. Another publication, for the SIGSPATIAL journal, has also been in preparation.
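The check of a requirement profile against precalculated metric results can be sketched as follows. This is a simplified illustration in plain Python rather than the ontology-based reasoning described above: the metric names, the thresholds and the functions `check_object` and `usable_share` are assumptions, and the sample values are invented.

```python
# Hypothetical requirement profile for one application case.
# Each metric maps to ("max", limit) or ("min", limit); names are made up.
PROFILE = {
    "positional_accuracy_m": ("max", 5.0),
    "attribute_completeness": ("min", 0.8),
}


def meets(value, bound):
    """Return True if a metric value satisfies a ("min"/"max", limit) bound."""
    kind, limit = bound
    return value <= limit if kind == "max" else value >= limit


def check_object(metrics, profile):
    """Check one map object; a missing metric counts as not fulfilled."""
    return {
        name: name in metrics and meets(metrics[name], bound)
        for name, bound in profile.items()
    }


def usable_share(objects, profile):
    """Aggregate: list the usable objects and their share of all objects."""
    usable = [
        object_id
        for object_id, metrics in objects.items()
        if all(check_object(metrics, profile).values())
    ]
    share = len(usable) / len(objects) if objects else 0.0
    return usable, share


if __name__ == "__main__":
    # Invented precalculated metric results for three map objects.
    sample = {
        "b1": {"positional_accuracy_m": 2.0, "attribute_completeness": 0.9},
        "b2": {"positional_accuracy_m": 8.0, "attribute_completeness": 0.95},
        "b3": {"positional_accuracy_m": 1.0},
    }
    print(usable_share(sample, PROFILE))
```

The per-object result corresponds to the assessment on map object level, and the share of usable objects is one simple form of aggregation on a higher level.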
Building on the publication at the LBS conference in Zurich, two cities in Thuringia were to be tested for their suitability for change prediction. Based on the building footprint data of the respective cities and more recent OpenStreetMap data of the same areas, data quality metrics were calculated and used as features for predicting whether a map object is changed, added, deleted or left unchanged. Preliminary results hint that the classifier has good overall precision in identifying the four cases, but work is still ongoing to improve the recall of these classifications. In preparation for a new project proposal, several prototypes that showcase data integration were developed and published in the SemGIS GitHub repository. These prototypes implement uplift and downlift functionality and can optionally incorporate a data quality layer as presented in the Berlin proposal.
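For reference, precision and recall per class can be computed as in the sketch below. The four labels follow the cases named above, but their spelling, the function `precision_recall` and the toy label lists are assumptions for illustration. The numbers are not project results.

```python
# The four cases of the change prediction; label spelling is an assumption.
CLASSES = ("change", "addition", "deletion", "nonchange")


def precision_recall(y_true, y_pred, classes=CLASSES):
    """Return {label: (precision, recall)} for a multi-class prediction."""
    scores = {}
    for label in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Precision: share of predicted `label` objects that are correct.
        precision = tp / (tp + fp) if tp + fp else 0.0
        # Recall: share of true `label` objects that were found.
        recall = tp / (tp + fn) if tp + fn else 0.0
        scores[label] = (precision, recall)
    return scores


if __name__ == "__main__":
    # Invented toy labels, only to show how the function is called.
    truth = ["change", "change", "addition", "deletion", "nonchange", "nonchange"]
    predicted = ["change", "nonchange", "addition", "deletion", "nonchange", "nonchange"]
    for label, (precision, recall) in precision_recall(truth, predicted).items():
        print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

In the toy data, one true change is predicted as a nonchange, so the class "change" keeps perfect precision while its recall drops. This is the kind of gap between precision and recall that the preliminary results point to.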
Multi-agent simulation for disaster management:
The previous research years in this part of the SemGIS project resulted in the design of a semantic model for the disaster management domain (2016) and of a multi-agent simulation meta-model adapted to disaster management simulation (2017). This research was presented in 2018 through a publication in the International Journal of Information Systems for Crisis Response and Management and a presentation at the conference Spatial Analysis and GEOmatics (SAGEO) 2018. Building on the multi-agent meta-model, which was designed to be adaptable to different disaster management plans, research was carried out to create a process that automatically generates a multi-agent simulation corresponding to specific experiments from information related to disaster management (DM). This process is based on a knowledge base and a reasoner that uses knowledge about disaster management and knowledge about multi-agent simulation through their semantic models. The process is composed of three steps: (i) integrate the specific disaster management knowledge of an administrative area, (ii) execute the reasoning on disaster management knowledge to produce multi-agent models, and (iii) execute the automatic implementation of the models to launch experiments. The first step uses the set of functionalities developed to integrate, manipulate and semantically enrich geospatial data in the semantic geographic information system, and fills the knowledge base with real-world information. This geospatial information about an administrative area complements the knowledge about disaster management represented in the knowledge base through an ontology called semDM. The second step models the multi-agent simulation in the knowledge base by enriching the ontology called semMAS, which represents multi-agent simulation knowledge including the previously designed multi-agent meta-model. The reasoner uses a set of rules based on concepts of the semDM ontology to produce new concepts and instances in the semMAS ontology. The third step then uses the semMAS ontology and a set of implemented models to generate the implementation of the model in order to execute the simulation.
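The second step can be pictured with the following minimal sketch of rule-based derivation over triples. Only the ontology names semDM and semMAS come from the project. The class and property names (`semDM:Responder`, `semMAS:Agent`, `semDM:performs`, `semMAS:hasBehaviour`), the rule tables and the function `derive_mas_facts` are hypothetical, and a real reasoner over the ontologies would be considerably richer.

```python
# Hypothetical rules: a semDM class maps to a semMAS class, and a semDM
# property maps to a semMAS property. All concept names are illustrative.
TYPE_RULES = {"semDM:Responder": "semMAS:Agent"}
PROPERTY_RULES = {"semDM:performs": "semMAS:hasBehaviour"}


def derive_mas_facts(dm_facts, type_rules, property_rules):
    """Apply the rules once to (subject, predicate, object) triples."""
    derived = set()
    for subject, predicate, obj in dm_facts:
        if predicate == "rdf:type" and obj in type_rules:
            # An instance of a DM class becomes an instance of a MAS class.
            derived.add((subject, "rdf:type", type_rules[obj]))
        elif predicate in property_rules:
            # A DM relation is carried over as the corresponding MAS relation.
            derived.add((subject, property_rules[predicate], obj))
    return derived


if __name__ == "__main__":
    # Invented disaster management facts for one responder.
    dm_facts = {
        ("ex:medicalDirector", "rdf:type", "semDM:Responder"),
        ("ex:medicalDirector", "semDM:performs", "ex:TriageVictims"),
        ("ex:townHall", "rdf:type", "semDM:Building"),
    }
    for fact in sorted(derive_mas_facts(dm_facts, TYPE_RULES, PROPERTY_RULES)):
        print(fact)
```

Facts without a matching rule, such as the building in the example, produce no simulation concept. The derived triples are the kind of input from which the third step could generate an implementation.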
The developments resulted in the first version of the SemGIS prototype, an overview of which is given in Figure 1. This prototype provides functionality for data uplift from different data sources, for enrichment through the designed knowledge base and Linked Open Data (see an example in Figure 2), and for data downlift into different standard formats. The prototype is based on tools developed for data integration, XML Schema to OWL extraction, and the integration of map styles, provenance, data quality and metadata, as well as on uplift and downlift functionality. These functionalities are provided through a web interface presented in Figure 3.
For data quality in particular, a comparison web map application using Leaflet has been developed to showcase differences (Figure 4). In addition, data quality prediction research has been conducted, and a journal publication is anticipated on classifying map changes in the areas of Jena and Erfurt in Thuringia as a proof of concept.
Developments for the automated generation of multi-agent simulations for disaster management (Figure 5) allow us to generate a set of simulation experiments for a first use case. This use case corresponds to the simulation of a plan created to manage a large number of victims during a disaster. It involves several stakeholders at different levels of action or decision: the municipality at the strategic level, the fire-fighter officer and the medical director at the tactical level, and medical staff such as doctors, nurses or ambulance crews, as well as fire-fighters, at the operational level. Further development and experiments will be conducted next year to apply this process to another use case in order to identify the limits of the solution.