Institut für Raumbezogene Informations- und Messtechnik
Hochschule Mainz - University of Applied Sciences

Knowledge-based object Detection in Image and Point cloud (KnowDIP)

The KnowDIP project aims to design a framework for automatic object detection in unstructured and heterogeneous data. The framework uses a representation of human knowledge to improve the flexibility, accuracy, and efficiency of the processing.
Motivation and goals:

Object recognition is a large research domain that is applied to different types of data, such as images, point clouds, and videos. Many strategies and tools have been developed to achieve it. However, existing systems are mostly specialized in one type of data, because the choice and use of algorithms depend on the data type. For example, the RANSAC algorithm can be applied to images as well as to point clouds, but its implementation differs between the 2D and the 3D case. Moreover, object recognition remains difficult in large-scale data because of their internal heterogeneity, such as non-uniform density in a point cloud, which requires adapting the algorithms and their parameters. That is why KnowDIP aims to create an object detection framework that automatically and dynamically adapts its process to the type and specific characteristics of the data as well as to the features of the targeted object.

The KnowDIP project seeks to achieve such a framework by adding meaning. Meaning can be added through semantic technologies such as an ontology, which allows human knowledge to be represented. In object recognition, one example of necessary knowledge is the type of the data and which algorithms can be applied to it. For example, to process a point cloud, the framework needs to know that it is 3D data, which algorithms can be applied to 3D data, and which of them are relevant for the targeted object. Our goal is therefore to create a framework that can understand and use knowledge to guide object detection in order to improve its accuracy and adaptability.


The KnowDIP framework is composed of four modules: a knowledge module, a reasoning module, a toolbox of algorithms, and a bridge between the knowledge base and the algorithm toolbox. The knowledge module is an ontology that represents three main types of concepts (data, objects, and algorithms) and their relations (see Figure 1). The toolbox contains algorithms for object recognition. The reasoning module uses the vocabulary and the information defined in the knowledge module to determine which algorithms in the toolbox are necessary for the requested task (depending mainly on the targeted object and the data given by the user) and how to use and combine them. It uses the knowledge about data and objects to build a processing context that supports the algorithm selection. This selection is based on linking the features of the algorithms to the features of the context. The bridge between the knowledge base and the algorithm toolbox executes the selected algorithms and enriches the knowledge base with the results of their execution. This structure makes the framework easy to extend by enriching the knowledge module and the algorithm toolbox.
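The selection step described above can be illustrated with a minimal sketch. It is not the KnowDIP implementation, which relies on an ontology and a reasoning module; here plain Python sets stand in for the knowledge base, and every name (`Algorithm`, `build_context`, `select_algorithms`, the toolbox entries, and the feature labels) is a hypothetical placeholder chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Algorithm:
    """Toolbox entry; the fields are illustrative assumptions, not KnowDIP's schema."""
    name: str
    data_types: frozenset  # data types the algorithm can process
    features: frozenset    # object features the algorithm can extract


def build_context(data_type, object_features):
    """Combine knowledge about the data and the targeted object into a processing context."""
    return {"data_type": data_type, "features": frozenset(object_features)}


def select_algorithms(toolbox, context):
    """Keep algorithms that accept the data type and share at least one feature with the context."""
    return [
        algo.name
        for algo in toolbox
        if context["data_type"] in algo.data_types
        and algo.features & context["features"]
    ]


# Hypothetical toolbox entries (names and features are assumptions).
TOOLBOX = [
    Algorithm("plane_fitting_3d", frozenset({"point_cloud"}), frozenset({"planar"})),
    Algorithm("line_detection_2d", frozenset({"image"}), frozenset({"linear"})),
    Algorithm("normal_estimation_3d", frozenset({"point_cloud"}),
              frozenset({"horizontal", "vertical"})),
    Algorithm("cylinder_fitting_3d", frozenset({"point_cloud"}), frozenset({"cylindrical"})),
]

# Context for detecting a floor (assumed planar and horizontal) in a point cloud.
floor_context = build_context("point_cloud", {"planar", "horizontal"})
selected = select_algorithms(TOOLBOX, floor_context)
print(selected)
```

With these assumed entries, only the two point cloud algorithms whose features overlap with the floor context are retained. In the framework itself this matching is expressed in the knowledge module, so extending the toolbox requires adding a description and no change to the selection logic.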
After a first application to a point cloud provided by NavVIS and representing a modern building (see Figure b of the main image), the framework was used to detect objects in a cultural heritage point cloud of Ephesos provided by the Austrian Archaeological Institute (ÖAI) and the Römisch-Germanisches Zentralmuseum (RGZM). This application case aimed at detecting a watermill in the point cloud. The knowledge base created for the first use case, which describes the objects floor, wall, ceiling, and room, was reused and enriched with the description of a watermill. A watermill is composed of two specific rooms (see the description in Figure 2).
The detection process begins by identifying the largest objects, which correspond to floors (ceilings are absent in this data set), and continues by identifying walls. The detection results for these two object types are added to the knowledge base. Rooms are then detected through the reasoning process and the analysis of the topological links between the detected objects. Finally, the detected rooms and the description of a watermill in the knowledge base are used to detect the watermill in the point cloud. This work has been published in the Proceedings of Structural Analysis of Historical Constructions.
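The sequence of steps above (floors and walls first, then rooms from topological links, then the composite object) can be sketched as follows. This is only an illustration under loudly stated assumptions: the rule that a room is a floor touched by at least three walls, and the rule that two rooms sharing a wall form a candidate for a two-room object, are hypothetical stand-ins, since the actual room and watermill descriptions (Figure 2) are not reproduced here. All identifiers and the sample data are invented for the example.

```python
def detect_rooms(floors, walls, touches, min_walls=3):
    """Hypothetical rule: a room is a floor touched by at least `min_walls` walls.

    `touches` is a set of (floor, wall) pairs standing in for the topological
    links stored in the knowledge base.
    """
    rooms = {}
    for floor in floors:
        bounding = {wall for wall in walls if (floor, wall) in touches}
        if len(bounding) >= min_walls:
            rooms[floor] = bounding
    return rooms


def detect_room_pairs(rooms):
    """Hypothetical rule: two rooms that share a wall form a two-room candidate."""
    ids = sorted(rooms)
    return [
        (a, b)
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
        if rooms[a] & rooms[b]
    ]


# Invented detection results for floors and walls.
floors = ["f1", "f2", "f3"]
walls = ["w1", "w2", "w3", "w4", "w5", "w6"]
touches = {
    ("f1", "w1"), ("f1", "w2"), ("f1", "w3"),
    ("f2", "w3"), ("f2", "w4"), ("f2", "w5"),
    ("f3", "w6"),
}

rooms = detect_rooms(floors, walls, touches)
candidates = detect_room_pairs(rooms)
print(sorted(rooms))   # floors that qualify as rooms
print(candidates)      # pairs of rooms sharing a wall
```

In this toy data, `f3` is not promoted to a room because only one wall touches it, and the two remaining rooms share wall `w3`, so they form the single candidate pair. The point of the sketch is the layering: each step consumes the results that the previous step added to the knowledge base.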


The Ephesos point cloud contains 7 floors, 16 walls, 6 rooms, and 1 watermill. The detection process identified 6 floors, 14 walls, 5 rooms, and 1 watermill (see Figure 3). The two missing walls are damaged, so their structure does not match the description in the knowledge base. The missing floor and the missing room were detected but merged with another floor and another room, respectively. These results highlight a limitation of the process: it depends on how the object is described in the knowledge base and how it is represented in the data. Despite this limitation, the framework has shown a certain robustness in identifying damaged objects thanks to their topological relations.
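The counts reported above can be turned into per-class detection rates with a few lines of arithmetic. The rates below are derived here from the stated counts and are not figures quoted from the project; "identified" follows the text, so a floor or room that was merged with a neighbour counts as not separately identified.

```python
# Counts taken from the paragraph above.
present = {"floor": 7, "wall": 16, "room": 6, "watermill": 1}
identified = {"floor": 6, "wall": 14, "room": 5, "watermill": 1}

# Share of present objects that were identified, per class and overall.
rates = {kind: identified[kind] / present[kind] for kind in present}
overall = sum(identified.values()) / sum(present.values())

for kind, rate in rates.items():
    print(f"{kind}: {identified[kind]}/{present[kind]} = {rate:.1%}")
print(f"overall: {sum(identified.values())}/{sum(present.values())} = {overall:.1%}")
```

This gives roughly 85.7 % for floors, 87.5 % for walls, 83.3 % for rooms, and 100 % for the watermill, or 26 of 30 objects (about 86.7 %) overall.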


Period: 01.05.2016 - 30.09.2019
People involved:
Main image: a) 3D point cloud, b) object detection


The defence of his doctoral thesis in mid-November at the University of Saint-Etienne marked for Jean-Jacques Ponciano the successful completion of his…

i3mainz was represented with two talks at the 18th Oldenburger 3D-Tage, held from 6 to 7 February 2018.

Jean-Jacques Ponciano presented some…

Related projects

As part of the special exhibition "Die Bilderwelt der Kelten", small Celtic bronze sculptures are, by means of high-precision optical 3D measurement technology in combination with high-resolution…
As part of the publication and analysis of the 3D data of an untouched tomb complex at Qatna, the provision of the data via a platform-independent web application is…


Automatic Detection of Objects in 3D Point Clouds Based on Exclusively Semantic Guided Processes


J.J. Ponciano,
A. Trémeau,
F. Boochs


ISPRS International Journal of Geo-Information

In the domain of computer vision, object recognition aims at detecting and classifying objects in data sets. Model-driven approaches are typically constrained through their focus on either a specific type of data, a context (indoor, outdoor) or a set of objects. Machine learning-based approaches are more flexible but also constrained as they need annotated data sets to train the learning process. That leads to problems when this data is not available through the specialty of the application field, like archaeology, for example. In order to overcome such constraints, we present a fully semantic-guided approach. The role of semantics is to express all relevant knowledge of the representation of the objects inside the data sets and of the algorithms which address this representation. In addition, the approach contains a learning stage since it adapts the processing according to the diversity of the objects and data characteristics. The semantic is expressed via an ontological model and uses standard web technology like SPARQL queries, providing great flexibility. The ontological model describes the object, the data and the algorithms. It allows the selection and execution of algorithms adapted to the data and objects dynamically. Similarly, processing results are dynamically classified and allow for enriching the ontological model using SPARQL construct queries. The semantic formulated through SPARQL also acts as a bridge between the knowledge contained within the ontological model and the processing branch, which executes algorithms. It provides the capability to adapt the sequence of algorithms to an individual state of the processing chain and makes the solution robust and flexible. The comparison of this approach with others on the same use case shows the efficiency and improvement this approach brings.

Connected Semantic Concepts as a Base for Optimal Recording and Computer-Based Modelling of Cultural Heritage Objects


J.J. Ponciano,
A. Karmacharya,
S. Wefers,
P. Atorf,


Structural Analysis of Historical Constructions

3D and spectral digital recording of cultural heritage monuments is a common activity for their documentation, preservation, conservation management, and reconstruction. Recent developments in 3D and spectral technologies have provided enough flexibility in selecting one technology over another, depending on the data content and quality demands of the data application. Each technology has its own pros/cons, suited perfectly to some situations and not to others. They are mostly unknown to humanities experts, besides having a limited understanding of the data requirements demanded by the research question. These are often left to technical experts who again have a limited understanding of cultural heritage requirements. A common point of view has to be achieved through interdisciplinary discussions. Such agreements need to be documented for their future references and re-uses. We present a method based on semantic concepts that not only documents the semantic essence of such discussions, but also uses it to infer a guidance mechanism that recommends technologies/technical process to generate the required data based on individual needs. Experts' knowledge is represented explicitly through a knowledge representation that allows machines to manage and infer recommendations. First, descriptive semantics guide end users to select the optimal technology/technologies for recording data. Second, structured knowledge controls the processing chain extracting and classifying objects contained in the acquired data. Circumstantial situations during object recording and the behaviour of the technologies in that situation are taken into account. We will explain the approach as such and give results from tests at a CH object.

Identification and classification of objects in 3D point clouds based on a semantic concept


J.J. Ponciano,
F. Boochs,
A. Trémeau



Knowledge-based object recognition in point clouds and image data sets


J.J. Ponciano,
A. Trémeau