Object recognition is a large research domain that applies to different types of data, such as images, point clouds, and videos. Many strategies and tools have been developed to achieve it. However, existing systems are mainly specialized in a single type of data, because the choice and use of algorithms depend on the data type. For example, the RANSAC algorithm can be applied to images as well as to point clouds, but its implementation differs depending on whether it is applied in 2D or in 3D. Moreover, object recognition remains difficult in large-scale data because of their internal heterogeneity, such as the non-uniform density of a point cloud, which requires adapting the algorithms and their parameters. That is why the KnowDIP project aims to create a framework for object detection that automatically and dynamically adapts its process to the type and specificity of the data, as well as to the features of the targeted object. The project seeks to achieve such a framework by adding meaning, which can be done with semantic technologies such as an ontology, which allows human knowledge to be represented. In the domain of object recognition, an example of necessary knowledge is knowing what type of data is being processed and which algorithms can be applied to it. For example, to process a point cloud, the framework needs to know that it is 3D data, and which algorithms can be applied to 3D data and are relevant for the targeted object. Therefore, our goal is to create a framework able to understand and use knowledge to guide object detection in order to improve its accuracy and adaptability.
The KnowDIP framework is composed of four modules: a knowledge module, a reasoning module, a toolbox of algorithms, and a bridge between the knowledge base and the algorithm toolbox. The knowledge module is an ontology that represents three main types of concepts (data, objects, and algorithms) and their relations (see Figure 1). The toolbox contains algorithms for object recognition. The reasoning module uses the vocabulary and information defined in the knowledge module to determine which algorithms in the toolbox are necessary for the requested task (depending mainly on the targeted object and the data given by the user) and how to use and combine them. It uses the knowledge of data and objects to build a processing context that supports algorithm selection. This selection is based on linking the features of the algorithms with the features of the context. The bridge between the knowledge base and the algorithm toolbox executes the selected algorithms and enriches the knowledge base with the results of their execution. This structure makes it easy to extend the framework by enriching the knowledge module and the algorithm toolbox.
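The selection step described above can be illustrated with a minimal sketch. It is not the KnowDIP implementation: the framework relies on an ontology and a reasoning module, whereas this example uses plain Python structures as a stand-in. The names `Algorithm`, `build_context`, `select_algorithms`, and the toolbox entries are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Algorithm:
    """Hypothetical description of a toolbox entry."""
    name: str
    data_types: frozenset  # data types the algorithm accepts
    detects: frozenset     # geometric features it can extract


def build_context(data_type, target_features):
    """Build a processing context from the data and the targeted object."""
    return {"data_type": data_type, "features": frozenset(target_features)}


def select_algorithms(toolbox, context):
    """Keep algorithms whose features match the features of the context."""
    selected = []
    for algo in toolbox:
        accepts_data = context["data_type"] in algo.data_types
        is_relevant = bool(algo.detects & context["features"])
        if accepts_data and is_relevant:
            selected.append(algo.name)
    return selected


# Assumed toolbox content, for illustration only.
TOOLBOX = [
    Algorithm("plane_ransac_3d", frozenset({"point_cloud"}), frozenset({"plane"})),
    Algorithm("line_ransac_2d", frozenset({"image"}), frozenset({"line"})),
    Algorithm("cylinder_fit_3d", frozenset({"point_cloud"}), frozenset({"cylinder"})),
]

# A wall in a point cloud is assumed here to be described as a planar object.
context = build_context("point_cloud", {"plane"})
print(select_algorithms(TOOLBOX, context))  # ['plane_ransac_3d']
```

In this sketch, adding a new algorithm only requires a new toolbox entry with its features, which mirrors how the framework is meant to grow through enrichment rather than code changes.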
After a first application on a point cloud provided by NavVIS and representing a modern building (see Figure b of the main image), the framework was used to detect objects in a cultural heritage point cloud representing Ephesos, provided by the Austrian Archaeological Institute (ÖAI) and the Römisch-Germanisches Zentralmuseum (RGZM). This application case aimed at detecting a watermill in the point cloud. The knowledge base created for the first use case, which describes the objects floor, wall, ceiling, and room, was reused and enriched with the description of a watermill. A watermill is composed of two specific rooms (see the description in Figure 2).
The detection process begins by identifying the biggest objects, which correspond to floors (there are no ceilings in this point cloud), and continues by identifying walls. The results of detecting these two objects are added to the knowledge base. Rooms are then detected through the reasoning process and the analysis of the topological links between the detected objects. Finally, the detected rooms and the description of a watermill in the knowledge base are used to detect the watermill in the point cloud. This work has been published in the Proceedings of Structural Analysis of Historical Constructions.
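The idea of inferring rooms from topological links between detected floors and walls can be sketched as follows. This is only an illustration under strong assumptions: the rules used here (a room is a floor touched by at least two walls, and candidate room pairs are rooms sharing a wall) are hypothetical simplifications and are not the descriptions stored in the KnowDIP knowledge base. The element names and the `touches` relation are invented for the example.

```python
from itertools import combinations

# Hypothetical detection results: which walls touch which floor.
touches = {
    "floor_1": {"wall_1", "wall_2", "wall_3"},
    "floor_2": {"wall_3", "wall_4", "wall_5"},
    "floor_3": {"wall_6"},
}

MIN_WALLS = 2  # assumed threshold, not taken from the project


def detect_rooms(touches, min_walls=MIN_WALLS):
    """Treat a floor bounded by at least min_walls walls as a room."""
    return {
        f"room_{floor}": walls
        for floor, walls in touches.items()
        if len(walls) >= min_walls
    }


def detect_room_pairs(rooms):
    """Return pairs of rooms that share at least one wall.

    This stands in for matching a composite object made of two rooms.
    """
    return [
        (a, b)
        for a, b in combinations(sorted(rooms), 2)
        if rooms[a] & rooms[b]
    ]


rooms = detect_rooms(touches)
print(sorted(rooms))             # ['room_floor_1', 'room_floor_2']
print(detect_room_pairs(rooms))  # [('room_floor_1', 'room_floor_2')]
```

The sketch follows the same order as the process described above: elementary objects first, then objects inferred from their relations, then a composite object built from the inferred ones.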
The Ephesos point cloud is composed of 7 floors, 16 walls, 6 rooms, and 1 watermill. The detection process identified 6 floors, 14 walls, 5 rooms, and 1 watermill (see Figure 3). The walls missing from the detection are damaged, so their structure does not match the description in the knowledge base. The missing floor and room were detected but merged with another floor and another room, respectively. These results highlight a limitation of the process: it depends on how an object is described in the knowledge base and how it is represented in the data. Despite this limitation, the framework has shown a certain robustness in identifying damaged objects thanks to their topological relations.
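The counts above can be turned into per-class detection rates with a few lines of Python. The sketch assumes that every reported detection corresponds to a real object, since no false positives are reported here. The function name `detection_rates` is introduced only for this example.

```python
# Object counts reported for the Ephesos point cloud.
ground_truth = {"floor": 7, "wall": 16, "room": 6, "watermill": 1}
detected = {"floor": 6, "wall": 14, "room": 5, "watermill": 1}


def detection_rates(ground_truth, detected):
    """Share of existing objects of each class that were identified."""
    return {name: detected[name] / total for name, total in ground_truth.items()}


for name, rate in detection_rates(ground_truth, detected).items():
    print(f"{name}: {detected[name]}/{ground_truth[name]} = {rate:.1%}")
# floor: 6/7 = 85.7%
# wall: 14/16 = 87.5%
# room: 5/6 = 83.3%
# watermill: 1/1 = 100.0%
```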