MCL Research on Large-scale 3D Indoor Scene Semantic Segmentation
3D point cloud segmentation requires an understanding of both the global geometric structure and the fine-grained details of each point. According to segmentation granularity, 3D point cloud segmentation methods fall into three categories: semantic segmentation (scene level), instance segmentation (object level) and part segmentation (part level). Our research focuses on the semantic segmentation problem: given a point cloud, the goal is to separate it into several subsets according to the semantic meanings of points. There are two common scene types, urban scenes and indoor scenes. Efficient semantic segmentation of large-scale 3D point clouds is a fundamental and essential capability for real-time intelligent systems, such as autonomous driving and augmented reality. We aim to perform large-scale 3D indoor scene semantic segmentation more efficiently. The representative large-scale public benchmark for indoor scenes is the S3DIS dataset [1].
A key challenge is that raw point clouds acquired by depth sensors are typically irregularly sampled, unstructured and unordered. Recently, the pioneering work PointNet [2] was proposed to process 3D point clouds directly by learning per-point features with shared multilayer perceptrons (MLPs) and max pooling. Follow-up works try to capture wider context information for each point. Although these approaches achieve impressive results for object recognition and semantic segmentation, almost all of them are limited to very small 3D point clouds and cannot be directly extended to larger scales without preprocessing such as block partitioning.
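To make the PointNet-style pattern concrete, below is a minimal sketch of a shared-MLP backbone with symmetric max pooling. The `PointNetBackbone` class and its layer widths are illustrative choices, not taken from the paper; the essential ideas are that the same MLP weights are applied to every point and that max pooling makes the aggregated feature invariant to point order.

```python
# Minimal PointNet-style sketch (not the original implementation).
# Shared MLPs are realized as 1x1 convolutions applied identically to
# every point; max pooling over the point axis gives an order-invariant
# global feature.
import torch
import torch.nn as nn

class PointNetBackbone(nn.Module):
    def __init__(self, in_dim=3, feat_dim=1024):
        super().__init__()
        # "Shared MLP": identical weights applied to each point independently.
        self.mlp = nn.Sequential(
            nn.Conv1d(in_dim, 64, 1), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.ReLU(),
            nn.Conv1d(128, feat_dim, 1), nn.ReLU(),
        )

    def forward(self, pts):                  # pts: (batch, in_dim, num_points)
        per_point = self.mlp(pts)            # (batch, feat_dim, num_points)
        global_feat, _ = per_point.max(2)    # symmetric, order-invariant pooling
        return per_point, global_feat

x = torch.rand(2, 3, 4096)                   # two clouds of 4,096 xyz points
per_point, global_feat = PointNetBackbone()(x)
print(per_point.shape, global_feat.shape)    # (2, 1024, 4096) (2, 1024)
```

For segmentation, the global feature is typically concatenated back to each per-point feature so that every point sees both local and scene-level context.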
We design a different data preprocessing method to learn from large-scale data directly. Each room in the dataset is treated as an input sample and fed into an unsupervised feature extractor to obtain point-wise features. The unsupervised feature extractor is developed upon our previous work PointHop [3]. It is extremely [...]
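A schematic sketch of this room-level pipeline is given below. The `extract_pointwise_features` function is a hypothetical placeholder standing in for the unsupervised PointHop extractor: it only concatenates raw coordinates with simple nearest-neighbor statistics, not the actual PointHop transform, and serves to show that a whole room is processed as one sample with no block partitioning.

```python
# Schematic sketch of room-level preprocessing, assuming a placeholder
# unsupervised extractor (the real PointHop pipeline builds features via
# successive local neighborhood aggregations, which is not reproduced here).
import numpy as np

def extract_pointwise_features(room_xyz, num_neighbors=32):
    """Placeholder extractor: concatenate each point's coordinates with
    simple statistics over its nearest neighbors."""
    # Pairwise squared distances (fine for a sketch; a KD-tree scales better).
    d2 = ((room_xyz[:, None, :] - room_xyz[None, :, :]) ** 2).sum(-1)
    knn = np.argsort(d2, axis=1)[:, 1:num_neighbors + 1]  # skip self at index 0
    local_mean = room_xyz[knn].mean(axis=1)               # (n, 3)
    local_std = room_xyz[knn].std(axis=1)                 # (n, 3)
    return np.hstack([room_xyz, local_mean, local_std])   # (n, 9) features

# Each room is one input sample: no block partition of the scene is needed.
room = np.random.rand(2048, 3).astype(np.float32)  # stand-in for an S3DIS room
features = extract_pointwise_features(room)
print(features.shape)  # (2048, 9) point-wise features for a downstream classifier
```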