Early learning-based point cloud classification methods were typically developed under the assumption that all point clouds in the dataset are well aligned with the canonical axes. In such scenarios, the 3D Cartesian point coordinates were used to learn features. As a consequence, when input point clouds were not aligned, the classification performance dropped significantly. The same assumption holds for the PointHop and PointHop++ methods proposed by MCL.

In our work, SO(3)-Invariant PointHop (S3I-PointHop for short), we analyze why PointHop fails under pose variations and solve the problem by replacing its pose-dependent modules with rotation-invariant counterparts. Furthermore, we significantly simplify the PointHop pipeline by using a single hop along with multiple spatial aggregation techniques. We begin by aligning the point cloud to its three principal axes. This offers only a coarse alignment and leaves several ambiguities, such as those due to eigenvector sign and object asymmetries. The feature extraction process consists of constructing local and global point features. The geometric features are derived from distances and angles in a local point cloud neighborhood. Similarly, the covariance features are found by performing eigendecomposition of the local covariance matrix. The geometric and covariance features together form the set of local features. The global features consist of omni-directional octant features of points in 3D space, similar to PointHop. The Saab transform is then applied.
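As a rough illustration of the alignment and covariance steps described above, the following NumPy sketch aligns a point cloud to its principal axes and computes the eigenvalues of each point's local covariance matrix. It is not the S3I-PointHop implementation: the function names, the neighborhood size `k`, and the brute-force neighbor search are assumptions made for this example, and the geometric (distance and angle) features, octant features, and Saab transform are omitted.

```python
import numpy as np


def pca_align(points):
    """Coarsely align an (N, 3) point cloud to its three principal axes.

    Returns the aligned points and the eigenvalues in descending order.
    The sign of each axis remains ambiguous, as noted in the text.
    """
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]       # largest variance first
    axes = eigvecs[:, order]
    return centered @ axes, eigvals[order]


def local_covariance_eigenvalues(points, k=8):
    """Eigenvalues of each point's k-nearest-neighbor covariance matrix.

    `k` is a hypothetical setting; the brute-force O(N^2) search is only
    meant for small examples.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    knn = np.argsort(dists, axis=1)[:, :k]  # each point is its own neighbor
    neigh = points[knn]                     # shape (N, k, 3)
    neigh = neigh - neigh.mean(axis=1, keepdims=True)
    covs = np.einsum("nki,nkj->nij", neigh, neigh) / k
    return np.linalg.eigvalsh(covs)[:, ::-1]  # descending, shape (N, 3)
```

Because the eigenvalues of a covariance matrix do not change when the points are rotated, the output of `local_covariance_eigenvalues` is rotation invariant, while the output of `pca_align` matches across rotations only up to a sign flip of each axis.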

To aggregate the local and global point features into a global shape feature, we propose conical and spherical aggregation. For conical aggregation, along each positive and negative principal axis, cones with tip at the origin and at unit distance along the axis are constructed. Then, only the features of points lying inside the respective cones are pooled together. Similarly, for spherical aggregation, spheres of unit diameter centered at half unit distance along each positive and negative principal axis are considered, and the points lying inside those spheres are pooled. The proposed scheme thus aggregates point features from different spatial regions separately instead of aggregating all points together. The pooling schemes used are maximum, mean, l1 norm, and l2 norm. Finally, a subset of discriminant features is selected using the Discriminant Feature Test (DFT), and the selected features are fed to a classifier. S3I-PointHop outperforms PointHop-like methods on the point cloud classification task for 3D objects with arbitrary rotations.
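The region-wise pooling described above can be sketched as follows. This is a minimal illustration under several assumptions rather than the method's actual code: the points are taken to be already aligned to the principal axes and scaled to fit inside the unit sphere, the cone half-angle `half_angle_deg` is a hypothetical parameter, only cones with their tip at the origin are modeled, and empty regions are filled with zeros.

```python
import numpy as np

# Six directions: the positive and negative principal axes.
DIRECTIONS = np.vstack([np.eye(3), -np.eye(3)])


def cone_mask(points, direction, half_angle_deg=45.0):
    """Points inside a cone with tip at the origin, opening along `direction`."""
    along = points @ direction                 # signed length along the axis
    norms = np.linalg.norm(points, axis=1)
    cos_angle = along / np.maximum(norms, 1e-12)
    inside_angle = cos_angle >= np.cos(np.radians(half_angle_deg))
    return (along > 0) & (along <= 1.0) & inside_angle


def sphere_mask(points, direction):
    """Points inside a unit-diameter sphere centered half a unit along `direction`."""
    return np.linalg.norm(points - 0.5 * direction, axis=1) <= 0.5


def pool(features):
    """Concatenate max, mean, l1-norm, and l2-norm pooling of (M, D) features."""
    if len(features) == 0:
        return np.zeros(4 * features.shape[1])  # assumption: empty region -> zeros
    return np.concatenate([
        features.max(axis=0),
        features.mean(axis=0),
        np.abs(features).sum(axis=0),
        np.sqrt((features ** 2).sum(axis=0)),
    ])


def aggregate(points, features):
    """Pool features per cone and per sphere, then concatenate all regions."""
    pooled = []
    for direction in DIRECTIONS:
        pooled.append(pool(features[cone_mask(points, direction)]))
        pooled.append(pool(features[sphere_mask(points, direction)]))
    return np.concatenate(pooled)
```

For `D`-dimensional point features, this sketch yields a vector of length 6 × 2 × 4 × `D` (six axis directions, two region types, four pooling schemes), which would then be passed to a feature selection step such as DFT.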

– Pranav Kadam