MCL Research on SSL-based Image Classification
Image classification has been studied for many years as a fundamental problem in computer vision. With the development of convolutional neural networks (CNNs) and the availability of larger-scale datasets, deep learning has achieved rapid success in classifying both low- and high-resolution images. Although effective, deep learning poses one major challenge: its underlying mechanism is not transparent.
Inspired by deep learning, the successive subspace learning (SSL) methodology was proposed by Kuo et al. in a sequence of papers. Unlike deep learning, SSL-based methods learn feature representations in an unsupervised, feedforward manner using multi-stage principal component analysis (PCA). Joint spatial-spectral representations are obtained at different scales through multi-stage transforms. Three variants of the PCA transform have been developed: the Saak transform [1], the Saab transform [2], and the channel-wise (c/w) Saab transform [4]. Two SSL-based image classification pipelines, PixelHop [3] and PixelHop++ [4], were designed based on the Saab transform and the c/w Saab transform, respectively. Both follow the traditional pattern recognition paradigm and partition the classification problem into two cascaded modules: 1) feature extraction and 2) classification. Every step in PixelHop/PixelHop++ is explainable, and the whole solution is mathematically transparent.
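To make the idea of unsupervised, feedforward, multi-stage PCA concrete, the following is a minimal NumPy sketch of two cascaded patch-PCA stages. It is an illustration under stated assumptions, not the Saak, Saab, or c/w Saab transform: those variants add details that are specified in the cited papers and not spelled out here. The function names (extract_patches, fit_pca_stage, apply_pca_stage), the non-overlapping 2x2 patches, the kernel counts, and the random input data are all hypothetical choices made for this example.

```python
import numpy as np

def extract_patches(x, k):
    """Collect non-overlapping k x k patches from an (N, H, W, C) array.

    Returns an array of shape (N, H // k, W // k, k * k * C).
    """
    n, h, w, c = x.shape
    hh, ww = h // k, w // k
    x = x[:, :hh * k, :ww * k, :]            # drop rows/cols that do not fit
    x = x.reshape(n, hh, k, ww, k, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)        # (n, hh, ww, k, k, c)
    return x.reshape(n, hh, ww, k * k * c)

def fit_pca_stage(x, k, num_kernels):
    """Learn PCA kernels from all patches of x. No labels are used."""
    patches = extract_patches(x, k)
    flat = patches.reshape(-1, patches.shape[-1])
    mean = flat.mean(axis=0)
    # Rows of vt are the principal directions of the centered patches.
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:num_kernels]

def apply_pca_stage(x, k, mean, kernels):
    """Project each patch onto the learned kernels (one feedforward pass)."""
    patches = extract_patches(x, k)
    return (patches - mean) @ kernels.T      # (n, hh, ww, num_kernels)

# Hypothetical input: 16 random single-channel 8x8 "images".
rng = np.random.default_rng(0)
images = rng.random((16, 8, 8, 1))

# Stage 1: spatial size 8x8 -> 4x4, spectral size 1 -> 3.
mean1, kernels1 = fit_pca_stage(images, k=2, num_kernels=3)
feat1 = apply_pca_stage(images, 2, mean1, kernels1)

# Stage 2 operates on the stage-1 output: 4x4 -> 2x2, spectral size 3 -> 5.
mean2, kernels2 = fit_pca_stage(feat1, k=2, num_kernels=5)
feat2 = apply_pca_stage(feat1, 2, mean2, kernels2)
```

Each stage trades spatial resolution for spectral channels, so the cascade yields joint spatial-spectral features at successively coarser scales. The kernels come from closed-form PCA rather than backpropagation, which is the sense in which such a pipeline is feedforward and unsupervised.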
To further improve performance, we propose an SSL-based two-stage sequential image classification pipeline named E-PixelHop. The motivation is that, in a multi-class classification problem, it is easier to distinguish between dissimilar classes than between similar ones. For example, one should distinguish between cats and cars more easily than between cats and dogs. Along this line, one can build a hierarchical relation among multiple classes based on their semantic meaning to improve classification performance. Instead of manually constructing the hierarchical learning structure before classification, E-PixelHop resolves [...]








