The thesis is entitled “Object Classification Based on Neural-Network-Inspired Image Transforms”.
Abstract of the thesis:
Convolutional neural networks (CNNs) have recently demonstrated impressive performance in image classification and have changed the way feature extractors are built, from careful handcrafted design to automatic learning from large labeled datasets. However, the great majority of the current CNN literature is application-oriented, and there is no clear understanding or theoretical foundation to explain this outstanding performance or to indicate how to improve it. In this thesis, we focus on solving the image classification problem based on neural-network-inspired transforms.
Motivated by the multilayer RECOS (REctified-COrrelations on a Sphere) transform, we propose two data-driven signal transforms, the “Subspace approximation with augmented kernels” (Saak) transform and the “Subspace approximation with adjusted bias” (Saab) transform, each corresponding to a convolutional layer in CNNs. Based on the Saak transform, we first propose an efficient, scalable and robust approach to the handwritten digit recognition problem. Next, we develop an ensemble method using the Saab transform to solve the image classification problem. The ensemble method fuses the output decision vectors of Saab-transform-based decision systems. To enhance the performance of the ensemble system, it is critical to increase the diversity of the FF-CNN models. To achieve this objective, we introduce diversity by adopting three strategies: 1) different parameter settings in the convolutional layers, 2) flexible feature subsets fed into the fully-connected (FC) layers, and 3) multiple image embeddings of the same input source. We also extend our ensemble method to semi-supervised learning. Since unlabeled data may not always enhance semi-supervised learning, we define an effective quality score and use it to select a subset of unlabeled data in the training process. Finally, we propose a unified framework called successive subspace learning (SSL). From this new viewpoint, the whole CNN pipeline consists of multiple subspace processing modules in cascade. To illustrate the SSL principle in the context of image-based object recognition, we introduce a novel PixelHop method, which provides a rich set of representations for image classification. To further decrease the complexity of the PixelHop system, we develop a new label-assisted dimension reduction method. Extensive experiments are conducted to demonstrate the superior performance of the PixelHop method in terms of classification accuracy and training complexity.
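As a rough illustration of what fusing the output decision vectors of several classifiers can look like, here is a minimal sketch in Python. It assumes each model outputs one score per class and that fusion is a plain average of those scores; the abstract does not say which fusion rule the thesis uses, so the averaging rule and the names `fuse_decision_vectors` and `predict` are hypothetical choices made only for this example.

```python
from typing import List, Sequence


def fuse_decision_vectors(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class scores from several classifiers.

    Simple averaging is an assumption made for illustration; it is not
    claimed to be the fusion rule used in the thesis.
    """
    if not vectors:
        raise ValueError("need at least one decision vector")
    num_classes = len(vectors[0])
    if any(len(v) != num_classes for v in vectors):
        raise ValueError("all decision vectors must have the same length")
    # Mean score of each class across all models in the ensemble.
    return [sum(v[c] for v in vectors) / len(vectors) for c in range(num_classes)]


def predict(vectors: Sequence[Sequence[float]]) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_decision_vectors(vectors)
    return max(range(len(fused)), key=fused.__getitem__)


# Three hypothetical models scoring the same image over three classes.
model_outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
fused_scores = fuse_decision_vectors(model_outputs)
predicted_class = predict(model_outputs)
```

In this toy example the first model alone would pick class 0, while the fused scores favor class 1. This is the kind of disagreement between diverse models that an ensemble is meant to resolve.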
Ph.D. experience:
I would like to thank Professor Kuo and all the lab members. I want you to know how much I appreciate all the time and energy you put into helping me throughout my Ph.D. journey. I have learned a lot during my Ph.D. training, and I think it is a precious life experience. Success as a Ph.D. student depends on many factors, and I would like to share some advice here. We are in a big MCL family, and you could be more active in discussing your research with other lab members and Prof. Kuo. It not only helps you make progress in your research but also strengthens your friendships. Communication skills are also very important. We should learn how to express our ideas clearly and attract others’ attention. The quality of your work is the most critical, but if you don’t know how to sell it, your work might be at risk of being ignored. Finally, I would like to wish the best to everyone.