Congratulations to Eddy Wu for passing his Qualifying Exam on January 13, 2016. The title of his Ph.D. thesis proposal is “Deep Learning Techniques for Supervised and Semi-Supervised Pedestrian Detection”. His qualifying exam committee consisted of Jay Kuo (Chair), Sandy Sawchuk, Richard Leahy, Justin Haldar and Aiichiro Nakano (Outside Member).

Abstract of thesis proposal:

With the emergence of autonomous driving and advanced driver assistance systems (ADAS), the importance of pedestrian detection has increased significantly. With the availability of large-scale datasets, a great deal of research has been conducted to tackle this problem. Methods based on convolutional neural networks (CNNs) have achieved great success in pedestrian detection in recent years, which is a major step toward solving this problem. Although CNN-based solutions perform significantly better than traditional methods, they are still far from perfect, and further advancement in this field is needed. In this proposal, we pursue two research topics along this direction.

In the first topic, a boosted convolutional neural network (BCNN) system is proposed to enhance pedestrian detection performance. Inspired by the classic boosting idea, we develop a weighted loss function that emphasizes challenging samples in training a CNN. Two types of samples are considered challenging:

1) samples with detection scores falling near the decision boundary, and

2) temporally associated samples with inconsistent scores.

A weighting scheme is designed for each of them. Finally, we train a boosted fusion layer to benefit from the integration of these two weighting schemes. Using Fast R-CNN as the baseline, we test the corresponding BCNN on the Caltech pedestrian dataset and observe a significant performance gain over the baseline.
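The abstract does not give the actual weighting formulas, so the following is only a minimal sketch of how a loss could emphasize the two kinds of challenging samples. The function names, the exponential and absolute-difference weight forms, and the parameters `threshold`, `sharpness` and `scale` are all illustrative assumptions, not the BCNN definitions.

```python
import math

def boundary_weight(score, threshold=0.5, sharpness=4.0):
    """Hypothetical weight: largest when the score sits at the decision threshold."""
    return 1.0 + math.exp(-sharpness * abs(score - threshold))

def consistency_weight(score, neighbor_scores, scale=2.0):
    """Hypothetical weight: grows when the score disagrees with the scores
    of temporally associated detections (e.g. the same person in nearby frames)."""
    if not neighbor_scores:
        return 1.0
    mean_neighbor = sum(neighbor_scores) / len(neighbor_scores)
    return 1.0 + scale * abs(score - mean_neighbor)

def weighted_log_loss(scores, labels, weights):
    """Weighted binary cross-entropy, normalized by the total weight."""
    eps = 1e-12
    total = 0.0
    for p, y, w in zip(scores, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)

# Assumed toy data: detection scores in [0, 1] and binary labels.
scores = [0.95, 0.55, 0.10]
labels = [1, 1, 0]
weights = [boundary_weight(s) for s in scores]
loss = weighted_log_loss(scores, labels, weights)
```

In this sketch the sample scored 0.55 receives a larger weight than the confidently scored ones, so its error contributes more to the loss. How the two weighting schemes are actually combined is the role of the boosted fusion layer described above, which is not modeled here.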

Data-driven pedestrian detection methods demand a large amount of human-labeled data as training samples, and the performance of these detectors is highly dependent on the amount of labeled data. Since data labeling is time-consuming, labeled datasets are often insufficient to train a robust detector in real-world applications. On the other hand, it is relatively easy to collect unlabeled data. Thus, it is desirable to develop unsupervised or weakly-supervised learning methods that exploit unlabeled data to further improve the training of a detector. Domain adaptation techniques are developed to reach this goal.

In the second topic, a semi-supervised learning method is proposed for pedestrian detection based on domain adaptation. It is observed that the deep representation, which is the response of a CNN to an input, is powerful in estimating the class of unlabeled data. Motivated by this observation, we propose a clustered deep representation adaptation (CDRA) method. It trains an initial detector using a small amount of labeled data, extracts the deep representation and then clusters samples in the space spanned by the deep representation. A purity measurement mechanism is applied to each cluster to provide a confidence score for the estimated class of unlabeled data. Finally, a weighted re-training process is adopted to fine-tune the model by balancing the numbers of labeled and estimated data. The CDRA method is shown to achieve state-of-the-art performance on a large-scale dataset.
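The purity and re-weighting steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the CDRA implementation: cluster assignments are assumed to come from any clustering of the deep representations, purity is taken to be the majority-class fraction among labeled samples in a cluster, and the function names, the `min_purity` cutoff and the balancing rule are hypothetical.

```python
from collections import Counter, defaultdict

def cluster_purity(cluster_ids, labels):
    """Return {cluster: (majority_label, purity)} from labeled members only.
    labels[i] is 0 or 1 for labeled samples and None for unlabeled ones."""
    counts = defaultdict(Counter)
    for c, y in zip(cluster_ids, labels):
        if y is not None:
            counts[c][y] += 1
    purity = {}
    for c, counter in counts.items():
        majority, n = counter.most_common(1)[0]
        purity[c] = (majority, n / sum(counter.values()))
    return purity

def pseudo_label(cluster_ids, labels, purity, min_purity=0.8):
    """Give unlabeled samples their cluster's majority label, with the
    cluster purity as a confidence score (assumed cutoff: min_purity)."""
    estimated = []
    for i, (c, y) in enumerate(zip(cluster_ids, labels)):
        if y is None and c in purity and purity[c][1] >= min_purity:
            estimated.append((i, purity[c][0], purity[c][1]))
    return estimated

def retraining_weights(n_labeled, estimated):
    """Hypothetical balancing rule: scale confidence so that estimated
    samples cannot outweigh the labeled set when they outnumber it."""
    if not estimated:
        return {}
    ratio = min(1.0, n_labeled / len(estimated))
    return {i: conf * ratio for i, _, conf in estimated}

# Assumed toy data: 8 samples in 3 clusters, 5 of them labeled.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 2]
labels = [1, 1, 1, None, 0, 1, None, None]
purity = cluster_purity(cluster_ids, labels)
estimated = pseudo_label(cluster_ids, labels, purity)
weights = retraining_weights(5, estimated)
```

In this toy example only the unlabeled sample in cluster 0 receives an estimated label: cluster 1 is evenly mixed, and cluster 2 has no labeled members to measure purity against.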