MCL Works on Road Detection for Autonomous Driving

Advanced driver assistance systems (ADAS), which bring various information technologies into vehicles to enhance driving safety and automation, have attracted growing attention.

MCL alumni Junting Zhang and Yuhang Song, together with MediaTek Inc., started a collaborative research project on ADAS-oriented deep learning technologies in January 2016. Single-image-based traffic scene segmentation and road detection were studied extensively throughout 2016. We adapted state-of-the-art general-purpose CNN architectures to the urban scene semantic segmentation task, overcoming the cross-domain issue. Because computational and memory efficiency have always been major concerns, we also worked to simplify the network structure and reduce redundant computation.

In 2017, we will explore deep learning technologies for video processing. Although CNN-based methods have produced many interesting results in semantic urban scene understanding, semantic video understanding is still a challenging problem. We will look for a semantic video understanding method that outperforms single-image-based algorithms by exploiting temporal information.

By |February 19th, 2017|News|Comments Off on MCL Works on Road Detection for Autonomous Driving|

MCL Works on Drone Detection for Airport Safety

The growing popularity of commercial and recreational drone use poses a new threat to airline safety. In Fall 2016, USC MCL, Inha University, Korean Air and the Pratt & Whitney Institute for Collaborative Engineering (PWICE) started a joint research project to build a drone monitoring system to improve airport security. USC MCL is responsible for providing an autonomous imaging-based drone monitoring system. One of the MCL alumni, Yueru Chen, is working on this project.

On Thursday, February 9th, 2017, a midterm discussion of current projects was held between UTC Pratt-Whitney, Korean Airlines, USC, and Inha University. For the drone monitoring project, the Media Communications Lab was represented by Dr. Jay Kuo, Dr. Jongmoo Choi, Master’s student Pranav Aggarwal and Ph.D. student Yueru Chen, who presented their ongoing work on an imaging-based drone monitoring system. USC MCL has developed a comprehensive approach consisting of two modules, detection and tracking, both using deep learning methods. The proposed system is designed to detect the positions of illegal drones and track their movement. During the meeting, USC MCL showed promising results and discussed future plans.

By |February 12th, 2017|News|Comments Off on MCL Works on Drone Detection for Airport Safety|

Congratulations to MCL Alumni for Joining Facebook, Google, Apple, and Bloomberg

We would like to congratulate five MCL alumni, Chen Chen, Shangwen Li, Hao Xu, Jian Li, and Xiaqing Pan, on passing their PhD defenses and starting their careers at such great companies in the U.S. This year, Chen and Shangwen are joining Facebook, Hao is joining Google, Jian is joining Apple, and Xiaqing is joining Bloomberg. We are glad to have them share their experience and advice with us.

Chen Chen:
I joined MCL 6 years ago as a master’s student. In the first seminar talk, Prof. Kuo’s sharing totally changed my view of being a researcher. That was the moment I made the hard decision to become a PhD student. As expected, joining MCL has brought unexpected benefits to my life. This precious platform provides many opportunities for me to learn and to grow. It allows me to meet many excellent students and scholars. Working with them is like seeing myself in a mirror, which teaches me to know myself before growing. MCL also creates huge challenges that push me to the extreme. Every exam, project deadline and paper submission is accompanied by thousands of hours of rigorous and humble teamwork. “Aim high and act low” is the most impressive lesson I learned from this hard process.
It is the precious experience at MCL that earned me the Facebook position at the end of my PhD. Beyond what I learned at MCL, I expect to gain more working experience and entrepreneurship at Facebook. I would also like to see more students from our lab join Facebook to strengthen the MCL alumni team in the future.

Shangwen Li:
During my past internship at Facebook, I worked on a [...]

By |February 5th, 2017|News|Comments Off on Congratulations to MCL Alumni for Joining Facebook, Google, Apple, and Bloomberg|

Congratulations to Hao Xu for Passing His PhD Defense

Congratulations to Hao Xu for passing his defense on January 23, 2016. His Ph.D. thesis is entitled “Understanding Deep Learning from Its System Architectures, Feature Representations to Applications”.

Abstract of thesis:

Deep learning plays key roles in various aspects of modern computer vision research. Our research focused on analyzing, adopting, and developing better CNN architectures that outperform previous methods. To begin with, a car detection method using deformable part models consisting of composite feature sets (DPM/CF) is proposed. It recognizes cars of various types and from multiple viewing angles. The DPM/CF system consists of two stages. In the first stage, an HOG template is used to detect the bounding box of the entire car of a certain type and viewed from a certain angle (called a t/a pair), which yields a region of interest (ROI). In the second stage, we detect each salient part in a given t/a-specific ROI using either the HOG or the CNN feature. An optimization procedure based on latent logistic regression is adopted to choose the most discriminative location/size and the most suitable feature set for each part automatically. It is observed that the DPM/CF detector can strike a balance between detection performance and training complexity by selecting a capable yet simple feature from the composite feature set. Extensive experimental results are given to demonstrate the superior performance of the proposed DPM/CF method.

The CNN features used in DPM/CF demonstrate strong performance in detecting objects from images. To analyze the strength and weakness of the CNN feature representation, two quantitative metrics are proposed for the automatic evaluation of trained features at different convolution layers. The Gaussian confusion measure (GCM) is used to identify the discriminative ability of an individual feature, while [...]

By |January 29th, 2017|News|Comments Off on Congratulations to Hao Xu for Passing His PhD Defense|

Congratulations to Xiaqing Pan for Passing His PhD Defense

Congratulations to Xiaqing Pan for passing his defense on January 23, 2016. His Ph.D. thesis is entitled “Machine Learning Methods for 2D/3D Shape Retrieval and Classification”.

Abstract of thesis:

Shape classification and retrieval are two important problems in both computer vision and computer graphics. Robust shape analysis contributes to many applications such as manufactured component recognition and retrieval, sketch-based shape retrieval, medical image analysis, 3D model repository management, etc. In this dissertation, we propose three methods to address three significant problems: 2D shape retrieval, 3D shape retrieval and 3D shape classification, respectively.

First, in the 2D shape retrieval problem, most state-of-the-art shape retrieval methods are based on local feature matching and ranking. Their retrieval performance is not robust since they may rank globally dissimilar shapes highly. To overcome this challenge, we decompose the decision process into two stages. In the first stage, irrelevant cluster filtering (ICF), we consider both global and local features and use them to predict the relevance of gallery shapes with respect to the query. Irrelevant shapes are removed from the candidate shape set. After that, a local-features-based matching and ranking (LMR) method follows in the second stage. We apply the proposed two-stage shape retrieval (TSR) system to three shape datasets: MPEG-7, Kimia99 and Tari1000. We show that TSR outperforms all other existing methods. The robustness of TSR is demonstrated by the retrieval performance.
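The filter-then-rank idea can be illustrated with a minimal Python sketch. It is not the method from the thesis: the ICF stage there predicts relevance from both global and local features, whereas this sketch swaps in a simple distance from the query to each cluster centroid in global-feature space. The function name `two_stage_retrieve`, the dictionary keys and the `keep_clusters` parameter are all hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def two_stage_retrieve(query, gallery, keep_clusters=1):
    """Filter irrelevant clusters, then rank the survivors by local features.

    Each shape is a dict with hypothetical keys "name", "cluster",
    "global" (global feature vector) and "local" (local feature vector).
    """
    # Stage 1 (stand-in for ICF): group the gallery by cluster and keep only
    # the clusters whose global-feature centroid is closest to the query.
    clusters = {}
    for shape in gallery:
        clusters.setdefault(shape["cluster"], []).append(shape)

    def centroid_distance(members):
        dim = len(query["global"])
        centroid = [sum(m["global"][i] for m in members) / len(members)
                    for i in range(dim)]
        return euclidean(query["global"], centroid)

    ranked = sorted(clusters, key=lambda c: centroid_distance(clusters[c]))
    relevant = set(ranked[:keep_clusters])
    candidates = [s for s in gallery if s["cluster"] in relevant]

    # Stage 2 (stand-in for LMR): rank remaining candidates by local distance.
    candidates.sort(key=lambda s: euclidean(query["local"], s["local"]))
    return [s["name"] for s in candidates]


gallery = [
    {"name": "bird_1", "cluster": "bird", "global": [0.0, 0.1], "local": [0.2]},
    {"name": "bird_2", "cluster": "bird", "global": [0.1, 0.0], "local": [0.05]},
    {"name": "key_1", "cluster": "key", "global": [5.0, 5.0], "local": [0.0]},
]
query = {"global": [0.0, 0.0], "local": [0.0]}
print(two_stage_retrieve(query, gallery))
```

In this toy gallery, `key_1` has the best local match to the query but is globally dissimilar, so the first stage removes it before ranking. That is the failure mode of purely local ranking that the two-stage design is meant to avoid.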

Second, a novel solution for the content-based 3D shape retrieval problem using an unsupervised clustering approach, which does not need any label information of 3D shapes, is presented.  The proposed shape retrieval system consists of two modules in cascade: the irrelevance filtering (IF) module and the similarity ranking (SR) module. The IF module attempts to cluster gallery shapes that [...]

By |January 26th, 2017|News|Comments Off on Congratulations to Xiaqing Pan for Passing His PhD Defense|

Congratulations to Qin Huang for Passing His Qualifying Exam

Congratulations to Qin Huang for passing his Qualifying Exam on January 18, 2016. The title of his Ph.D. thesis proposal is “Machine Learning Techniques for Perceptual Quality Enhancement and Semantic Image Segmentation”. His qualifying exam committee consisted of Jay Kuo (Chair), Antonio Ortega, Justin Haldar, Sandy Sawchuk and Cyrus Shahabi (Outside Member).

Abstract of thesis proposal:

Research on image processing and computer vision problems can generally be divided into two major steps: extracting powerful feature representations and designing an efficient decision system. Traditional methods rely on hand-crafted features and pre-defined thresholds to generate a necessary condition for the desired target. A more robust system can be designed with machine learning techniques if enough training samples are provided. Thanks to the development of big data, millions of images and videos are now available for training. To better utilize this training information, deep learning systems based on convolutional neural networks (CNNs) have become popular in recent years. Specifically, CNN-based methods demonstrate a better ability to acquire powerful feature representations simultaneously. However, CNN-based training places high demands on hardware and requires subtle process design, so it should be explored carefully in order to obtain the desired results.

In this proposal, we present three works that progress from traditional methods to deep-learning-based methods. Based on their applications, the works fall into two major categories: perceptual quality enhancement and semantic image segmentation. In the first part, we focus on enhancing the quality of images and videos by considering related perceptual properties of the human visual system. To begin with, we deal with a type of compression artifact referred to as “false contour”. We then focus on the visual experience [...]

By |January 22nd, 2017|News|Comments Off on Congratulations to Qin Huang for Passing His Qualifying Exam|

Congratulations to Chun-Ting Huang for Passing His PhD Defense

Congratulations to Chun-Ting Huang for passing his defense on January 18, 2016. His Ph.D. thesis is entitled “Facial Identity Recognition and Attribute Classification Using Machine Learning Techniques”.

Abstract of thesis:

Robust face recognition plays a central role in biometric and surveillance applications.  Although the subject has been studied for about four decades, there still exist quite a few technical challenges and system design issues in deploying it in a real-world video surveillance environment.  Nowadays, the raw face images and their associated meta data are stored in a remote cloud storage system in a distributed face recognition platform.  One key challenge in the overall system design is to ensure the security of stored data. In this research, we first conduct a survey on this technology and then, study the problems of cross-distance/environment face recognition and facial attribute classification with machine learning techniques.

The problem of long-distance face recognition and attribute classification arising from surveillance applications poses major challenges. A face captured by a surveillance system can be low in resolution and quality, and is further degraded by an uncontrolled outdoor environment such as long distance during daytime or nighttime. In addition, human age and gender inferred from face images are fundamental attributes in our social interactions. This research has many applications such as demographics analysis, commercial user management, visual surveillance, and even aging progression. Despite the rapid development in automatic face recognition, there is far less work on automatic age/gender classification in an unconstrained environment.

Research in this dissertation provides effective solutions to three topics: 1) cross-distance/environment face recognition, 2) cross-distance/spectral face recognition and 3) age/gender classification.  For Topic 1, a two-stage alignment/enhancement filtering (TAEF) method is proposed to achieve the state-of-the-art performance.  For Topic 2, a locally linear embedding (LLE) [...]

By |January 19th, 2017|News|Comments Off on Congratulations to Chun-Ting Huang for Passing His PhD Defense|

Congratulations to Eddy Wu for Passing His Qualifying Exam

Congratulations to Eddy Wu for passing his Qualifying Exam on January 13, 2016. The title of his Ph.D. thesis proposal is “Deep Learning Techniques for Supervised and Semi-Supervised Pedestrian Detection”. His qualifying exam committee consisted of Jay Kuo (Chair), Sandy Sawchuk, Richard Leahy, Justin Haldar and Aiichiro Nakano (Outside Member).

Abstract of thesis proposal:

With the emergence of autonomous driving and advanced driver assistance systems (ADAS), the importance of pedestrian detection has increased significantly. With large-scale datasets now available, much research has been conducted to tackle this problem. Methods based on convolutional neural network (CNN) technology have achieved great success in pedestrian detection in recent years, a giant step toward solving this problem. Although CNN-based solutions perform significantly better than traditional methods, they are still far from perfect, and further advancement is needed. In this proposal, we study two research topics along this direction.

In the first topic, a boosted convolutional neural network (BCNN) system is proposed to enhance the pedestrian detection performance. Being inspired by the classic boosting idea, we develop a weighted loss function that emphasizes challenging samples in training a convolutional neural network (CNN). Two types of samples are considered challenging:

1) samples with detection scores falling in the decision boundary, and

2) temporally associated samples with inconsistent scores.

A weighting scheme is designed for each of them. Finally, we train a boosted fusion layer to benefit from the integration of these two weighting schemes. We use Fast-RCNN as the baseline, test the corresponding BCNN on the Caltech pedestrian dataset, and observe a significant performance gain of the BCNN over its baseline.
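As a rough illustration of a loss that emphasizes challenging samples, the sketch below weights a standard log loss so that samples near the decision boundary, or with scores that change sharply between adjacent frames, count more. The abstract does not give the actual weighting formulas, so the Gaussian-shaped `boundary_weight`, the linear `temporal_weight` and the parameters `threshold`, `alpha`, `width` and `beta` are all assumptions made for illustration. The boosted fusion layer is not modeled.

```python
import math


def boundary_weight(score, threshold=0.5, alpha=2.0, width=0.2):
    """Assumed weight: 1 + alpha at the decision boundary, decaying to 1 away from it."""
    distance = abs(score - threshold)
    return 1.0 + alpha * math.exp(-((distance / width) ** 2))


def temporal_weight(score, previous_score, beta=2.0):
    """Assumed weight: grows with the score gap between temporally associated samples."""
    return 1.0 + beta * abs(score - previous_score)


def weighted_log_loss(scores, labels, weights, eps=1e-12):
    """Weighted binary cross-entropy, normalized by the sum of the weights."""
    total = 0.0
    for p, y, w in zip(scores, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += w * -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)


scores = [0.95, 0.55, 0.10]  # detector confidences in [0, 1]
labels = [1, 1, 0]           # 1 = pedestrian, 0 = background
weights = [boundary_weight(s) for s in scores]
print(weighted_log_loss(scores, labels, weights))
print(weighted_log_loss(scores, labels, [1.0] * len(scores)))
```

The second sample sits close to the boundary, so it receives the largest weight, and the weighted loss comes out higher than the unweighted one. During training, that pushes the network to spend more effort on such samples.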

Data-driven pedestrian detection methods demand [...]

By |January 15th, 2017|News|Comments Off on Congratulations to Eddy Wu for Passing His Qualifying Exam|

Congratulations to Weihao Gan for Passing His Qualifying Exam

Congratulations to Weihao Gan for passing his qualifying exam on January 11, 2016. The title of his Ph.D. thesis proposal is “Advanced Online Object Tracking Techniques by Exploiting Spatial and Temporal Information”. His qualifying exam committee consisted of Jay Kuo, Antonio Ortega, Keith Chugg, Panayiotis Georgiou and Ulrich Neumann.

Abstract of thesis proposal:

Online object tracking is one of the fundamental computer vision problems. It is commonly used in real-world applications such as traffic control in video surveillance, autonomous vehicles, robotic navigation and medical imaging. It is a very challenging problem due to multiple time-varying attributes in video sequences. In this research, we attempt to achieve online object tracking using both spatial and temporal cues with two novel methods.

First, we develop a new method, called the “temporal prediction and spatial refinement (TPSR)” tracker, to integrate spatial and temporal cues effectively. The TPSR tracking system consists of three cascaded modules: pre-processing (PP), temporal prediction (TP) and spatial refinement (SR). Illumination variation and camera shake are two challenging factors in a tracking problem. They are compensated for in the PP module. Then, a joint region-based template matching (TM) and pixel-wise optical flow (OF) scheme is adopted in the TP module, where the switch between TM and OF is conducted automatically. These two modes work in a complementary manner to handle different foreground and background situations. Finally, to overcome the drifting error arising from the TP module, the bounding box location and size are fine-tuned using the local spatial information of the new frame in the SR module.
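To make the region-based template matching step concrete, here is a minimal pure-Python sketch. It covers only a generic TM search, not the TPSR tracker itself: the sum-of-squared-differences cost, the `radius` search window and the function names are assumptions, and the optical flow mode, the automatic switching and the SR refinement are not modeled.

```python
def ssd(frame, template, top, left):
    """Sum of squared differences between the template and a frame patch."""
    total = 0
    for r, row in enumerate(template):
        for c, value in enumerate(row):
            diff = frame[top + r][left + c] - value
            total += diff * diff
    return total


def match_template(frame, template, prev_top, prev_left, radius=2):
    """Search a window around the previous location for the lowest-cost patch."""
    height, width = len(template), len(template[0])
    max_top = len(frame) - height
    max_left = len(frame[0]) - width
    best = None
    for top in range(max(0, prev_top - radius), min(max_top, prev_top + radius) + 1):
        for left in range(max(0, prev_left - radius), min(max_left, prev_left + radius) + 1):
            cost = ssd(frame, template, top, left)
            if best is None or cost < best[0]:
                best = (cost, top, left)
    return best[1], best[2]


# A 6x6 grayscale frame in which the target has moved to row 3, column 2.
frame = [[0] * 6 for _ in range(6)]
template = [[9, 8], [7, 6]]
for r in range(2):
    for c in range(2):
        frame[3 + r][2 + c] = template[r][c]

print(match_template(frame, template, prev_top=2, prev_left=1))
```

Limiting the search to a small window around the previous bounding box keeps the cost low. It also explains why a complementary mode is useful: if the target moves farther than `radius` between frames, this search cannot find it.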

Next, we apply the deep neural network architecture to the online object tracking problem. We have made several major improvements on the state-of-the-art multi-domain network (MDNet) tracker. The enhanced MDNet (EMDNet) tracker not [...]

By |January 12th, 2017|News|Comments Off on Congratulations to Weihao Gan for Passing His Qualifying Exam|

MCL Releases VideoSet in IEEE DataPort

MCL joined a collaborative project to build a large-scale subjective video quality database. The database is intended to enable a major breakthrough in video coding and processing. It consists of 220 5-second sequences in four resolutions (i.e., 1920×1080, 1280×720, 960×540 and 640×360). We encoded each of the resulting 880 video clips using the H.264 codec and conducted a large-scale subjective test on the perceptual quality. The dataset is called the “VideoSet”, which is an acronym for “Video Subject Evaluation Test (SET)”. The database is available to the public in the IEEE DataPort.

IEEE DataPort is a valuable and universally accessible repository of datasets serving the growing data needs in both research and industry. The repository is designed to accept all types of datasets, including Big Data datasets up to 2TB, and it provides both downloading capabilities and access to Cloud services to enable data analysis in the Cloud.

We appreciate the help from Dr. K. J. Ray Liu and Melissa Handa in hosting the VideoSet database.

By |January 8th, 2017|News|Comments Off on MCL Releases VideoSet in IEEE DataPort|