USC Media Communications Lab – 2017

Congratulations to Yuzhuo Ren for Passing Her Defense

Congratulations to Yuzhuo Ren for passing her defense on April 26, 2017. Her Ph.D. thesis is entitled “Machine Learning Techniques for Outdoor and Indoor Layout Estimation”.

Abstract of thesis:

In my dissertation, I study three research problems: 1) outdoor geometric labeling, 2) indoor layout estimation, and 3) 3D object detection.

A novel method that extracts global attributes from outdoor images to facilitate geometric layout labeling is proposed. The proposed Global-attributes Assisted Labeling (GAL) system exploits both local features and global attributes. The performance of the GAL system is benchmarked against several state-of-the-art algorithms on a popular outdoor scene layout dataset.

Existing solutions to indoor layout estimation largely rely on hand-crafted features and vanishing lines. They often fail in highly cluttered indoor scenes. The proposed coarse-to-fine indoor layout estimation (CFILE) method consists of two stages: 1) coarse layout estimation; and 2) fine layout localization. In the first stage, we adopt a fully convolutional neural network (FCN) to obtain a coarse-scale room layout estimate that is close to the ground truth globally. In the second stage, we formulate an optimization framework that enforces several constraints, such as layout contour straightness, surface smoothness and geometric constraints, for layout detail refinement. The proposed CFILE system offers state-of-the-art performance on two common benchmark datasets.

Given an RGB-D image, we examine the 3D object detection problem with the objective of producing a bounding box around the object and classifying its category. This is a challenging problem due to high intra-class variance, illumination change, background clutter and occlusion. Here, we propose a novel solution that integrates context information to provide robust 3D object detection. Extensive experiments are conducted to demonstrate that the proposed Context-3D method achieves the [...]

By Ruiyuan Lin|April 30th, 2017|News|Comments Off|

Congratulations to Qin Huang for receiving the Capocelli Award for the 2017 Data Compression Conference

MCL member Qin Huang was awarded the Capocelli Prize at the 2017 Data Compression Conference for his paper entitled “Measure and Prediction of HEVC Perceptually Lossy/Lossless Boundary QP Values”. We are glad to have him share his experience of attending the conference. Here is his account:

Earlier this April, I attended the Data Compression Conference 2017 in Snowbird, Utah, to present our paper ‘Measure and Prediction of HEVC Perceptually Lossy/Lossless Boundary QP Values’. The work provides a dynamic prediction framework that helps estimate the minimum encoding QPs that offer perceptually similar quality.

The conference was held in the beautiful mountains of Snowbird, and it was an amazing experience to share our work with the experts in the field. I met many researchers from both academia and industry, and we kept up our discussions after the conference.

It was my great honor to receive the 2017 Capocelli Award, and I sincerely thank the DCC program committee for considering me for the award.

Thank you all!

By Ruiyuan Lin|April 26th, 2017|News|Comments Off|

Professor Kuo visited Stanford University for Research Collaboration

Professor C.-C. Jay Kuo visited the Biomedical Engineering Department in the James H. Clark Center at Stanford University on April 17 to attend a research project meeting. Professor Tsung Hsiai of UCLA, Professor Alison Marsden of Stanford and Professor Jay Kuo have a joint NIH project on “Shear Stress and Light-Sheets to Study Cardiac Trabeculation”. In this project, the team developed a Super Resolution Light-Sheet Microscopy (SRLSM) technology to advance the field of mechanotransduction and cardiac development. The role of MCL in this project is to incorporate computational algorithms that synchronize the cardiac cycle with the SRLSM-captured images for reconstruction of a 4-D (3D + time) simulation. Two MCL PhD students, Hao Xu and Ruiyuan Lin, have contributed to this project. Hao graduated in January 2017 and now works at Google. The MCL team provides the expertise to capture the beating hearts through period determination, relative shift determination, absolute shift determination, and post-processing.
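As a rough illustration of what period determination can involve, the sketch below estimates the period of a quasi-periodic frame sequence from a 1-D summary signal by autocorrelation. This is not the MCL team's algorithm, which the post does not describe; the function names, the choice of autocorrelation, and the synthetic sine signal standing in for per-frame mean intensity are all assumptions made for the example.

```python
import math

def autocorrelation(signal, max_lag):
    """Normalized (biased) autocorrelation for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    energy = sum(c * c for c in centered)
    if energy == 0.0:
        raise ValueError("signal is constant; no period to estimate")
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / energy
        for lag in range(max_lag + 1)
    ]

def estimate_period(signal):
    """Return the lag (in frames) of the strongest repeat in the signal."""
    acf = autocorrelation(signal, len(signal) // 2)
    # Skip the trivial peak around lag 0 by starting at the first negative lag.
    start = next((lag for lag, value in enumerate(acf) if value < 0.0), None)
    if start is None:
        raise ValueError("no oscillation detected")
    return max(range(start, len(acf)), key=lambda lag: acf[lag])

# Hypothetical input: 8 cycles of a 25-frame beat, one value per frame.
frames = [math.sin(2 * math.pi * t / 25) for t in range(200)]
period = estimate_period(frames)
print(period)
```

With real microscopy data the summary signal would be noisier and the period need not be a whole number of frames, so a practical method would have to refine this estimate.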

Professor Kuo was very impressed by Stanford’s beautiful campus. He had visited the campus before, about 30 years ago. He hopes to have more opportunities to visit Silicon Valley and meet MCL alumni.

By Ruiyuan Lin|April 22nd, 2017|News|Comments Off|

Professor Kuo Talked about Deep Learning in MHI Emerging Trends Series

Professor Kuo Gave a Talk on Deep Learning at Ming Hsieh Institute

The Ming Hsieh Institute has launched an MHI Emerging Trends Series. MCL Director Professor C.-C. Jay Kuo was the first speaker in this series. He gave his talk on deep learning on Monday, April 10, 2017.

There has been a resurgence of interest in neural-network-based solutions to supervised machine learning over the last five years. However, little theoretical work has been reported in this area. In his talk, Professor Kuo attempted to provide some theoretical foundation for the working principle of the convolutional neural network (CNN) from a signal processing viewpoint. First, he introduced the RECOS transform as a basic building block for CNNs. The term “RECOS” is an acronym for “REctified-COrrelations on a Sphere”. It consists of two main concepts: data clustering on a sphere and rectification. Then, a CNN is interpreted as a network that implements the guided multi-layer RECOS transform. Along this line, he also compared the traditional single-layer and modern multi-layer signal analysis approaches. Furthermore, he discussed how guidance can be provided by data labels through backpropagation in training, in an attempt to offer a smooth transition from weakly to heavily supervised learning. Finally, he pointed out several future research directions.
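The two ideas named in the RECOS acronym, correlation on a sphere and rectification, can be sketched in a few lines. The code below is a minimal illustration under assumptions: the function names and the anchor vectors are made up for the example, and it shows a single layer only, not the guided multi-layer transform discussed in the talk.

```python
import math

def unit(v):
    """Scale a vector to unit length so that it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]

def recos(x, anchors):
    """Rectified correlations between input x and each anchor vector."""
    x_unit = unit(x)
    outputs = []
    for anchor in anchors:
        a_unit = unit(anchor)
        # On the unit sphere, correlation is the cosine of the angle.
        correlation = sum(p * q for p, q in zip(x_unit, a_unit))
        # Rectification discards negative correlations.
        outputs.append(max(0.0, correlation))
    return outputs

# Hypothetical anchor vectors (the role played by filter weights in a CNN).
anchors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
responses = recos([3.0, 4.0], anchors)
print(responses)
```

The third anchor points away from the input, so its negative correlation is clipped to zero, while the first two anchors yield the cosines of their angles to the input.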

About 80 people attended Professor Kuo’s seminar, and many questions were asked after his talk. Professor Kuo said that he very much enjoyed the interaction with the audience, which demonstrated their strong interest in this topic.

By Yiyue Zhang|April 17th, 2017|News|Comments Off|

MCL works on Interactive Advisement for Smart TV

When watching images or videos on a TV, we often have many questions about them. What is the name of that beautiful place? What are the names of the actors? Which store sells the actor’s car at a big discount? Imagine that one day we have a smart TV that can interactively answer our questions and recommend relevant shopping and travel advertisements. Watching TV would be more convenient and more fun.

MCL members Bing Li, Zhehang Ding and Yuhang Su are collaborating with Samsung on Interactive Advisement for Smart TV. In the first year, we focus on automatic image/video captioning, which describes an image or video with a sentence rather than only detecting the objects in it.

Currently, we propose three pipelines for this project. The first is general image captioning. The second and third are place-aware captioning and face-aware captioning, respectively, so that our system can achieve better performance in vertical industries such as travel, entertainment, and sports. For general image captioning, we developed a detection method that achieves 84% mAP. For place-aware annotation, since no existing image dataset covers world-famous places, we collected images from 118 famous places in 21 countries to construct a landmark dataset. For face-aware annotation, we constructed a celebrity dataset and developed CNN-based face detection and face recognition methods.

In future work, we will put more effort into video captioning.

By Yiyue Zhang|April 11th, 2017|News|Comments Off|

MCL Members Chi-Hao Wu and Siyang Li Presented Their Research Work at WACV 2017

MCL members Chi-Hao (Eddy) Wu and Siyang Li presented their papers at the Winter Conference on Applications of Computer Vision (WACV) 2017 in Santa Rosa, CA, USA.

The title of Eddy’s paper is “Boosted Convolutional Neural Networks (BCNN) for Pedestrian Detection”, with Weihao Gan, De Lan and C.-C. Jay Kuo as the co-authors. Here is a brief summary:

“In this work, a boosted convolutional neural network (BCNN) system is proposed to enhance the pedestrian detection performance. Being inspired by the classic boosting idea, we develop a weighted loss function that emphasizes challenging samples in training a convolutional neural network (CNN). Two types of samples are considered challenging: 1) samples with detection scores falling in the decision boundary, and 2) temporally associated samples with inconsistent scores. A weighting scheme is designed for each of them. Finally, we train a boosted fusion layer to benefit from the integration of these two weighting schemes. We use the Fast-RCNN as the baseline, and test the corresponding BCNN on the Caltech pedestrian dataset in the experiment, and show a significant performance gain of the BCNN over its baseline.”
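To make the idea of a weighted loss that emphasizes samples near the decision boundary concrete, here is a minimal sketch. The Gaussian-shaped weight, its parameters `alpha` and `sigma`, and the function names are hypothetical choices for illustration and are not the weighting scheme from the paper. The second weighting scheme (temporally associated samples with inconsistent scores) and the boosted fusion layer are not covered.

```python
import math

def boundary_weight(score, alpha=2.0, sigma=0.15):
    """Hypothetical weight that peaks when the score sits at 0.5."""
    return 1.0 + alpha * math.exp(-((score - 0.5) ** 2) / (2.0 * sigma ** 2))

def log_loss(score, label, eps=1e-12):
    """Binary cross-entropy for one sample, with the score clipped for safety."""
    p = min(max(score, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))

def weighted_loss(scores, labels):
    """Weighted average of per-sample losses, emphasizing boundary samples."""
    weights = [boundary_weight(s) for s in scores]
    losses = [log_loss(s, y) for s, y in zip(scores, labels)]
    return sum(w * l for w, l in zip(weights, losses)) / sum(weights)

# Two confident samples and one near the decision boundary (score 0.55).
scores = [0.95, 0.55, 0.05]
labels = [1, 1, 0]
unweighted = sum(log_loss(s, y) for s, y in zip(scores, labels)) / len(scores)
weighted = weighted_loss(scores, labels)
print(unweighted, weighted)
```

Because the sample scored 0.55 receives the largest weight, the weighted loss comes out higher than the plain average, which is the effect a boundary-emphasizing scheme aims for during training.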

Siyang’s paper is entitled “Box Refinement: Object Proposal Enhancement and Pruning”, co-authored with Heming Zhang, Junting Zhang, Yuzhuo Ren and C.-C. Jay Kuo. The summary is as follows:

“Object proposal generation has been an important preprocessing step for object detectors in general and the convolutional neural network (CNN) detectors in particular. Recently, people start to use the CNN to generate object proposals but most of these methods suffer from the localization bias problem, like other objectness-based methods. Since contours offer a powerful cue for accurate localization, we propose a box refinement method by searching for the optimal contour for each initial bounding box that minimizes the contour cost. [...]

By Yiyue Zhang|April 5th, 2017|News|Comments Off|

MCL Works on Text Localization

Spotting text in a natural scene image is a challenging task. It involves localizing text in the image and then recognizing the text in the localized image patches. To tackle this problem, traditional optical character recognition (OCR) techniques, which are designed specifically for black-and-white text content, have given way to more sophisticated methods such as neural networks.

MCL member Yuanhang Su is now collaborating with Inha University, Korean Air and the Pratt & Whitney Institute for Collaborative Engineering (PWICE) to build a text spotting system. Our lab has developed a comprehensive text spotting system that can localize and recognize text in natural scene images using a combined convolutional neural network (CNN) and recurrent neural network (RNN) architecture. Our system can handle both English and Korean text.

By Yiyue Zhang|April 2nd, 2017|News|Comments Off|
