MCL Works on Text Localization

Spotting text in a natural scene image is a challenging task. It involves localizing text in the image and then recognizing the text in the localized image patches. To tackle this problem, traditional optical character recognition (OCR) techniques, which are designed specifically for clean black-and-white text, give way to more sophisticated methods such as neural networks.

Yuanhang Su, one MCL member, is collaborating with Inha University, Korean Air, and the Pratt & Whitney Institute for Collaborative Engineering (PWICE) to build a text spotting system. Our lab has developed a comprehensive text spotting system that localizes and recognizes text in natural scene images using a combined convolutional neural network (CNN) and recurrent neural network (RNN) architecture. The system handles both English and Korean text.

By |April 2nd, 2017|News|Comments Off on MCL Works on Text Localization|

MCL Works on Deep Learning based Fashion Fingerprinting

A fashion fingerprint is a compact feature vector for fashion items that can be used for tasks such as recognition, clustering, and retrieval of similar items. It is useful both for online fashion retailers and for physical apparel stores (with or without online extensions). A related problem is understanding the apparel preferences of an individual from the clothes they wear while visiting a physical store. One challenge that sets fashion study apart is the lack of sufficiently accurate annotations: available datasets have either a limited number of images or very noisy annotations.

Currently, we have successfully trained a fashion item localization model based on SSD [1]. The model localizes upper clothes, bottom clothes, and one-pieces, and has been tested on the Clothing Parsing dataset [2], where it achieves an F-score of 0.887 on upper-clothes localization. For other clothing items, errors occur because the model may focus on overly local regions and thus confuse skirts with dresses. We plan to incorporate a prior on human body location into the model to address this problem.
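Localization accuracy of this kind is typically scored by intersection-over-union (IoU) between predicted and ground-truth boxes, with precision, recall, and the F-score derived from matches above a threshold (0.5 is a common choice). A minimal sketch (the helper below and the threshold convention are illustrative assumptions, not our exact evaluation code):

```python
# IoU between two axis-aligned boxes given as (x1, y1, x2, y2).
# A detection is usually counted as correct when its IoU with the
# ground-truth box exceeds a threshold, from which precision,
# recall, and the F-score follow.

def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap rectangle (clamped to zero width/height if disjoint)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # half-overlapping boxes -> 1/3
```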

In the future, we will further refine our localization model and pursue two directions: recognizing garments based on our localization results, and automatically labeling more images to enlarge the datasets.

[1] Liu, Wei, et al. “SSD: Single shot multibox detector.” European Conference on Computer Vision. Springer International Publishing, 2016.
[2] Liang, Xiaodan, et al. “Deep human parsing with active template regression.” IEEE transactions on pattern analysis and machine intelligence 37.12 (2015): 2402-2414.

By |March 26th, 2017|News|Comments Off on MCL Works on Deep Learning based Fashion Fingerprinting|

MCL Works on Splicing Image Detection

With the advent of Web 2.0 and ubiquitous adoption of low-cost and high-resolution digital cameras, users upload and share images on a daily basis. This trend of public image distribution and access to user-friendly editing software such as Photoshop and GIMP has made image forgery a serious issue. Splicing is one of the most common types of image forgery. It manipulates images by copying a region from one image (i.e., the donor image) and pasting it onto another image (i.e., the host or spliced image). Forgers often use splicing to give a false impression that there is an additional object present in the image, or to remove an object from the image. A spliced image from the Columbia Uncompressed [1] dataset is shown above. Image splicing can potentially be used in generating false propaganda for political purposes. For example, during the 2004 US Presidential election campaign, an image that showed John Kerry and Jane Fonda speaking together at an anti-Vietnam war protest was released and circulated. It was discovered later that this was a spliced image, and was created for political purposes. The spliced image and the two corresponding authentic images can be seen above [2].

Early work on image splicing detection only deduced whether a given image had been spliced, without attempting to localize the spliced region. Joint splicing detection and localization has been studied only in recent years. In image splicing localization, one must determine which pixels in an image have been manipulated by a splicing operation.
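Splicing localization is thus a per-pixel binary classification problem, commonly scored with pixel-level F1 against a ground-truth mask. As an illustrative sketch only (the metric choice is an assumption, not necessarily the one used in this project):

```python
def pixel_f1(pred, truth):
    """Pixel-level F1 between a predicted and a ground-truth binary
    mask, each given as a flat list of 0/1 (1 = spliced pixel)."""
    tp = sum(p and t for p, t in zip(pred, truth))          # true positives
    fp = sum(p and not t for p, t in zip(pred, truth))      # false positives
    fn = sum(t and not p for p, t in zip(pred, truth))      # false negatives
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

pred  = [1, 1, 0, 0, 1]  # toy 5-pixel masks
truth = [1, 0, 0, 1, 1]
print(pixel_f1(pred, truth))  # tp=2, fp=1, fn=1 -> F1 = 2/3
```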

One of the MCL members, Ronald Salloum, is currently working on an image splicing localization research project funded by the Defense Advanced [...]

By |March 22nd, 2017|News|Comments Off on MCL Works on Splicing Image Detection|

MCL Works on Automatic Medical Image Segmentation with Convolutional Neural Networks

Automatic image segmentation has long been an important topic in medical imaging. Many medical applications, such as delineating heart structures, rely heavily on accurate segmentation results. Manual segmentation is still required in many applications, and it is not only time-consuming and tedious but also prone to human error. One of the MCL members, Ruiyuan Lin, is working on this research topic.

Many methods have been proposed to automate the segmentation process, ranging from region growing and active contour models to multi-atlas segmentation. In our research, we focus on convolutional neural network (CNN) based segmentation. We experimented with several segmentation networks, such as fully convolutional networks (FCN) and residual networks, compared their performance with other methods, and analyzed their strengths and weaknesses. We plan to further explore the use of CNNs on more complicated medical images, such as cross-domain images.
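Segmentation quality in such studies is often reported with the Dice coefficient between the predicted and ground-truth masks. As a hedged sketch (the exact evaluation metric used in our experiments is not stated here):

```python
def dice(pred, truth):
    """Dice coefficient between two binary segmentation masks,
    given as flat lists of 0/1 labels (1 = the target structure).

    Dice = 2|A ∩ B| / (|A| + |B|), from 0 (no overlap) to 1.
    """
    inter = sum(p * t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Convention: two empty masks count as a perfect match.
    return 2 * inter / total if total else 1.0

pred  = [1, 1, 1, 0, 0]  # toy 5-pixel masks
truth = [0, 1, 1, 1, 0]
print(dice(pred, truth))  # 2*2 / (3+3) = 2/3
```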

Image credit: both images are modified from the MRI images in the Left Atrium Segmentation Challenge dataset:
Tobon-Gomez C, Geers AJ, Peters J, Weese J, Pinto K, Karim R, Ammar M, Daoudi A, Margeta J, Sandoval Z, Stender B, Zheng Y, Zuluaga MA, Betancur J, Ayache N, Chikh MA, Dillenseger J-L, Kelm BM, Mahmoudi S, Ourselin S, Schlaefer A, Schaeffter T, Razavi R, Rhode KS. Benchmark for Algorithms Segmenting the Left Atrium From 3D CT and MRI Datasets. IEEE Transactions on Medical Imaging, 34(7):1460–1473, 2015.

By |March 5th, 2017|News|Comments Off on MCL Works on Automatic Medical Image Segmentation with Convolutional Neural Networks|

MCL Works on User’s Experience on Head-Mounted VR Devices

Virtual reality, or more precisely the head-mounted display (HMD), has become increasingly popular in recent years. With the release of consumer-level products such as the Oculus Rift and HTC Vive, it is no longer difficult for users to visit the virtual world. The immersive experience often amazes first-time users. However, adverse effects such as motion sickness are sometimes reported during play, so it is important to understand these side effects better.

Our research focuses on the qualitative and, further, quantitative measurement of motion sickness in virtual reality. With a reliable measurement of motion sickness, we can not only control and even avoid this effect, but also develop a research paradigm for measuring similar subjective feelings.

Currently, we have proposed a physically sound, as well as practically feasible, model to explain and quantify motion sickness in virtual reality. Our initial small-scale experiments have shown evidence supporting the model. Will it hold up in further experiments? Only nature can tell. Yet isn't the endeavor to understand such a complex phenomenon exactly what research is about?

By |February 27th, 2017|News|Comments Off on MCL Works on User’s Experience on Head-Mounted VR Devices|

MCL Works on Road Detection for Autonomous Driving

Advanced driver assistance systems (ADAS) have attracted increasing attention in recent years, as various IT technologies are introduced into vehicles to enhance driving safety and automation.

MCL members Junting Zhang and Yuhang Song, together with MediaTek Inc., started a collaborative research project on ADAS-oriented deep learning technologies in January 2016. Single-image-based traffic scene segmentation and road detection were studied extensively throughout 2016. We adapted state-of-the-art general-purpose CNN architectures to the urban scene semantic segmentation task, overcoming the cross-domain issue. Since computational and memory efficiency have always been major concerns, we were also devoted to simplifying the network structure and reducing redundant computation.

In 2017, we will explore deep learning technologies for video processing. Although there are many interesting results in semantic urban scene understanding based on CNN technology, semantic video understanding remains a challenging problem. We will seek a semantic video understanding method that outperforms single-image-based algorithms by exploiting temporal information.
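One simple way to exploit temporal information, shown here only as an illustrative baseline rather than our planned method, is to smooth per-frame semantic predictions with a sliding majority vote:

```python
from collections import Counter

def temporal_vote(frame_labels, window=3):
    """Smooth per-frame class labels with a sliding majority vote.

    frame_labels: one predicted class label per video frame.
    window: odd window size; each frame is relabeled by the majority
    class among its neighbors, suppressing one-frame flicker.
    """
    half = window // 2
    smoothed = []
    for i in range(len(frame_labels)):
        lo, hi = max(0, i - half), min(len(frame_labels), i + half + 1)
        votes = Counter(frame_labels[lo:hi])
        smoothed.append(votes.most_common(1)[0][0])
    return smoothed

# A spurious one-frame flip of "road" to "car" is removed:
print(temporal_vote(["road", "road", "car", "road", "road"]))
# -> ['road', 'road', 'road', 'road', 'road']
```

This treats each pixel or frame label independently in time; stronger approaches propagate features rather than labels, which is the direction hinted at above.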

By |February 19th, 2017|News|Comments Off on MCL Works on Road Detection for Autonomous Driving|

MCL Works on Drone Detection for Airport Safety

The growing popularity of commercial and recreational drones poses a new threat to airline safety. In Fall 2016, USC MCL, Inha University, Korean Air, and the Pratt & Whitney Institute for Collaborative Engineering (PWICE) started a joint research project to build a drone monitoring system to improve airport security. USC MCL is responsible for providing an autonomous imaging-based drone monitoring system. One of the MCL members, Yueru Chen, is working on this project.

On Thursday, February 9th, 2017, a midterm discussion of current projects was held among UTC Pratt & Whitney, Korean Air, USC, and Inha University. For the drone monitoring project, in attendance were the Media Communications Lab's Dr. Jay Kuo, Dr. Jongmoo Choi, Master's student Pranav Aggarwal, and Ph.D. student Yueru Chen, who presented their ongoing work on the imaging-based drone monitoring system. USC MCL has developed a comprehensive approach comprising two modules, detection and tracking, both using deep learning methods. The proposed system is designed to detect the position of unauthorized drones and track their movement. During the meeting, USC MCL showed promising results and discussed future plans.
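The tracking module must associate detections across frames. A minimal sketch of one common approach, greedy nearest-centroid matching (an illustration only, not necessarily the method used in this project; real systems typically add motion models such as Kalman filters):

```python
import math

def match_detections(tracks, detections, max_dist=50.0):
    """Greedily match existing tracks to new per-frame detections
    by centroid distance.

    tracks: {track_id: (x, y)} last known centroids.
    detections: list of (x, y) centroids detected in the new frame.
    max_dist: gating threshold in pixels (an assumed value).
    Returns {track_id: detection_index} assignments.
    """
    pairs = []
    for tid, (tx, ty) in tracks.items():
        for di, (dx, dy) in enumerate(detections):
            pairs.append((math.hypot(dx - tx, dy - ty), tid, di))
    pairs.sort()  # closest pairs first
    assigned, used_tracks, used_dets = {}, set(), set()
    for dist, tid, di in pairs:
        if dist > max_dist:
            break  # remaining pairs are even farther apart
        if tid not in used_tracks and di not in used_dets:
            assigned[tid] = di
            used_tracks.add(tid)
            used_dets.add(di)
    return assigned

tracks = {1: (10.0, 10.0), 2: (100.0, 100.0)}
detections = [(102.0, 98.0), (12.0, 11.0)]
print(match_detections(tracks, detections))  # -> {1: 1, 2: 0}
```

Unmatched detections would spawn new tracks, and tracks unmatched for several frames would be dropped.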

By |February 12th, 2017|News|Comments Off on MCL Works on Drone Detection for Airport Safety|

Congratulations to MCL Alumni for Joining Facebook, Google, Apple, and Bloomberg

We would like to congratulate five MCL alumni, Chen Chen, Shangwen Li, Hao Xu, Jian Li, and Xiaqing Pan, on passing their PhD defenses and starting their careers at great U.S. companies. This year, Chen and Shangwen are joining Facebook, Hao is joining Google, Jian is joining Apple, and Xiaqing is joining Bloomberg. We are glad to have them share their experiences and advice with us.

Chen Chen:
I joined MCL six years ago as a master's student. In my first seminar talk, Prof. Kuo's sharing totally changed my view of being a researcher. That was the moment I made the hard decision to become a PhD student. As expected, joining MCL brought unexpected benefits to my life. This precious platform provided many opportunities for me to learn and to grow, and allowed me to meet many excellent students and scholars. Working with them was like seeing myself in a mirror, which taught me to know myself before growing. MCL also created huge challenges that pushed me to the extreme. Every exam, project deadline, and paper submission was accompanied by thousands of hours of rigorous and humble teamwork. "Aim high and act low" is the most impressive lesson I learned from the hard process.
It is the precious experience at MCL that earned me the Facebook position at the end of my PhD. Different from what I learned in MCL, however, I expect to gain more working experience and entrepreneurship at Facebook. I would also like to see more students from our lab join Facebook to strengthen the MCL alumni team in the future.

Shangwen Li:
During my past internship at Facebook, I worked on a [...]

By |February 5th, 2017|News|Comments Off on Congratulations to MCL Alumni for Joining Facebook, Google, Apple, and Bloomberg|

Welcome New MCL Member Yuewei Na

We are so happy to welcome a new graduate member of MCL, Yuewei Na, in Fall 2016. Let’s give him a warm welcome. Here is a short interview with him.

1. Could you briefly introduce yourself? (Previous research experience, project experience, research interest and expertise)

Before coming to USC, I graduated from Xiamen University, majoring in computer science. I researched machine learning and low-level vision at Xiamen University. I also have experience building a fraud detection system for an e-commerce company using Spark.

2. What’s your first impression of USC and MCL?

USC is very similar to Xiamen University; both are beautiful and good places to study. MCL is like a big, warm family. I was deeply impressed by Prof. Kuo's understanding of various research topics.

3. What’s your future expectation for MCL?

I hope MCL can continue to produce ideas that are influential to the research community, or even to society as a whole. I am glad to contribute my effort toward this goal.


By |November 27th, 2016|News|Comments Off on Welcome New MCL Member Yuewei Na|

A Large-Scale Subjective Video Quality Database

MCL joined a collaborative project to build a large-scale subjective video quality database. The database is intended to enable a major breakthrough in video coding and processing. On one hand, revolutionary ideas, rather than fine-tuning patches, are needed to accommodate increasing video traffic. On the other hand, PSNR has been the dominant distortion metric for many years, but it has been criticized for not correlating well with perceptual quality. With this database, perceptual coding promises to open numerous R&D opportunities and revolutionary research with machine learning tools.
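The PSNR metric mentioned above is straightforward to compute, which partly explains its dominance. A minimal sketch for 8-bit frames:

```python
import math

def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio (in dB) between two 8-bit frames,
    given as flat lists of pixel values.

    PSNR = 10 * log10(peak^2 / MSE); it measures pixel-wise fidelity
    only, which is why it can disagree with perceived quality.
    """
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return float("inf")  # identical frames
    return 10 * math.log10(peak ** 2 / mse)

ref = [50, 100, 150, 200]  # toy 4-pixel frames
dst = [52, 98, 151, 199]
print(round(psnr(ref, dst), 2))  # MSE = 2.5 -> 44.15 dB
```

Because MSE weights all pixels equally, two distortions with identical PSNR can differ greatly in visual annoyance, which is precisely the gap a JND-based subjective database aims to capture.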

The database consists of 200 raw sequences, each 5 seconds long, encoded by H.264/AVC with fixed QP as the rate control method. They are available in 5 resolutions, from 3840×2160 down to 540×360. Around 1000 students participated in the subjective test, and it took around 7000 hours to collect sufficient samples for about 3 JND points. The database will be freely available for download for scientific purposes.

The project was supported by four major multimedia companies: Netflix, Huawei, MediaTek, and Samsung. Meanwhile, six universities in Shenzhen joined the project: Shenzhen Institutes of Advanced Technology (Chinese Academy of Sciences), Shenzhen University, the Graduate School at Shenzhen (Tsinghua University), Peking University Shenzhen Graduate School, The Chinese University of Hong Kong (Shenzhen), and City University of Hong Kong.

We would like to give special thanks to the participating companies, institutes and universities.

By |November 20th, 2016|News|Comments Off on A Large-Scale Subjective Video Quality Database|