USC Media Communications Lab

Congratulations to Ron Salloum for Passing His Defense

Congratulations to Ron Salloum for passing his PhD defense on April 10, 2019. His PhD thesis is entitled “A Data-Driven Approach to Image Splicing Localization”.

Abstract:

The availability of low-cost and user-friendly editing software has made it significantly easier to manipulate images. Thus, there has been an increasing interest in developing forensic techniques to detect and localize image manipulations or forgeries. Splicing, which is one of the most common types of image forgery, involves copying a region from one image (referred to as the donor image) and pasting it onto another image (referred to as the host image). Forgers often use splicing to give a false impression that there is an additional object present in the image, or to remove an object from the image.

Many of the current splicing detection methods only determine whether a given image has been spliced and do not attempt to localize the spliced region. Relatively few methods attempt to tackle the splicing localization problem, which refers to the problem of determining which pixels in an image have been manipulated as a result of a splicing operation.
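
To make the difference between detection and localization concrete, the following is a minimal Python sketch of a splicing operation together with the per-pixel ground-truth mask that a localization method tries to recover. The function name `splice`, the tiny list-of-lists grayscale images, and the rectangular region are illustrative assumptions and are not taken from the dissertation.

```python
def splice(host, donor, top, left, height, width, dst_top, dst_left):
    """Copy a rectangle from `donor` and paste it onto a copy of `host`.

    Returns the spliced image and a binary mask marking pasted pixels.
    Images are lists of rows of grayscale values (a simplifying assumption).
    """
    spliced = [row[:] for row in host]          # leave the host untouched
    mask = [[0] * len(row) for row in host]     # 1 = manipulated pixel
    for dy in range(height):
        for dx in range(width):
            spliced[dst_top + dy][dst_left + dx] = donor[top + dy][left + dx]
            mask[dst_top + dy][dst_left + dx] = 1
    return spliced, mask


host = [[0] * 4 for _ in range(4)]      # 4x4 all-zero host image
donor = [[9] * 4 for _ in range(4)]     # 4x4 all-nine donor image
spliced, mask = splice(host, donor, 0, 0, 2, 2, dst_top=1, dst_left=1)

# Detection answers one question per image; localization answers it per pixel.
is_spliced = any(any(row) for row in mask)
```

A detection method would only output something like `is_spliced`, whereas a localization method has to estimate the whole `mask`.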

In my dissertation, I present two different splicing localization methods that we have developed. The first is the Multi-task Fully Convolutional Network (MFCN), which is a neural-network-based method that outperforms previous methods on many datasets. The second proposed method is based on cPCA++ (where cPCA stands for contrastive Principal Component Analysis), which is a new data visualization and clustering technique that we have developed. The cPCA++ method is more efficient than the MFCN and achieves comparable performance.

PhD Experience:

Pursuing my PhD degree was a very challenging but rewarding experience. I really enjoyed my time in the Media Communications Laboratory and had the opportunity to work on exciting research projects. [...]

By Xuejing Lei|April 16th, 2019|News|Comments Off|

MCL Research on Fake Image Detection

With the rapid development of image processing technology, generating an image without obvious visual artifacts has become much easier. Progressive GANs can generate high-resolution images that can almost fool the human eye, which makes fake image detection a necessity. Currently, many researchers use convolutional neural network (CNN) based methods for GAN image detection. They build deeper and deeper networks, such as XceptionNet, to improve the ability to distinguish real images from fake ones. These CNN-based methods have achieved very high accuracy of more than 99%.
We want to build a method that is more interpretable than these, uses no back-propagation, and achieves similar accuracy. In our method, we first detect 68 facial landmarks in both real and fake images. We then extract 32×32 patches centered on the 68 facial landmarks. Those patches, together with their labels, are fed into a two-layer Saab architecture. After two fully connected layers, the probabilities of a patch being fake or real are stored in a 2×1 output vector. We train one model per facial landmark, 68 models in total. The 68 resulting 2×1 vectors are fed into an SVM classifier, which outputs the decision of whether the whole image is fake or real.
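
The patch extraction step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the landmark coordinates are taken as given (the landmark detector, the Saab layers and the SVM are not shown), the helper names `extract_patch` and `extract_landmark_patches` are hypothetical, and shifting the window inward at image borders is a choice made here, since the text does not say how borders are handled.

```python
PATCH = 32  # patch side length, as described above


def extract_patch(image, cx, cy, size=PATCH):
    """Return a size x size patch centered near (cx, cy).

    `image` is a list of rows and must be at least size x size. The window
    is shifted inward when the landmark is too close to the border (an
    assumption; the text does not say how borders are handled).
    """
    height, width = len(image), len(image[0])
    left = min(max(cx - size // 2, 0), width - size)
    top = min(max(cy - size // 2, 0), height - size)
    return [row[left:left + size] for row in image[top:top + size]]


def extract_landmark_patches(image, landmarks):
    """One patch per landmark; patch i would go to the i-th model."""
    return [extract_patch(image, x, y) for x, y in landmarks]


# Toy 64x64 image whose pixel value encodes its own (row, column) position.
image = [[(y, x) for x in range(64)] for y in range(64)]
landmarks = [(32, 32), (2, 60)]          # hypothetical (x, y) landmark positions
patches = extract_landmark_patches(image, landmarks)
```

With 68 landmarks, `patches` would hold 68 patches per face, and each position in that list would feed its own model.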
Author: Yao Zhu

By Xuejing Lei|April 8th, 2019|News|Comments Off|

MCL Research on Point-cloud Analysis

With the rise of visualization, animation and autonomous driving applications, the demand for 3D point cloud analysis and understanding has rapidly increased. A point cloud is a type of data obtained from lidar scanning that contains abundant 3D information. Our research directions for point clouds in autonomous driving are object detection, segmentation and classification.

Because point clouds are unstructured and unordered, people usually convert them into other data types such as meshes, voxels and multi-view images. However, the conversion inevitably causes information loss. Recently, several deep learning solutions tailored to point clouds, such as PointNet/PointNet++ [1, 2], provide a more efficient and flexible way to handle 3D data. Some successful results for object classification, part segmentation and semantic scene segmentation have been demonstrated. However, object and scene understanding with Convolutional Neural Networks (CNNs) on 3D volumetric data is still limited due to its high memory requirement and computational cost. This poses a challenge for autonomous driving, which requires real-time and concise processing of the observed scenes and objects.
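
One way networks in the PointNet family cope with unordered input is to apply the same function to every point and then aggregate with a symmetric operation such as max pooling. The sketch below illustrates only that idea in plain Python. The per-point feature map `point_features` is a hypothetical stand-in for a learned shared network, and the three-point cloud is toy data.

```python
import random


def point_features(point):
    """Toy per-point feature map standing in for a shared MLP (hypothetical)."""
    x, y, z = point
    return (x + y, y * z, x - z)


def global_feature(points):
    """Element-wise max over per-point features: a symmetric function."""
    feats = [point_features(p) for p in points]
    return tuple(max(column) for column in zip(*feats))


cloud = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5), (1.5, 1.5, 1.5)]
shuffled = cloud[:]
random.Random(0).shuffle(shuffled)

# The same set of points in a different order gives the same descriptor.
same = global_feature(cloud) == global_feature(shuffled)
```

Because `max` does not depend on the order of its arguments, no conversion to a mesh, voxel grid or set of views is needed to obtain an order-independent descriptor.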

An interpretable CNN design based on the feedforward (FF) methodology [3] without any backpropagation (BP) was recently proposed by the Media Communications Lab at USC. The FF design offers a complementary approach to CNN filter weight selection. We are now designing an FF network for both object classification and indoor scene segmentation. The advantages of the FF design methodology are manifold. It is completely interpretable. It demands much less training complexity and training data. Furthermore, it can be generalized to weakly supervised or unsupervised learning scenarios in a straightforward manner. The latter is extremely important in real-world application scenarios since data labeling is very tedious and expensive.

References:

R. Qi, H. Su, K. Mo, and L. J. Guibas. [...]

By Xuejing Lei|April 1st, 2019|News|Comments Off|

MCL Research on Domain Adaptation

Domain adaptation is a form of transfer learning that aims to learn a model from a source data distribution and apply it to target data with a different distribution. The tasks in the source and target domains are the same, for example both are image classification or both are image segmentation. There are three types of domain adaptation, differing in how many target samples have ground-truth labels. In supervised and semi-supervised domain adaptation, all or part of the target data is labeled, respectively, while in unsupervised domain adaptation all target data is unlabeled.

Several classical methods address the domain shift problem through feature alignment in unsupervised domain adaptation. [1] maps data from the source and target domains into one subspace learned by reducing the distribution distance measured by maximum mean discrepancy. [2] aligns the eigenvectors of the two domains by learning a linear mapping function. [3] uses geometric and statistical changes between the source and target domains to build an infinite number of subspaces and integrates them. With the increasing popularity of deep learning, many methods [4,5,6] use CNNs or GANs for domain adaptation. However, those methods have a high computational cost due to back-propagation, and GAN-related methods are unstable in training. Besides, deep learning based methods generalize weakly from one domain to another.
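
As a rough illustration of the distribution distance mentioned for [1], the sketch below computes a biased empirical estimate of the squared maximum mean discrepancy (MMD) between two samples. The Gaussian kernel, its bandwidth, the function names and the toy 2-D samples are all assumptions made here for illustration. The subspace learning step of [1] is not shown.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """RBF kernel between two feature vectors (bandwidth is an assumption)."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, kernel):
    """Average kernel value over all pairs drawn from xs and ys."""
    return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd_squared(source, target, kernel=gaussian_kernel):
    """Biased empirical estimate of squared MMD between two samples."""
    return (mean_kernel(source, source, kernel)
            + mean_kernel(target, target, kernel)
            - 2.0 * mean_kernel(source, target, kernel))


source = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1)]
near = [(0.1, 0.0), (0.0, 0.1), (-0.2, -0.1)]      # similar distribution
far = [(3.0, 3.0), (3.2, 2.9), (2.9, 3.1)]         # shifted distribution
```

Calling `mmd_squared(source, far)` gives a much larger value than `mmd_squared(source, near)`, which is the quantity a feature alignment method would try to reduce.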

Professor Kuo has proposed several explanations of deep learning since 2014. The Saak and Saab transforms give a way to extract feature representations of images, and the original images can be reconstructed from the feature representations through the inverse transforms. This gives us a new way to handle the domain adaptation task. We are now working on aligning Saab features [...]

By Xuejing Lei|March 25th, 2019|News|Comments Off|

MCL Research on Active Learning

Deep learning has shown its effectiveness in various computer vision tasks. However, a large amount of labeled data is usually needed for deep learning approaches. Active learning can help reduce the labeling effort by choosing the most informative samples to label, thus achieving comparable performance with less labeled data.

There are two major types of active learning strategies: uncertainty-based and diversity-based.

The core idea of uncertainty-based methods is to label the samples about which the existing model, trained on the current labeled set, is most uncertain. For example, an image with a prediction of 50 percent cat is empirically considered more valuable than an image with a prediction of 99 percent cat, because the former has larger uncertainty. Besides using uncertainty metrics from information theory such as entropy, Beluch et al. [1] propose using an ensemble to estimate the uncertainty of unlabeled images and achieve good results on the ImageNet dataset.
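
The entropy criterion can be sketched as follows. This is a minimal Python illustration, not the ensemble method of [1]: the function names, the labeling `budget` parameter and the three example predictions are assumptions chosen to mirror the cat example.

```python
import math


def entropy(probs):
    """Shannon entropy (in bits) of a predicted class distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def most_uncertain(predictions, budget):
    """Indices of the `budget` unlabeled samples with the highest entropy."""
    ranked = sorted(range(len(predictions)),
                    key=lambda i: entropy(predictions[i]),
                    reverse=True)
    return ranked[:budget]


# Hypothetical softmax outputs for three unlabeled images: (cat, not cat).
predictions = [(0.99, 0.01), (0.50, 0.50), (0.80, 0.20)]
to_label = most_uncertain(predictions, budget=2)
```

Here the 50 percent image is ranked first and the 99 percent image last, so with a budget of two the 99 percent image is the one left unlabeled.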

In contrast, diversity-based methods rely on the assumption that a more diverse set of images chosen as the training set leads to better performance. Sener et al. [2] formalize the active learning problem as a core-set problem and achieve competitive performance on the CIFAR-10 dataset. Mixed-integer programming is used to solve their objective function.

Our current research focuses on balancing the two factors (uncertainty and diversity) in an explainable way.

References:
[1] Beluch, William H., et al. “The power of ensembles for active learning in image classification.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
[2] Sener, Ozan, and Silvio Savarese. “Active learning for convolutional neural networks: A core-set approach.” (2018).

Author: Yeji Shen

By Xuejing Lei|March 18th, 2019|News|Comments Off|

Professor Kuo received IEEE Computer Society 2019 Edward J. McCluskey Technical Achievement Award

Dr. C.-C. Jay Kuo, Distinguished Professor of Electrical Engineering and Computer Science, and the Director of the Media Communications Laboratory at USC, has been selected to receive the IEEE Computer Society 2019 Edward J. McCluskey Technical Achievement Award, for “outstanding contributions to multimedia computing technologies and their applications.”

Professor Kuo is a world-renowned technical leader in multimedia computing technologies, systems and applications, with an enduring impact on both academia and industry over the last three decades. He has made seminal contributions to video coding technologies in three areas: fast motion search, H.264 rate control, and perceptual coding. Professor Kuo’s deblocking filter and rate control technologies are widely used in video capturing devices such as smartphone cameras. Furthermore, he conducted extensive work in applying wavelets to image processing tasks such as texture analysis, curve representation, fractal analysis, watermarking and data hiding. Recently, he has focused on machine learning, artificial intelligence, and computer vision and has developed a mathematical model that sheds light on the mysterious behavior of deep learning networks.

Professor Kuo said, “It is a great honor to be named as the recipient of the IEEE Computer Society 2019 Edward J. McCluskey Technical Achievement Award. There are many outstanding researchers in the field, and I am truly humbled for this recognition.”

For more, please click https://www.computer.org/press-room/2019-news/2019-edward-j-mccluskey-technical-achievement-award-c-c-jay-cuo

By Xuejing Lei|March 11th, 2019|News|Comments Off|

MCL Research on Explainable Deep Learning

Deep learning technologies such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have had a great impact on modern machine learning due to their impressive performance in many application fields that involve learning, modeling, and processing of complex sensing data. Yet, the working principle of deep learning remains mysterious. Furthermore, it has several well-known weaknesses: 1) vulnerability to adversarial attacks, 2) a demand for heavy supervision, 3) weak generalizability from one domain to another. Professor Kuo and his PhD students at the Media Communications Lab (MCL) have been working on explainable deep learning since 2014 and have published a sequence of pioneering papers on this topic.

Explanation of nonlinear activation, convolutional filters and discriminability of trained features of CNNs [1]-[3]. The role of a CNN’s nonlinear activation function is explained for the first time in [1]. That is, the nonlinear activation operation is used to resolve the sign confusion problem caused by the cascade of convolutional operations in multiple layers. This work received the 2018 best paper award from the Journal of Visual Communication and Image Representation. The convolutional filters are viewed as rectified correlations on a sphere (RECOS), and a CNN’s operation is interpreted as a multi-layer RECOS transform in [2]. The discriminability of trained features of a CNN at different convolution layers is analyzed using two quantitative metrics in [3]: the Gaussian confusion measure (GCM) and the cluster purity measure (CPM). The analysis is validated by experimental results.

Saak transform and its application to adversarial attacks [4]-[5]. Being inspired by deep learning, we develop a new mathematical transform called the Saak (Subspace approximation with augmented kernels) transform in [4]. The Saak and inverse Saak transforms provide signal analysis and synthesis tools, respectively. CNNs are known to [...]

By Xuejing Lei|March 4th, 2019|News|Comments Off|

MCL Research on CNN Ensembles

CNN technology provides state-of-the-art solutions to many image processing and computer vision problems. Given a CNN architecture, all of its parameters are determined by the stochastic gradient descent (SGD) algorithm through backpropagation (BP). The BP training demands a high computational cost. Furthermore, most CNN publications are application-oriented. There is a limited amount of progress after the classical result in [1]. Examples include explainable CNNs [2,3,4] and feedforward designs without backpropagation [5,6].

The determination of CNN model parameters in the one-pass feedforward (FF) manner was recently proposed by Kuo et al. in [6]. It derives network parameters of a target layer based on statistics of output data from its previous layer. No BP is used at all. This feedforward design provides valuable insights into the CNN operational mechanism. Besides, under the same network architecture, its training complexity is significantly lower than that of the BP design CNN. FF-designed and BP-designed CNNs are denoted by FF-CNNs and BP-CNNs, respectively.

We focus on solving the image classification problem based on the feedforward-designed convolutional neural networks (FF-CNNs) [6]. An ensemble method that fuses the output decision vectors of multiple FF-CNNs to solve the image classification problem is proposed. To enhance the performance of the ensemble system, it is critical to increase the diversity of FF-CNN models. To achieve this objective, we introduce diversity by adopting three strategies: 1) different parameter settings in convolutional layers, 2) flexible feature subsets fed into the fully connected (FC) layers, and 3) multiple image embeddings of the same input source. Furthermore, we partition input samples into easy and hard ones based on their decision confidence scores. As a result, we can develop a new ensemble system tailored to hard samples to further boost classification accuracy. Although [...]
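
The two ideas of fusing decision vectors and separating easy from hard samples can be sketched in Python as below. The text does not specify the fusion rule or the confidence score, so plain averaging, the maximum fused probability as confidence, the 0.8 threshold and the function names are all assumptions made for illustration rather than the proposed design.

```python
def fuse(decision_vectors):
    """Average the class-probability vectors produced by several models."""
    n_models = len(decision_vectors)
    return [sum(column) / n_models for column in zip(*decision_vectors)]


def split_easy_hard(fused_vectors, threshold=0.8):
    """Partition sample indices by the confidence of the fused decision."""
    easy, hard = [], []
    for index, vector in enumerate(fused_vectors):
        (easy if max(vector) >= threshold else hard).append(index)
    return easy, hard


# Hypothetical outputs of three models for two samples and three classes.
sample_0 = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [1.0, 0.0, 0.0]]
sample_1 = [[0.5, 0.4, 0.1], [0.2, 0.6, 0.2], [0.4, 0.3, 0.3]]
fused = [fuse(sample_0), fuse(sample_1)]
easy, hard = split_easy_hard(fused)
```

In this toy case the models agree on the first sample, so it lands in `easy`, while their disagreement on the second sample lowers the fused confidence and puts it in `hard`, the subset a second ensemble could then be tailored to.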

By Wei Wang|February 26th, 2019|News|Comments Off|

Welcome New MCL Member Haoxuan You

Welcome New MCL Member Haoxuan You! Here is a short interview with Haoxuan:

1. Could you briefly introduce yourself and your research interests?

My name is Haoxuan You. I received my bachelor’s degree at Xidian University in China last July and spent my gap year at Tsinghua University as a research assistant. Now I am on a six-month visit to USC. Generally speaking, my research interest lies in computer vision and machine learning. I hope to model the 3D environment through effective 3D data processing.

2. What is your impression about MCL and USC?

My first impression of USC is the sunshine and the beautiful campus. Then I found MCL to be a warm and big family with very kind people. The seminar lunches and the group meetings all give me a good opportunity to get involved and help me broaden my horizons. I feel really lucky to be with these lovely friends.

3. What is your future expectation and plan in MCL?

I hope to have a better understanding of how to be a good researcher and independent thinker from Prof. Kuo and other MCL members. And I wish to do some more valuable work such as the interpretable network in 3D data processing and 3D object detection.

By Wei Wang|February 16th, 2019|News|Comments Off|

Welcome New MCL Member Prof. Yongfei Zhang

Welcome new MCL member Prof. Yongfei Zhang! Prof. Zhang is now a visiting scholar at the Media Communications Lab (MCL) at the University of Southern California (USC) under the supervision of Prof. C.-C. Jay Kuo. He is an Associate Professor with the Beijing Key Laboratory of Digital Media, School of Computer Science and Engineering, and the State Key Laboratory of Virtual Reality Technology and Systems, Beihang University, Beijing, China. He was a visiting scholar at the University of Missouri, Columbia from Sept. 2007 to Sept. 2009.

Here is a short interview with Prof. Zhang:

1. Could you briefly introduce yourself and your research interests?

I am Yongfei Zhang, an Associate Professor at Beihang University. I received my B.S. degree in Electrical Engineering and Ph.D. degree in Pattern Recognition and Intelligent Systems, both from Beihang University, Beijing, China, in 2005 and 2011 respectively. My current research interests include Real-time High Efficiency Video Coding/Decoding, Computer Vision (Image Retrieval, Vehicle/Person Re-ID), and Construction & Referencing of Domain-Specific Knowledge Graph.

2. What is your impression about MCL and USC?

MCL is a large and warm group with a great director and students. Group routine activities, like individual meetings and weekly group seminars, are clearly scheduled and well organized with the contributions of each MCL member, which makes the research and work very efficient. Besides, Prof. Kuo’s passion for academic research and group culture development impresses me a lot.

3. What is your future expectation and plan in MCL?

Great thanks to Prof. Kuo for hosting me in the big family of MCL. I wish I could enhance myself, with the help and guidance of Prof. Kuo, on teaching, research, and group management. Also, I’d like to make friends with all MCL members. Academically, I wish I could proceed/cooperate on [...]

By Wei Wang|February 9th, 2019|News|Comments Off|
