Welcome New MCL Member Yifan Wang

We are so glad to welcome our new MCL member, Yifan Wang! Here is a short interview with Yifan:

1. Could you briefly introduce yourself and your research interests?

My name is Yifan Wang, and I am a graduate student in the Department of Electrical and Computer Engineering at USC. I received my bachelor’s degree in electrical engineering from Fudan University. As an undergraduate, I took courses in multiple fields, including digital and analog circuits and computer architecture. Among these, I found that my interests lie in vision, which is of vital importance to both humans and machines.

2. What is your impression of MCL and USC?

I have been at USC for two semesters. It is a really nice place for studying, since the few entertainment sites outside school, along with the very strong sunshine, keep me from going out regularly. MCL is a large, warm family where I have met lots of talented people. It will be nice to make new friends and learn new things in MCL.

3. What are your future expectations and plans in MCL?

I would like to learn more of the theory and mathematics related to computer vision, which will be the foundation of my future research. I would also like to gain more practice in deep learning, a field that relies heavily on hardware.


By |June 9th, 2019|News|Comments Off on Welcome New MCL Member Yifan Wang|

MCL Research on Word Embedding

Word embeddings have been widely applied across many NLP tasks. The goal of word embedding is to map words to vector representations that embed both syntactic and semantic information. A general-purpose word embedding is usually generated by training on a large corpus, such as the full text of Wikipedia.

Our first work focuses on improving pre-trained word embedding models to make them more representative. The motivations are as follows. (1) Although current models are trained without regard to the order of the dimensions, the resulting word embeddings usually carry a large mean, and most of the variance lies in the first several principal components. This can lead to the hubness problem, so we analyze these statistics to make the whole space more isotropic. (2) The information in ordered input sequences is lost because of the context-based training scheme. Based on this analysis, we propose two post-processing methods for word embeddings, called Post-processing via Variance Normalization (PVN) and Post-processing via Dynamic Embedding (PDE). The effectiveness of our methods is verified with both intrinsic and extrinsic evaluation methods. For details, please refer to [1].
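
To make the variance observation in motivation (1) concrete, here is a minimal NumPy sketch of one way to remove the mean and flatten the leading principal components. It is not the PVN algorithm from [1]. The function name, the rescaling rule (shrink each of the first few components to the spread of the next one) and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def normalize_leading_variance(embeddings, num_components):
    """Center the embeddings and shrink their top principal components.

    Assumed rule (illustrative only): each of the first `num_components`
    principal directions is rescaled so that its spread matches the spread
    of the next principal component.
    """
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Rows of vt are principal directions; s holds the singular values,
    # which are proportional to the standard deviation along each direction.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    target = s[num_components]
    out = centered.copy()
    for i in range(num_components):
        direction = vt[i]
        coeff = centered @ direction      # projection of every word vector
        shrink = 1.0 - target / s[i]      # fraction of the projection to remove
        out -= shrink * np.outer(coeff, direction)
    return out

# Synthetic stand-in for an embedding matrix: 500 "words", 20 dimensions,
# with a large mean and uneven variance across dimensions.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(500, 20)) * np.linspace(5.0, 1.0, 20) + 2.0
processed = normalize_leading_variance(vectors, num_components=3)

old_s = np.linalg.svd(vectors - vectors.mean(axis=0), compute_uv=False)
new_s = np.linalg.svd(processed, compute_uv=False)
print("leading singular values before:", np.round(old_s[:5], 2))
print("leading singular values after: ", np.round(new_s[:5], 2))
```

After processing, the mean is zero and the first four singular values are equal, so no small set of directions dominates the space.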

Word embedding has been very popular over the past several years, but it has mainly been evaluated with intrinsic evaluation methods because of their convenience. In the Natural Language Processing community, we care more about the effectiveness of word embeddings on real NLP tasks such as translation, sentiment analysis and question answering. Our second work focuses on word embedding quality and its relationship with evaluation methods. We discuss the criteria that a good word embedding, and a good evaluation method, should satisfy. We also discuss the properties of intrinsic evaluation methods, because different intrinsic evaluators test from different perspectives. Finally, [...]

By |June 2nd, 2019|News|Comments Off on MCL Research on Word Embedding|

MCL Research on Point Cloud Classification

With the rise of visualization, animation and autonomous driving applications, the demand for 3D point cloud analysis and understanding has rapidly increased. A point cloud is a kind of data obtained from lidar scanning that contains abundant 3D information. ModelNet40 is a point cloud dataset that contains 40 classes of objects. In this project, we use the ModelNet40 dataset for the analysis and evaluation of point cloud classification. Many recent works focus on developing end-to-end algorithms similar to the convolutional neural networks used for images. However, object and scene understanding with Convolutional Neural Networks (CNNs) on 3D volumetric data is still limited by high memory requirements and computational cost. For simple tasks like classification, such an approach is excessive.

An interpretable CNN design based on the feedforward (FF) methodology [1], without any backpropagation (BP), was recently proposed by the Media Communications Lab at USC. Our classification baseline is composed of four Saab units. Each unit contains a KNN query, space grouping and a Saab transform, and between units we use farthest point sampling to improve efficiency. We are still working to improve it as much as possible. Our goal is to catch up with state-of-the-art results and show that the FF design is powerful and useful. The advantages of the FF design methodology are manifold. It is completely interpretable. It demands much less training complexity and training data. Furthermore, it can be generalized to weakly supervised or unsupervised learning scenarios in a straightforward manner. The latter is extremely important in real-world application scenarios since data labeling is very tedious and expensive.
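
The sampling and grouping steps mentioned above can be sketched in a few lines of NumPy. This is only an illustration of farthest point sampling and a brute-force KNN query on a random point cloud. The function names, the cloud size and the choice of 32 centers with 8 neighbors are assumptions, and the Saab transform itself is not shown.

```python
import numpy as np

def farthest_point_sampling(points, num_samples, start_index=0):
    """Greedily pick indices so each new point is far from those already chosen."""
    selected = np.empty(num_samples, dtype=np.int64)
    selected[0] = start_index
    # min_dist[i] is the squared distance from point i to its nearest selected point.
    min_dist = np.sum((points - points[start_index]) ** 2, axis=1)
    for k in range(1, num_samples):
        nxt = int(np.argmax(min_dist))
        selected[k] = nxt
        dist = np.sum((points - points[nxt]) ** 2, axis=1)
        min_dist = np.minimum(min_dist, dist)
    return selected

def knn_query(points, centers, k):
    """For each center, return the indices of its k nearest points (brute force)."""
    diff = centers[:, None, :] - points[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    return np.argsort(sq_dist, axis=1)[:, :k]

# Random stand-in for one object: 256 points in 3D (hypothetical sizes).
rng = np.random.default_rng(0)
cloud = rng.normal(size=(256, 3))
center_idx = farthest_point_sampling(cloud, num_samples=32)
neighbors = knn_query(cloud, cloud[center_idx], k=8)
groups = cloud[neighbors]  # shape (32, 8, 3): one local neighborhood per center
print(groups.shape)
```

Each row of `neighbors` starts with the center itself, since a point is at distance zero from itself. The brute-force distance matrix is fine for small clouds. A spatial index would be a more natural choice for larger ones.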

By |May 26th, 2019|News|Comments Off on MCL Research on Point Cloud Classification|

MCL Members Attended the PhD Hooding Ceremony

Nine MCL members attended the Viterbi PhD hooding ceremony on Thursday, May 9, 2019, from 8:30-11:00 a.m. in the Bovard Auditorium. They were Fenxiao Chen, Yueru Chen, Ronald Salloum, Yuhang Song, Yuanhang Su, Ye Wang, Chao Yang, Heming Zhang, and Junting Zhang. Congratulations to them for their accomplishments in completing their PhD program at USC!

Fenxiao (Jessica) Chen received her B.S. degree in General Engineering from Harvey Mudd College, Claremont, CA in 2014. She then pursued her PhD in the Media Communications Lab at USC, starting in 2017. Her research interests include natural language processing and deep learning.

Yueru Chen received her Bachelor’s degree in Physics from the University of Science and Technology of China in June 2014. She joined MCL in 2015 for her PhD study. Her thesis topic is “Object Classification based on Neural-network-inspired Image Transforms”, where she focuses on solving the image classification problem based on the neural-network-inspired Saak transform and Saab transform.

Ronald Salloum received his B.S. degree in Electrical Engineering from California State Polytechnic University, Pomona, and his Ph.D. degree in Electrical Engineering from the University of Southern California (USC). The title of his dissertation is “A Data-Driven Approach to Image Splicing Localization.” His research interests include multimedia forensics, machine learning, and biometrics.

Yuhang Song received his Bachelor’s degree in Electronic Engineering from Tsinghua University, Beijing in 2014. He joined MCL in 2015 to pursue a Ph.D. degree in Electrical Engineering at USC. His research interests include deep generative models, image generation, visual relationship detection, and visual understanding.

Yuanhang Su received his Ph.D. from the University of Southern California (USC) in computer vision, natural language processing and machine learning. He received his M.S. degree from USC in 2010 and a dual B.S. degree from the University [...]

By |May 19th, 2019|News|Comments Off on MCL Members Attended the PhD Hooding Ceremony|

Congratulations to Harry Yang for Passing His Defense!

Congratulations to Harry Yang for passing his defense on May 7, 2019! Let us hear what he would like to say about his defense and an abstract of his thesis.

“In the thesis, we tackle the problem of translating faces and bodies between different identities without paired training data: we cannot directly train a translation module using supervised signals in this case. Instead, we propose to train a conditional variational auto-encoder (CVAE) to disentangle different latent factors such as identity and expressions. In order to achieve effective disentanglement, we further use multi-view information such as keypoints and facial landmarks to train multiple CVAEs. By relying on these simplified representations of the data, we are using a more easily disentangled representation to guide the disentanglement of the image itself. Experiments demonstrate the effectiveness of our method on multiple face and body datasets. We also show that our model is a more robust image classifier and adversarial example detector compared with traditional multi-class neural networks.

“To address the issue of scaling to new identities and also generate better-quality results, we further propose an alternative approach that uses self-supervised learning based on StyleGAN to factorize out different attributes of face images, such as hair color, facial expressions, skin color, and others. Using pre-trained StyleGAN combined with iterative style inference, we can easily manipulate the facial expressions or combine the facial expressions of any two people, without the need to train a specific new model for each identity involved. This is one of the first scalable and high-quality approaches for generating DeepFake data, which serves as a critical first step to learn a more robust and general classifier against adversarial examples.”

Harry also shared his Ph.D. experience:

“Firstly, I would [...]

By |May 12th, 2019|News|Comments Off on Congratulations to Harry Yang for Passing His Defense!|

Welcome New MCL Member Dr. Na Li

We are so glad to welcome our new MCL member, Dr. Na Li!

Dr. Li is an Assistant Professor at the Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences (CAS). Currently, she is a visiting scholar at MCL at USC, under the supervision of Prof. C.-C. Jay Kuo. Here is a short interview with Dr. Li:

1. Could you briefly introduce yourself and your research interests?

I’m Na Li, Ph.D., an Assistant Professor at the Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences (CAS). In 2009, I graduated from Hunan University, Changsha, with a Bachelor’s degree in Computer Science and Technology. In 2014, I received my Ph.D. degree from the Institute of Automation (IA), Chinese Academy of Sciences, Beijing. I have been with SIAT, CAS in Shenzhen since 2014. From Sep. 2013 to Nov. 2013, I interned as a research associate in the High Performance Department at SAP China (Beijing). I am a big fan of Mobius Band. I mainly focus on intelligent video processing and analysis. My research interests include video coding, crowd behavior analysis, reinforcement learning, optimization, scheduling algorithms and related fields.

2. What is your impression of MCL and USC?

MCL is a big family guided by a big father. It consists of a bunch of people with great minds. I very much like the seminar held every Friday, which is full of shared food, knowledge and experiences. Everyone in MCL is trying their best to excel and contribute to both academia and industry. I am impressed by how well organized MCL and USC are. The education system of USC is strong enough to build great Trojans.

3. What are your future expectations and plans in MCL?

I am looking forward to [...]

By |May 5th, 2019|News|Comments Off on Welcome New MCL Member Dr. Na Li|

MCL Research on Multi-modal Neural Machine Translation

Our long-term goal is to build intelligent systems that can perceive their visual environment, understand linguistic information, and make an accurate translation inference into another language. However, most multi-modal translation algorithms are not significantly better than an off-the-shelf text-only machine translation (MT) model. How translation models should take advantage of visual context remains an open question. From the perspective of information theory, the mutual information I(X; Y) of two random variables is never greater than I(X, Z; Y), where Z is the additional visual input. This leads us to believe that visual content can help translation systems.
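
The inequality above follows from the chain rule for mutual information. The short derivation below is a standard textbook argument added for reference, with X, Y and Z used as in the paragraph above.

```latex
\begin{align}
I(X, Z; Y) &= I(X; Y) + I(Z; Y \mid X), \\
I(Z; Y \mid X) &\ge 0
  \quad\Longrightarrow\quad
  I(X; Y) \le I(X, Z; Y).
\end{align}
```

Equality holds exactly when Z and Y are conditionally independent given X, that is, when the image adds nothing about the target sentence beyond what the source sentence already provides.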

Since the standard paradigm of multi-modal translation always treats the problem as a supervised learning task, the parallel corpus is usually sufficient to train a good translation model, and the gain from the extra image input is very limited. We argue, however, that text-only unsupervised machine translation (UMT) is fundamentally an ill-posed problem, since there are potentially many ways to associate target sentences with source sentences. Intuitively, since visual content and language are closely related, the image can play the role of a pivot “language” that bridges the two languages without a parallel corpus, making the problem “more well-defined” by reducing it to supervised learning.

We tackle unsupervised translation with a multi-modal framework that includes two sequence-to-sequence encoder-decoder models and one shared image feature extractor. We employ the transformer in both the text encoder and decoder of our model and design a novel joint attention mechanism to model the relationships between the language and visual domains.

Succinctly, our contributions are three-fold:

We formulate the multi-modal MT problem as an unsupervised setting that fits the real [...]

By |April 28th, 2019|News|Comments Off on MCL Research on Multi-modal Neural Machine Translation|

Happy New Year 2019!

In 2018, several students graduated from MCL with impressive work and started a new journey in life. Meanwhile, many new members joined our group and enjoyed a wonderful time exploring their research areas. This year, MCL members kept moving forward in research and published high-quality papers in top journals and conferences. 2018 has been a fruitful year for us.

Now we are standing at the end of 2018. We wish all members a happy new year and an even more wonderful 2019!


Image credits:

Image 1: http://www.traderstrustedacademy.com/category/happy-new-year-2019-hd-images/, cropped and resized with white padding; Image 2: http://www.hdnicewallpapers.com/Wallpaper-Download/New-Year/Happy-New-Year-Red-Rose, cropped and resized with white padding.

By |December 31st, 2018|News|Comments Off on Happy New Year 2019!|

Merry Christmas!

May your Christmas sparkle with moments of love, laughter and goodwill. And may the year ahead be full of contentment and joy. Wish all our fellows a Merry Christmas!

By |December 24th, 2018|News|Comments Off on Merry Christmas!|

MCL Research on Domain Adaptation

Trained deep learning models do not generalize well if the testing data has a different distribution from the training data. For instance, in medical image segmentation, MRI and CT scans of the same object look very different. If we simply train a model on MRI scans, it is very likely that the model will not work on CT scans. However, it is very expensive and time-consuming to manually label different data sets. Therefore, we wish to transfer the knowledge from a labeled training set to an unlabeled testing set with a different distribution. Domain adaptation can help us achieve this purpose.

Domain adaptation can be categorized into three types based on the availability of target domain data: supervised, semi-supervised, and unsupervised [1]. In supervised domain adaptation, a limited amount of labeled target domain data is available. In the semi-supervised setting, unlabeled target domain data as well as a small amount of labeled target domain data are available. In the unsupervised setting, only unlabeled target domain data is available. Unsupervised domain adaptation is an ill-posed problem since we do not have labels for the target domain data. Proper assumptions on the target domain data are important for performing unsupervised domain adaptation. In our research, we focus on unsupervised domain adaptation. Unsupervised domain adaptation can be applied to many computer vision problems, including classification, segmentation, and detection. Currently, we focus our experiments on classification.
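
As a toy illustration of the unsupervised setting, the sketch below trains a nearest-centroid classifier on labeled source data, applies it to a shifted target domain, and then aligns the mean of the unlabeled target features with the source mean. Everything here is an assumption made for illustration: the synthetic 2D data, the size of the shift, and mean alignment as the adaptation step. It is not the method used in our research.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_domain(shift, n_per_class=200):
    """Two Gaussian classes along the x-axis, translated by `shift`."""
    class0 = rng.normal(loc=[-2.0, 0.0], scale=0.5, size=(n_per_class, 2))
    class1 = rng.normal(loc=[2.0, 0.0], scale=0.5, size=(n_per_class, 2))
    features = np.vstack([class0, class1]) + shift
    labels = np.repeat([0, 1], n_per_class)
    return features, labels

def fit_centroids(features, labels):
    """Nearest-centroid classifier: one mean vector per class."""
    return np.stack([features[labels == c].mean(axis=0) for c in (0, 1)])

def predict(centroids, features):
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

source_x, source_y = make_domain(shift=np.array([0.0, 0.0]))
# Target labels are used only to measure accuracy, never for adaptation.
target_x, target_y = make_domain(shift=np.array([3.0, 0.0]))

centroids = fit_centroids(source_x, source_y)
acc_before = (predict(centroids, target_x) == target_y).mean()

# Assumed adaptation step: match the unlabeled target mean to the source mean.
aligned_x = target_x - target_x.mean(axis=0) + source_x.mean(axis=0)
acc_after = (predict(centroids, aligned_x) == target_y).mean()

print(f"target accuracy without adaptation: {acc_before:.2f}")
print(f"target accuracy after mean alignment: {acc_after:.2f}")
```

Mean alignment works here only because both domains have the same class balance and differ by a pure translation. This is an example of the kind of assumption on the target domain that the unsupervised setting depends on.
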
–By Ruiyuan Lin


[1] M. Wang and W. Deng, “Deep visual domain adaptation: A survey,” Neurocomputing, 2018.
Image Credits:
Anon, (2018). Available at: http://ai.bu.edu/visda-2018/assets/images/domain-adaptation.png [Accessed 16 Dec. 2018].
X. Peng, B. Usman, N. Kaushik, J. Hoffman, D. Wang, and K. Saenko, “VisDA: The visual domain adaptation challenge,” 2017.

By |December 16th, 2018|News|Comments Off on MCL Research on Domain Adaptation|