
MCL Research on Negative Sampling for Knowledge Graph Learning

A knowledge graph is a collection of factual triples (h, r, t), each consisting of two entities and one relation. Most knowledge graphs suffer from incompleteness: many relations between entities are missing. To predict missing links, each relation is modeled by a binary classifier that predicts whether a link between two entities exists. Negative sampling is the task of drawing negative samples efficiently and effectively from the unobserved triples to train these classifiers. The quality and quantity of the negative samples strongly affect link prediction performance.
Naive negative sampling [1] generates negative samples by corrupting one of the entities in an observed triple, e.g. (h’, r, t) or (h, r, t’). Despite its simplicity, the samples it generates carry little semantic information. For example, given a positive triple (Hulk, movie_genre, Science Fiction), naive negative sampling might generate (Hulk, movie_genre, New York City), which can never be a valid triple in any real-world scenario. Instead, we look for negative samples, such as (Hulk, movie_genre, Romance), that provide more information to the classifiers. Based on this observation, we draw corrupted entities only from the set of observed entities that have been linked by the given relation, also known as the ‘range’ of the relation. A drawback, however, is that the chance of drawing false negatives becomes high. Therefore, we further filter the drawn corrupted entities based on entity-entity co-occurrence. For example, we are unlikely to generate the negative sample (Hulk, movie_genre, Adventure), because we know from the dataset that the movie genres ‘Science Fiction’ and ‘Adventure’ co-occur frequently, so such a triple is likely a missing true fact rather than a negative. The first figure shows how the positive [...]
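Below is a minimal Python sketch of the range-constrained sampling with co-occurrence filtering described above. The triple format, the particular co-occurrence statistic (tails sharing the same head and relation), and the threshold are illustrative assumptions, not the exact statistics used in our experiments.

```python
import random
from collections import defaultdict

def build_range_and_cooccurrence(triples):
    """Collect the observed tail 'range' of each relation and simple
    tail-tail co-occurrence counts (tails sharing the same (h, r))."""
    relation_range = defaultdict(set)
    by_head_rel = defaultdict(set)
    cooccur = defaultdict(int)
    for h, r, t in triples:
        relation_range[r].add(t)
        by_head_rel[(h, r)].add(t)
    for tails in by_head_rel.values():
        for t1 in tails:
            for t2 in tails:
                if t1 != t2:
                    cooccur[(t1, t2)] += 1
    return relation_range, cooccur

def sample_negative(triple, relation_range, cooccur, max_cooccur=0, tries=100):
    """Corrupt the tail with an in-range entity whose co-occurrence with the
    true tail is at most max_cooccur, to reduce the risk of false negatives."""
    h, r, t = triple
    candidates = list(relation_range[r] - {t})
    if not candidates:
        return None
    for _ in range(tries):
        t_neg = random.choice(candidates)
        if cooccur[(t, t_neg)] <= max_cooccur:
            return (h, r, t_neg)
    return None  # no suitable in-range candidate found
```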

By |October 17th, 2021|News|Comments Off on MCL Research on Negative Sampling for Knowledge Graph Learning|

MCL Research on GAN-generated Fake Images Detection

In recent years, there has been rapid development of image synthesis techniques based on convolutional neural networks (CNNs), such as the variational auto-encoder (VAE) and generative adversarial networks (GANs). They can generate realistic images that people find hard to distinguish from real ones. Most state-of-the-art detectors of CNN-generated images are themselves built on deep neural networks. However, their performance is easily confined to the specific fake-image datasets they are trained on, and they fail to generalize well to other datasets.

We propose a new CNN-generated-image detector, named Attentive PixelHop (or A-PixelHop). A-PixelHop is designed under the assumption that it is difficult to synthesize high-quality high-frequency components in local regions. Specifically, we first select edge/texture blocks that contain significant high-frequency components and then apply multiple filter banks to them to obtain rich sets of spatial-spectral responses as features. Features from different filter banks may have different importance in deciding between fake and real; we therefore feed the features to multiple binary classifiers to obtain a set of soft decisions, and we keep only the ones with the highest discriminant ability. Finally, we develop an effective ensemble scheme that fuses the soft decisions from the more discriminant channels into the final decision. The system design is shown in Figure 1 below. Compared with CNN-based fake image detection methods, our method has low computational complexity and a small model size, high detection performance against a wide range of generative models, and mathematical transparency. Experimental results show that A-PixelHop outperforms all state-of-the-art benchmarking methods on CycleGAN-generated images (see Table 1). Furthermore, it generalizes well to unseen generative models and datasets (see Table 2).
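The two ingredients easiest to illustrate in code are the block selection and the soft-decision fusion. The sketch below is a simplified stand-in, not the released implementation: the gradient-energy criterion, block size, keep ratio, and the way channel discriminability is scored are all assumptions.

```python
import numpy as np

def select_highfreq_blocks(image, block=8, keep_ratio=0.25):
    """Keep the blocks with the largest high-frequency (gradient) energy,
    following the assumption that generator artifacts concentrate in
    edge/texture regions. `image` is a 2D grayscale array."""
    gy, gx = np.gradient(image.astype(np.float64))
    energy = gx ** 2 + gy ** 2
    blocks, scores = [], []
    H, W = image.shape
    for i in range(0, H - block + 1, block):
        for j in range(0, W - block + 1, block):
            blocks.append(image[i:i + block, j:j + block])
            scores.append(energy[i:i + block, j:j + block].sum())
    order = np.argsort(scores)[::-1]
    n_keep = max(1, int(len(blocks) * keep_ratio))
    return [blocks[k] for k in order[:n_keep]]

def fuse_soft_decisions(probs, discriminability, n_channels=10):
    """Fuse per-channel soft decisions by averaging only the most
    discriminant channels. `probs` has shape (n_channels_total, n_blocks);
    `discriminability` could be a per-channel validation AUC."""
    top = np.argsort(discriminability)[::-1][:n_channels]
    return float(probs[top].mean())
```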

By |October 3rd, 2021|News|Comments Off on MCL Research on GAN-generated Fake Images Detection|

MCL Research on Point Cloud Registration

3D registration is an important step in point cloud processing. Given a set of point cloud scans, registration tries to align the point clouds in one reference frame so as to get the complete 3D scene of the environment. In a simple case, given two point clouds, usually referred to as source and target, the registration algorithm finds an optimal 3D transformation that aligns the source with the target. The 3D transformation consists of rotation and translation.

The classical Iterative Closest Point (ICP) algorithm and its variants have been a popular choice for registration for many years. More recently, learning-based methods have been developed for point cloud registration. These methods resolve some issues that affect traditional methods, such as sensitivity to noise, outliers, differences in sampling density, and partial views. In turn, however, most of them rely on supervision in the form of a ground-truth rotation matrix and translation vector.

Inspired by the Successive Subspace Learning methodology, and the PointHop classification method in particular, we propose an unsupervised point cloud registration method called R-PointHop [1]. R-PointHop first finds a local reference frame (LRF) for every point using its nearest neighbors and determines its local attributes. Next, it learns local-to-global hierarchical features through point downsampling, neighborhood expansion, attribute construction and dimensionality reduction steps. Then, point correspondences are found using the nearest-neighbor rule in the hierarchical feature space. Finally, a subset of good correspondences is selected to estimate the 3D transformation. The use of the LRF makes the point features invariant with respect to rotation and translation, which keeps R-PointHop robust even in the presence of large rotation angles. Experiments on the ModelNet40 and the Stanford Bunny dataset demonstrate the effectiveness of R-PointHop on the 3D [...]
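Once correspondences are selected, the optimal rigid transformation has a standard closed-form solution via the singular value decomposition (the Kabsch / orthogonal Procrustes method). The sketch below shows only that final step; R-PointHop's feature extraction and correspondence selection are not reproduced here.

```python
import numpy as np

def estimate_rigid_transform(src, dst):
    """Least-squares rotation R and translation t with R @ src[i] + t ~ dst[i],
    computed from the SVD of the cross-covariance of the centered points.
    src, dst: (N, 3) arrays of corresponding points."""
    src_mean, dst_mean = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_mean).T @ (dst - dst_mean)           # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.linalg.det(Vt.T @ U.T)])  # guard against reflection
    R = Vt.T @ D @ U.T
    t = dst_mean - R @ src_mean
    return R, t
```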

By |August 15th, 2021|News|Comments Off on MCL Research on Point Cloud Registration|

MCL Research on Image Generation

An image generative model learns the distribution of image samples from a certain domain and then generates new images that follow the learned distribution. The design of image generative models involves two pipelines: analysis and generation. The former analyzes the properties of training image samples, while the latter generates new images after training is completed; only the generation unit is used at inference time. There has been a resurgence of interest in generative models due to the impressive performance achieved by deep-learning-based (DL-based) methods in general and generative adversarial networks (GANs) in particular. Yet, DL-based methods attempt to solve a nonconvex optimization problem that is difficult to explain. GAN training may suffer from gradient vanishing, convergence difficulty and mode collapse. Furthermore, its implementation demands greater computational resources due to large model sizes.

The design of GANs demands that the distributions of training and generated images be indistinguishable, which is implicitly achieved by training a generator/discriminator pair through end-to-end optimization of a cost function. In contrast with the GAN approach, we propose a novel and explainable image generation method with explicit sample distribution modeling. For image analysis, we construct fine-to-coarse spatial-spectral subspaces using the PixelHop++ architecture and obtain sample distributions in each subspace. For image generation, we reverse the process by generating samples in the coarsest subspace and gradually adding more details to them. Our solution, called GenHop (an acronym for Generative PixelHop), offers an unconditional generative model. On the MNIST and Fashion-MNIST datasets, GenHop generates visually pleasant images whose FID scores are comparable with those of DL-based generative models.
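To make the "explicit sample distribution modeling" idea concrete, here is a minimal single-subspace sketch: learn a PCA subspace, model the coefficient distribution with a Gaussian mixture, then sample and invert. This illustrates the principle only, not the GenHop pipeline itself (which works over multiple fine-to-coarse subspaces); the component counts are arbitrary.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def fit_generator(X, n_components=64, n_modes=10, seed=0):
    """Analysis: learn a subspace (PCA) and an explicit model of the
    sample distribution inside it (Gaussian mixture).
    X: (n_samples, n_pixels) flattened images with values in [0, 1]."""
    pca = PCA(n_components=n_components).fit(X)
    gmm = GaussianMixture(n_components=n_modes, random_state=seed)
    gmm.fit(pca.transform(X))
    return pca, gmm

def generate(pca, gmm, n_images=16):
    """Generation: sample subspace coefficients from the learned
    distribution and map them back to pixel space."""
    codes, _ = gmm.sample(n_images)
    return np.clip(pca.inverse_transform(codes), 0.0, 1.0)
```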

By |August 8th, 2021|News|Comments Off on MCL Research on Image Generation|

MCL Research on Large-scale 3D Indoor Scene Semantic Segmentation

3D point cloud segmentation requires understanding both the global geometric structure and the fine-grained details of each point. According to segmentation granularity, 3D point cloud segmentation methods can be classified into three categories: semantic segmentation (scene level), instance segmentation (object level) and part segmentation (part level). Our research focuses on the semantic segmentation problem. Given a point cloud, the goal of semantic segmentation is to separate it into several subsets according to the semantic meanings of its points. There are two common scene types: urban scenes and indoor scenes. Efficient semantic segmentation of large-scale 3D point clouds is a fundamental and essential capability for real-time intelligent systems, such as autonomous driving and augmented reality. We mainly aim to perform large-scale 3D indoor scene semantic segmentation more efficiently. The representative large-scale public benchmark is the indoor S3DIS dataset [1].

A key challenge is that the raw point clouds acquired by depth sensors are typically irregularly sampled, unstructured and unordered. The pioneering work PointNet [2] processes 3D point clouds directly by learning per-point features with shared multilayer perceptrons (MLPs) followed by max pooling. Follow-up works try to capture wider context information for each point. Although these approaches achieve impressive results for object recognition and semantic segmentation, almost all of them are limited to very small 3D point clouds and cannot be directly extended to larger scales without preprocessing such as block partitioning.
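The shared-MLP-plus-max-pooling idea is compact enough to sketch. The toy forward pass below (random weights, NumPy only) shows why the max-pooled descriptor is invariant to point ordering; it is not PointNet's actual architecture.

```python
import numpy as np

def shared_mlp(points, weights, biases):
    """Apply the same MLP to every point independently (shared weights).
    points: (N, d_in) -> (N, d_out)."""
    x = points
    for W, b in zip(weights, biases):
        x = np.maximum(x @ W + b, 0.0)   # per-point linear layer + ReLU
    return x

def global_feature(points, weights, biases):
    """Max pooling over points is a symmetric function, so the global
    descriptor does not change if the input points are permuted."""
    return shared_mlp(points, weights, biases).max(axis=0)

rng = np.random.default_rng(0)
weights = [rng.normal(size=(3, 64)), rng.normal(size=(64, 128))]
biases = [np.zeros(64), np.zeros(128)]
cloud = rng.normal(size=(1024, 3))
f1 = global_feature(cloud, weights, biases)
f2 = global_feature(cloud[rng.permutation(1024)], weights, biases)
assert np.allclose(f1, f2)   # order-invariant by construction
```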

We design a different data preprocessing method to learn from large-scale data directly. Each room in the dataset is treated as an input sample and fed into an unsupervised feature extractor to obtain point-wise features. The unsupervised feature extractor is developed upon our previous work PointHop [3]. It is extremely [...]

By |August 1st, 2021|News|Comments Off on MCL Research on Large-scale 3D Indoor Scene Semantic Segmentation|

MCL Research on Image Steganalysis

In image forensics, steganography and steganalysis are two sides of the same coin. Image steganography is a technique that conceals secret messages in images by slightly modifying pixel values. Correspondingly, steganalysis is the process of revealing the presence of hidden messages in images. Recently, steganalysis has focused on defeating content-adaptive steganographic schemes, for example WOW, HILL and S-UNIWARD. Fig. 1 [1] illustrates the modifications of a cover image under different steganographic methods. Content-adaptive steganography tends to place its modifications in complex texture regions, which makes embedding traces less detectable to steganalyzers.

Traditionally, hand-crafted features combined with machine-learning classifiers, such as the Spatial Rich Model and its variants, have performed well in steganalysis. With the emergence of neural networks, various CNN architectures have been adopted in the steganalysis literature. Because CNNs can extract complex statistical dependencies from high-dimensional input and learn hierarchical representations, CNN-based features usually outperform traditional hand-crafted features. However, CNN-based models suffer from long training times, large model sizes and heavy consumption of computational resources.

We would like to apply the green learning methodology to steganalysis by incorporating the Saab transform as the feature extraction module in future work. The Saab transform has shown its capability to extract high-frequency representations in a feedforward way while keeping the model lightweight.
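For readers unfamiliar with it, the Saab transform is, at its core, a data-driven PCA with a constant DC kernel and a bias that keeps responses nonnegative [2, 3]. The sketch below is a simplified rendering of that core on flattened patches; the bias selection rule and the multi-stage arrangement of the published method are omitted.

```python
import numpy as np
from sklearn.decomposition import PCA

def fit_saab_kernels(patches, n_ac):
    """Learn Saab-style kernels from flattened patches: one constant DC
    (mean) kernel plus AC kernels from PCA of the DC-removed patches.
    patches: (n_patches, d)."""
    d = patches.shape[1]
    dc = np.ones(d) / np.sqrt(d)
    ac_input = patches - np.outer(patches @ dc, dc)  # remove DC component
    pca = PCA(n_components=n_ac).fit(ac_input)
    return np.vstack([dc, pca.components_])          # (1 + n_ac, d)

def saab_transform(patches, kernels):
    """Project patches onto the kernels and add a constant bias large
    enough to make all responses nonnegative (the 'b' in Saab)."""
    resp = patches @ kernels.T
    return resp + max(0.0, -resp.min())
```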

 

References:

[1] W. Tang et al., “Adaptive steganalysis based on embedding probabilities of pixels,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 4, pp. 734–745, 2015.

[2] C.-C. J. Kuo and Y. Chen, “On data-driven Saak transform,” Journal of Visual Communication and Image Representation, vol. 50, pp. 237–246, 2018.

[3] C.-C. J. Kuo, M. Zhang, S. Li, J. [...]

By |July 25th, 2021|News|Comments Off on MCL Research on Image Steganalysis|

MCL Research on Object Tracking

Video object tracking is one of the fundamental computer vision problems and has found rich applications in video surveillance, autonomous navigation, robotic vision, etc. In the setting of online single object tracking (SOT), a tracker is given a bounding box on the target object in the first frame and then predicts boxes for all remaining frames. Online tracking methods fall into two categories: unsupervised and supervised. Traditional trackers are unsupervised, while recent deep-learning-based (DL-based) trackers demand supervision. Unsupervised trackers are attractive since they do not need the annotated boxes required to train supervised trackers. Tracker performance can be measured in terms of accuracy (higher success rate), robustness (automatic recovery from tracking loss), and speed (higher FPS).

We examine the design of an unsupervised high-performance tracker, named UHP-SOT (Unsupervised High-Performance Single Object Tracker), in this work. UHP-SOT consists of three modules: 1) appearance model update, 2) background motion modeling, and 3) trajectory-based box prediction. Previous unsupervised trackers focus on efficient and effective appearance model updates. Built upon this foundation, an unsupervised discriminative-correlation-filters-based (DCF-based) tracker, STRCF [1], is adopted by UHP-SOT as the baseline in the first module. Yet, the first module alone has shortcomings such as failure to recover from tracking loss and weak box-size adaptation. We propose background motion modeling and trajectory-based box prediction to address these problems. The baseline tracker is initialized at the first frame. For the following frames, UHP-SOT gets proposals from all three modules and chooses one of them as the final prediction based on a fusion strategy, as shown in Fig. 1. Fig. 2 shows example results on sequences from the OTB-2015 [2] benchmark. Our tracker runs [...]
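As a rough illustration of the third module, the sketch below extrapolates the next box from recent box history under a constant-velocity assumption. UHP-SOT's actual prediction rule and fusion strategy are more elaborate; the window length here is an arbitrary choice.

```python
import numpy as np

def predict_box_from_trajectory(history, k=5):
    """Constant-velocity extrapolation of a box (x, y, w, h) from the last
    k boxes: a simple physics prior that survives appearance-model failures."""
    hist = np.asarray(history[-k:], dtype=np.float64)   # (<=k, 4)
    if len(hist) < 2:
        return hist[-1]
    velocity = np.diff(hist, axis=0).mean(axis=0)       # mean per-frame change
    return hist[-1] + velocity
```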

By |July 18th, 2021|News|Comments Off on MCL Research on Object Tracking|

MCL Research on Object Detection

Object detection is one of the most essential and challenging tasks in computer vision. While most state-of-the-art object detection methods adopt an end-to-end deep neural network, we aim at an interpretable framework that has low complexity, high training efficiency and high performance. The method is built upon the PixelHop framework, as shown in Fig. 1. The term “hop” denotes the neighborhood of a pixel. PixelHop conducts spectral analysis on neighborhoods of different sizes centered on a pixel through a sequence of cascaded dimension-reduction units. The neighborhoods of an object contain representative patterns of the object, such as salient contours; as a result, they have distinctive spectral signatures at the scale that matches the object size. Bounding boxes and class labels can thus be predicted by supervised learning with the Saab coefficients at the proper hops as the representations.

Our method takes YOLO’s problem formulation as a reference and combines three major modules to perform the object detection task. As shown in Fig. 1, with proper settings of PixelHop, we divide all objects into three different scales, i.e. large (shown in blue), medium (shown in green) and small (shown in red), and make hops with the proper receptive field (RF) responsible for proposing anchor boxes at the corresponding scales (compare with the “cat” example). With the Saab coefficients at each hop, we propose anchor boxes at each spatial location; for each anchor box, we train Module 1 to predict its confidence score, Module 2 to predict its class label, and Module 3 to predict its box regression. Finally, for each image, our model first proposes potential boxes and then uses non-maximum suppression based on the confidence scores to keep the best proposed [...]
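Non-maximum suppression itself is standard and easy to state precisely. The sketch below is the usual greedy variant on (x1, y1, x2, y2) boxes; the IoU threshold is a typical default, not one tuned for our detector.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as (x1, y1, x2, y2)."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression: repeatedly keep the highest-scoring
    box and drop the remaining boxes that overlap it too much."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        order = rest[iou(boxes[i], boxes[rest]) <= iou_threshold]
    return keep
```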

By |July 11th, 2021|News|Comments Off on MCL Research on Object Detection|

MCL Research on SSL-based Image Classification

Image classification has been studied for many years as a fundamental problem in computer vision. With the development of convolutional neural networks (CNNs) and the availability of larger-scale datasets, we have seen rapid success in deep-learning-based classification for both low- and high-resolution images. Although effective, deep learning has one major challenge: its underlying mechanism is not transparent.

Inspired by deep learning, the successive subspace learning (SSL) methodology was proposed by Kuo et al. in a sequence of papers. Different from deep learning, SSL-based methods learn feature representations in an unsupervised feedforward manner using multi-stage principal component analysis (PCA). Joint spatial-spectral representations are obtained at different scales through multi-stage transforms. Three variants of the PCA transform were developed: the Saak transform [1], the Saab transform [2], and the channel-wise (c/w) Saab transform [4]. Two SSL-based image classification pipelines, PixelHop [3] and PixelHop++ [4], were designed based on the Saab transform and the c/w Saab transform, respectively. Both follow the traditional pattern recognition paradigm and partition the classification problem into two cascaded modules: 1) feature extraction and 2) classification. Every step in PixelHop/PixelHop++ is explainable, and the whole solution is mathematically transparent.
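A minimal sketch of the multi-stage idea, using plain PCA on patch neighborhoods as a stand-in for the Saab transform; the real PixelHop/PixelHop++ pipelines add a bias term, pooling and channel-wise decomposition that are omitted here, and the patch size and channel counts below are arbitrary.

```python
import numpy as np
from sklearn.decomposition import PCA

def gather_neighborhoods(x, patch=3):
    """Collect overlapping patch neighborhoods ('hops') from (N, H, W, C)
    feature maps and flatten each neighborhood into a vector."""
    N, H, W, C = x.shape
    cols = []
    for i in range(H - patch + 1):
        for j in range(W - patch + 1):
            cols.append(x[:, i:i + patch, j:j + patch, :].reshape(N, -1))
    Hp, Wp = H - patch + 1, W - patch + 1
    return np.stack(cols, axis=1).reshape(N, Hp, Wp, -1)

def multistage_pca_features(images, stages=2, keep=16):
    """Unsupervised feedforward feature learning: at each stage, gather
    neighborhoods and reduce their dimension with PCA."""
    x = images  # (N, H, W, C)
    for _ in range(stages):
        nb = gather_neighborhoods(x)
        N, Hp, Wp, D = nb.shape
        pca = PCA(n_components=min(keep, D)).fit(nb.reshape(-1, D))
        x = pca.transform(nb.reshape(-1, D)).reshape(N, Hp, Wp, -1)
    return x
```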

To further improve performance, we propose an SSL-based two-stage sequential image classification pipeline, named E-PixelHop. The motivation is that, in a multi-class classification problem, it is easier to distinguish between dissimilar classes than between similar ones. For example, one can distinguish cats from cars more easily than cats from dogs. Along this line, one can build a hierarchical relation among multiple classes based on their semantic meaning to improve classification performance. Instead of manually constructing the hierarchical learning structure before classification, E-PixelHop resolves [...]

By |July 4th, 2021|News|Comments Off on MCL Research on SSL-based Image Classification|

MCL Research on Texture Synthesis

Automatic synthesis of visually pleasant texture that resembles exemplary texture finds applications in computer graphics. Texture synthesis has been studied for several decades, since it is also of theoretical interest in texture analysis and modeling. Texture can be synthesized pixel-by-pixel or patch-by-patch based on an exemplary pattern. In pixel-based synthesis, a pixel conditioned on its square neighborhood is synthesized using a conditional probability estimated by statistical methods. Generally, patch-based texture synthesis yields higher quality than pixel-based texture synthesis. Yet, searching the whole image for patch-based synthesis is extremely slow. To speed up the process, small patches of the exemplary texture can be stitched together to form a larger region, as sketched below. Although these methods can produce texture of higher quality, the diversity of the produced textures is limited. Besides texture synthesis in the spatial domain, texture images can be transformed from the spatial domain to the spectral domain with certain filters (or kernels), thus exploiting the statistical correlation of filter responses for texture synthesis. Commonly used kernels include the Gabor filters and the steerable pyramid filter banks.
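Reduced to its simplest form, the stitching idea looks like the sketch below: tile the output with patches drawn at random from the exemplar. Real quilting methods additionally overlap neighboring patches and cut along a minimum-error seam to hide the borders, which is omitted here.

```python
import numpy as np

def tile_texture(exemplar, out_size, patch=32, seed=0):
    """Naive patch-based synthesis: fill the output by tiling patches taken
    at random positions in the exemplar (no overlap or seam optimization)."""
    rng = np.random.default_rng(seed)
    H, W = exemplar.shape[:2]
    out = np.zeros((out_size, out_size) + exemplar.shape[2:], exemplar.dtype)
    for i in range(0, out_size, patch):
        for j in range(0, out_size, patch):
            h = min(patch, out_size - i)
            w = min(patch, out_size - j)
            y = rng.integers(0, H - h + 1)
            x = rng.integers(0, W - w + 1)
            out[i:i + h, j:j + w] = exemplar[y:y + h, x:x + w]
    return out
```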

We have witnessed amazing quality improvement in synthesized texture over the last five to six years due to the resurgence of neural networks. Texture synthesis based on deep learning (DL), such as convolutional neural networks (CNNs) and generative adversarial networks (GANs), yields visually pleasing results. DL-based methods learn transform kernels from numerous training data through end-to-end optimization. However, these methods have two main shortcomings: 1) a lack of mathematical transparency and 2) a higher training and inference complexity. To address these drawbacks, we investigate a non-parametric and interpretable texture synthesis method, called NITES [1].

NITES consists of three steps. First, it analyzes the exemplary texture to obtain its joint spatial-spectral [...]

By |June 27th, 2021|News|Comments Off on MCL Research on Texture Synthesis|