News

MCL Research on Point Cloud Compression

Point cloud compression (PCC) has received a lot of attention in recent years because of its wide range of applications, such as virtual reality (VR), augmented reality (AR), and mixed reality (MR). Video-based PCC (V-PCC) and geometry-based PCC (G-PCC) are two distinct technologies developed by MPEG 3DG [1][2]. Deep-learning-based (DL-based) PCC is a strong competitor to them. Most DL methods generalize the DL-based image coding pipeline to point cloud data [3][4]. They outperform G-PCC in the current MPEG 3DG standard for dense point cloud compression. Yet their performance is still inferior to that of V-PCC in the coding of dynamic point clouds.

We propose to design a learning-based PCC solution that could outperform those DL-based methods with lower complexity and less memory consumption. Our method uses geometry projection to generate 2D images and applies a vector quantization-based 2D image codec to compress the projected maps. For a point cloud sequence, the projection is done in three steps. First, split the sequence into blocks by octree partitioning. Second, project each 3D block onto a plane and pack all the planes into a map. Third, encode and decode the 2D map and reconstruct the 3D point cloud sequence. These steps are shown in Fig. 1. We apply non-uniform sampling to the projected planes and pack all the planes to generate one depth map and one texture map in the reconstruction process. The two maps are shown in Fig. 2.
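The first two steps can be illustrated with a minimal sketch. It is not the MCL implementation: it assumes points normalized to the unit cube, replaces the octree with the equivalent uniform grid at a fixed depth, and projects each block orthographically along the z axis. The function names `partition_into_blocks` and `project_block_to_depth` are hypothetical, and the non-uniform sampling and packing steps are omitted.

```python
import numpy as np

def partition_into_blocks(points, depth):
    """Group points by the octree leaf (at a fixed depth) that contains them.

    Assumption: points lie in the unit cube. A full octree of the given depth
    is then equivalent to a uniform grid with 2**depth cells per axis.
    """
    cells = 2 ** depth
    idx = np.clip((points * cells).astype(int), 0, cells - 1)
    blocks = {}
    for key, p in zip(map(tuple, idx), points):
        blocks.setdefault(key, []).append(p)
    return {k: np.array(v) for k, v in blocks.items()}

def project_block_to_depth(block_points, origin, size, res):
    """Orthographic projection of one block onto its x-y plane (assumed direction).

    Returns a (res, res) depth image holding the smallest block-local z value
    per pixel; np.inf marks pixels that no point falls into.
    """
    local = (block_points - origin) / size            # block-local coords in [0, 1]
    uv = np.clip((local[:, :2] * res).astype(int), 0, res - 1)
    depth = np.full((res, res), np.inf)
    # Unbuffered minimum so that several points hitting one pixel keep the nearest.
    np.minimum.at(depth, (uv[:, 1], uv[:, 0]), local[:, 2])
    return depth
```

Each per-block depth image could then be tiled into one large 2D map and handed to an image or video codec. A texture map would be built the same way from the colors of the points that win each pixel.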

Presently, we utilize the x264/x265 codec to code the maps. In the future, we will adopt a vector quantization-based image codec to compress the two maps.

— Qingyang Zhou

Reference

[1] S. Schwarz, M. Preda, V. Baroncini, M. Budagavi, P. Cesar, P. A. Chou, R. A. Cohen, M. Krivokuća, S. Lasserre, Z. Li et [...]

February 6th, 2022 | News

MCL Research Interest in Blind Video Quality Assessment

Blind video quality assessment (BVQA) aims to predict perceptual quality based solely on the received video. BVQA is essential in applications where source videos are unavailable, such as assessing the quality of user-generated content and video conferencing. Early BVQA models were distortion-specific and focused mainly on transmission- and compression-related artifacts. More recent work considered spatial and temporal distortions jointly and trained a regression model accordingly. Although these models achieve good performance on datasets with synthetic distortions, they do not work well on user-generated content datasets. DL-based BVQA solutions were proposed recently, and they outperform all previous BVQA solutions.

We propose to design a lightweight and interpretable BVQA solution that is suitable for mobile and edge devices while remaining competitive with DL models in performance. We first need to select a basic processing unit for quality assessment. A full video sequence can be decomposed into smaller units in three ways. First, crop out a fixed spatial location to generate a spatial video (sv) patch. Second, crop out a specific temporal duration with full spatial information as a temporal video (tv) patch. Third, crop out a partial spatial region over a small number of frames as a spatial-temporal video (stv) patch. They are illustrated in Fig. 1. We will adopt the spatial-temporal units, referred to below as STCs, as the basic units of the proposed BVQA method. We will give each STC a BVQA score and then ensemble the individual scores to generate the final score of the full video. The diagram is shown in Fig. 2.
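The three ways of cropping a video can be written as array slices. The sketch below assumes a grayscale video stored as a NumPy array of shape (frames, height, width). The function names are hypothetical, and the plain mean in `pool_scores` is only a placeholder, since the post does not say how the per-unit scores are ensembled.

```python
import numpy as np

def spatial_patch(video, y, x, h, w):
    """Fixed spatial window, kept over all frames (sv patch)."""
    return video[:, y:y + h, x:x + w]

def temporal_patch(video, t, n):
    """Full frames, kept over a run of n frames starting at frame t (tv patch)."""
    return video[t:t + n]

def spatial_temporal_patch(video, t, n, y, x, h, w):
    """Partial spatial window over a small number of frames (stv patch)."""
    return video[t:t + n, y:y + h, x:x + w]

def pool_scores(scores):
    """Combine per-unit scores into one video-level score.

    Assumption: a plain mean stands in for the unspecified ensemble step.
    """
    return float(np.mean(scores))
```

For a 30-frame video of size 64 x 48, `spatial_temporal_patch(video, 10, 5, 8, 4, 16, 16)` would return a 5 x 16 x 16 block, which is far smaller than either of the other two unit types.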

After the STC features are extracted, we will train a classifier on each output response and then ensemble their decision scores to yield the BVQA score for one STC. For the model training, we [...]

January 31st, 2022 | News
Professor Kuo Appointed as EiC for APSIPA Trans. on Signal and Information Processing

MCL Director, Professor C.-C. Jay Kuo, has been appointed Editor-in-Chief of the APSIPA Transactions on Signal and Information Processing (ATSIP) by the APSIPA Board of Governors. His two-year term starts on January 1, 2022.
ATSIP was established in 2014. This is the 9th year for the journal. Professor Antonio Ortega of the University of Southern California served as its inaugural EiC from 2014 to 2017, and Professor Tatsuya Kawahara of Kyoto University was its second EiC from 2018 to 2021. Professor Kuo expressed his deep gratitude to both Professor Ortega and Professor Kawahara for their contributions in laying an excellent foundation for the journal. The photo was taken on Dec. 19, 2019, when Professor Kuo and his wife visited Professor Tatsuya Kawahara at Kyoto University.
ATSIP is an open-access e-only journal in partnership with the NOW Publisher. It serves as an international forum for signal and information processing researchers across a broad spectrum of research, ranging from traditional modalities of signal processing to emerging areas where either (i) processing reaches higher semantic levels (e.g., from speech/image recognition to multimodal human behavior recognition) or (ii) processing is meant to extract information from datasets that are not traditionally considered signals (e.g., mining of Internet or sensor information). Papers published in ATSIP are indexed by Scopus, EI and ESCI, searchable on the Web of Science, and included in the IEEE Xplore database.

January 17th, 2022 | News
MCL Research Interest in Syntactic Structure Aware Sentence Similarity Modeling

Text similarity modeling plays an important role in a variety of natural language processing (NLP) applications, such as information retrieval, text clustering, and plagiarism detection. Moreover, it can serve as an automatic evaluation metric in natural language generation tasks such as machine translation and image captioning, so that expensive and time-consuming human labeling can be avoided.

Word Mover’s Distance (WMD) [1] is an efficient model for measuring the semantic distance between two texts. In WMD, word embeddings, which learn semantically meaningful representations of words, are incorporated into the earth mover’s distance. The distance between two texts A and B is the minimum cumulative distance that all words in text A need to travel to match text B exactly.
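Written as an optimization problem, the distance described above is usually stated as the transport problem below. The notation is chosen here for illustration: texts A and B contain n and m distinct words with embeddings x_i and y_j, d^A and d^B are their normalized word-frequency vectors, and T_ij is the flow sent from word i of A to word j of B.

```latex
\begin{aligned}
\mathrm{WMD}(A, B) \;=\; \min_{T \ge 0} \;& \sum_{i=1}^{n} \sum_{j=1}^{m} T_{ij}\, c(i, j),
\qquad c(i, j) = \lVert x_i - y_j \rVert_2, \\
\text{subject to} \;& \sum_{j=1}^{m} T_{ij} = d^{A}_{i} \quad \text{for all } i, \\
& \sum_{i=1}^{n} T_{ij} = d^{B}_{j} \quad \text{for all } j .
\end{aligned}
```

The matrix of costs c(i, j) and the marginals d^A and d^B are the only inputs, so they are the two places where additional information about a sentence can enter the model.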

In our work, we try to incorporate syntactic parsing, which brings meaningful structural information, into WMD. Two components mainly control the flow in WMD: the distance matrix and the flow assigned to each word. First, to compute the distance matrix, the original WMD compares only an individual pair of word embeddings to measure the distance between words and does not consider other information in the sentence. To measure the distance between words better, we first form sub-tree structures from the dependency parsing tree. Instead of comparing only the similarity of the word embeddings, we also compare the similarity of the sub-trees that contain the words. Second, a word’s flow can be regarded as the word’s importance. If more flow is given to important words, most of the flow is transported between important words, so the total transportation cost is decided mainly by the similarity of important words. We currently use each word’s dependency relation in the parsing tree to assign its importance weight. In the future, we [...]

January 10th, 2022 | News

Happy New Year!

At the beginning of 2022, we wish all MCL members an even more wonderful year with everlasting passion and courage!

 

Image credit:

https://thenewspocket.com/100-best-happy-new-year-wishes-messages-quotes-2022/

January 2nd, 2022 | News

Merry Christmas

2021 has been a fruitful year for MCL. Some members graduated with impressive research work and began a new chapter of life. Some new students joined the MCL family and explored the joy of research. MCL members have put great effort into their research and published quality papers in top journals and conferences. We appreciate all of these efforts! We wish all MCL members a merry Christmas!

 

Image credits:

Image 1: https://www.freepik.com/free-vector/realistic-christmas-banner-with-branches-red-background_11210304.htm#query=merry%20christmas&position=42&from_view=keyword

Image 2: https://www.backyardcamp.ca/activities/gingerbread-christmas-cookie-trees

December 26th, 2021 | News

MCL Research on MRI Imaging (MRI Lung Ventilation)

Functional lung imaging is of great importance for the diagnosis and evaluation of lung diseases such as chronic obstructive pulmonary disease (COPD), asthma, and cystic fibrosis. Conventional methods often use inhaled hyperpolarized gas or 100% oxygen as contrast agents. In recent years, high-performance low-field systems have shown great advantages for 1H lung MRI due to reduced susceptibility effects and improved vessel conspicuity. These advantages make it possible to detect regional volume changes throughout the respiratory cycle.
Recently, in a collaboration between MCL and the Dynamic Imaging Science Center (DISC), the feasibility of image-based regional lung ventilation assessment from real-time low-field MRI at 0.55T was studied, without requiring contrast agents, repetition, or breath holds. A time series of MR images with 355 ms/frame temporal resolution, 1.64 x 1.64 mm² spatial resolution, and 15 mm slice thickness captures several consecutive respiratory cycles, which consist of different respiratory states from exhalation to inhalation. To resolve the regional lung ventilation from these images, an unsupervised non-rigid image registration is applied to register the lungs in different respiratory states to the end-of-exhalation state. The deformation field is extracted to study the regional ventilation. Specifically, a data-driven binarization algorithm is first applied to segment the lung parenchyma area and the vessels separately. Frame-by-frame salient point extraction and matching are then performed between adjacent frames to form pairs of landmarks. Finally, Jacobian determinant (JD) maps are generated from the deformation fields calculated by a landmark-based B-spline registration.
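To make the last step concrete, the sketch below computes a JD map from a dense 2D displacement field using finite differences. It is an illustration only, not the study's pipeline: the field is assumed to be already available (in the study it comes from the B-spline registration), the array layout and the name `jacobian_determinant_2d` are assumptions, and displacements are assumed to share the units of `spacing`.

```python
import numpy as np

def jacobian_determinant_2d(disp, spacing=(1.0, 1.0)):
    """Jacobian determinant of the mapping x -> x + u(x) on a 2D grid.

    disp has shape (H, W, 2): disp[..., 0] is the displacement along rows (y)
    and disp[..., 1] along columns (x). spacing is the pixel size (dy, dx).
    """
    duy_dy, duy_dx = np.gradient(disp[..., 0], spacing[0], spacing[1])
    dux_dy, dux_dx = np.gradient(disp[..., 1], spacing[0], spacing[1])
    # det(I + grad u) for a 2 x 2 Jacobian
    return (1.0 + duy_dy) * (1.0 + dux_dx) - duy_dx * dux_dy
```

Values above 1 indicate local expansion relative to the reference frame, values below 1 indicate local contraction, and a value of exactly 1 means no local area change. For the acquisition described above, `spacing=(1.64, 1.64)` would apply if displacements are expressed in millimeters.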
In the study, regional lung ventilation is analyzed for three breathing patterns: free breathing, deep breathing, and forced exhalation. The motion and volume change for deep breathing and forced exhalation are found to be larger than for free breathing. [...]

December 19th, 2021 | News

MCL Research on Unsupervised Object Tracking

Video object tracking is one of the fundamental problems in computer vision. It finds rich applications in video surveillance, autonomous navigation, robotic vision, etc. Given a bounding box on the target object in the first frame, a tracker has to predict object box locations and sizes for all remaining frames in online single object tracking (SOT). The performance of a tracker is measured by accuracy (a higher success rate), robustness (automatic recovery from tracking loss), computational complexity, and speed (a higher number of frames per second, or FPS).

Online trackers can be categorized into supervised and unsupervised ones. Supervised trackers based on deep learning (DL) have dominated the SOT field in recent years. DL trackers offer state-of-the-art tracking accuracy, but they have some limitations. First, a large number of annotated tracking video clips are needed for training, and annotation is a laborious and costly task. Second, they demand large memory space to store the parameters of deep networks because of their large model sizes. Third, their high computational power requirement hinders their application on resource-limited devices such as drones or mobile phones. Advanced unsupervised SOT methods often use discriminative correlation filters (DCFs), which can run fast on a CPU using the Fast Fourier Transform and have very small model sizes. However, there is a significant performance gap between unsupervised DCF trackers and supervised DL trackers. It is attributed to the limitations of DCF trackers, such as failure to recover from tracking loss and inflexibility in object box adaptation.
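The reason DCFs are fast and compact can be seen in a minimal single-channel filter in the style of MOSSE. This is a deliberately simplified stand-in, not STRCF or any MCL tracker: it omits multi-channel features, windowing, regularization terms, and online updates, and the function names and the default values of `sigma` and `lam` are assumptions.

```python
import numpy as np

def gaussian_response(shape, center, sigma=2.0):
    """Desired correlation output: a Gaussian peak at the target center."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((ys - center[0]) ** 2 + (xs - center[1]) ** 2) / (2 * sigma ** 2))

def train_filter(patch, target, lam=1e-2):
    """Closed-form ridge-regression filter, solved per frequency bin."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(target)
    # Conjugate filter H* = G F* / (F F* + lambda); element-wise, no matrix inverse.
    return G * np.conj(F) / (F * np.conj(F) + lam)

def detect(filter_conj, patch):
    """Correlate a new patch with the filter and return the peak (row, col)."""
    response = np.real(np.fft.ifft2(np.fft.fft2(patch) * filter_conj))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return tuple(int(i) for i in peak)
```

Training and detection each cost a few FFTs of the patch size, and the whole model is one complex array of that size. If the patch content shifts circularly by some offset, the response peak shifts by the same offset, which is how the new target position is read out.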

To address the above issues with a green solution, previously we proposed UHP-SOT (Unsupervised High-Performance Single Object Tracker) which used STRCF as the baseline and incorporated two new modules – background motion modeling and trajectory-based object box prediction. Our new work UHP-SOT++ is an [...]

December 12th, 2021 | News

MCL Research on Point Cloud Odometry

Odometry is the process of using motion sensors to estimate the change in position of an object over time. It has been widely studied in the context of mobile robots, autonomous vehicles, drones, and other moving agents. Traditional odometry based on motion sensors such as inertial measurement units (IMUs) and magnetometers is prone to error accumulation over time, known as odometry drift. Visual odometry makes use of camera images and/or point cloud scans collected over time to determine the position and orientation of the moving object. Several visual odometry systems that integrate monocular vision, stereo vision, point clouds, and IMUs have been developed for object localization.
We propose an unsupervised learning method for visual odometry from LiDAR point clouds called Green Point Cloud Odometry (GPCO). GPCO follows the traditional scan-matching-based approach and solves the odometry problem by incrementally estimating the motion between two consecutive point cloud scans. The GPCO method can be divided into four steps. First, a geometry-aware sampling method selects a small subset of points from the input point clouds. To do so, the eigen features of points in a local neighborhood are considered, followed by random point sampling. Next, the 3D view surrounding the moving object is partitioned into four parts representing the front, rear, left, and right-side views. This view-partitioning step divides the sampled points into four disjoint sets. The features of the sampled points are derived using the PointHop++ [1] method. Matching points between two consecutive point clouds are found in each view using the nearest-neighbor rule in the feature space. Finally, the motion between the two scans is estimated using Singular Value Decomposition (SVD). The motion is updated to the estimates from previous times and the process [...]
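The SVD step can be sketched with the standard least-squares rigid alignment of matched point pairs, often called the Kabsch method. The sketch assumes the correspondences are already given as two equally sized arrays and makes no claim about GPCO's exact formulation. The names `estimate_rigid_motion` and `accumulate_pose`, and the convention that the estimated motion maps points of the current scan into the previous scan's frame, are assumptions made here.

```python
import numpy as np

def estimate_rigid_motion(src, dst):
    """Least-squares R, t such that dst is approximately src @ R.T + t.

    src and dst are (N, 3) arrays of matched 3D points.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3 x 3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

def accumulate_pose(pose, R, t):
    """Compose a 4 x 4 pose with the latest scan-to-scan motion.

    Assumption: (R, t) maps points of the current scan into the previous
    scan's frame, so the new pose is the old pose times this transform.
    """
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return pose @ T
```

Starting from `np.eye(4)` and calling `accumulate_pose` once per scan pair yields the trajectory. Small errors in each pairwise estimate compound through this product, which is the drift that odometry methods try to limit.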

December 5th, 2021 | News

MCL Thanksgiving Luncheon

MCL held its annual Thanksgiving Luncheon at Shiki Seafood Buffet on November 25, 2021. The Thanksgiving Luncheon has been an MCL tradition for more than 20 years. It’s a good chance for the whole group to gather and have lunch together as a warm and happy family. All of us enjoyed the delicious food and a wonderful time chatting with each other. It’s also a good opportunity to rest after a busy semester and connect with other people during this hard time. We thank Professor Kuo for hosting this event and Min for organizing it.

Happy Thanksgiving to everyone!

November 28th, 2021 | News