News

Congratulations to Qingyang Zhou for Passing His Defense!

Congratulations to Qingyang Zhou on successfully defending his dissertation today! His thesis, titled “Advanced Techniques for Point Cloud Quality Assessment, Surface Reconstruction, and Coding,” was reviewed by his committee, chaired by Jay Kuo, with Antonio Ortega as a member and Stefanos Nikolaidis serving as the outside examiner. The committee praised the rigor and excellence of Qingyang’s research. Here is a summary of his thesis:

Point clouds (PCs) play a critical role in 3D vision and graphics applications such as AR/VR, digital preservation, and medical imaging. However, their large data volume, vulnerability to quality degradation, and the need for surface reconstruction from sparse, noisy points pose long-standing challenges. In this work, we address these issues through three research thrusts: compression, quality assessment, and surface reconstruction. First, we propose GPCGC, a low-complexity geometry compression method that leverages vector quantization and block-level rate-distortion modeling to achieve fast encoding/decoding with competitive performance. Second, we introduce GPQA, an interpretable, saliency-guided quality assessment framework that projects local 3D structures into 2D patches and uses a green machine learning model to predict perceptual quality under both full- and no-reference settings. Third, we develop two surface reconstruction methods: GPSR, an unsupervised diffusion-based approach, and LPSR, a supervised variant under the green learning paradigm. Together, these contributions advance the efficiency, interpretability, and robustness of the point cloud processing pipeline, enabling practical deployment in resource-constrained environments.

He generously shared his experiences in the MCL Lab with us:

My PhD journey at the University of Southern California’s Media Communications Lab (MCL) has been one of the most rewarding chapters of my life. Under the guidance of Professor C.-C. Jay Kuo, and with the generous support of MCL alumni and lab members, I not only deepened my technical knowledge but [...]

By |May 11th, 2025|News|Comments Off on Congratulations to Qingyang Zhou  for Passing His Defense!|

MCL Research on Prostate Segmentation

Automatic segmentation of the prostate is a crucial step in the computer-aided diagnosis of prostate cancer and in treatment planning. Current methods for prostate segmentation primarily rely on deep learning models with neural networks. However, these models tend to be large and lack transparency, which is essential for physicians. We proposed a new data-driven 3D prostate segmentation method on MRI named Green U-shaped Learning (GUSL). Different from deep learning based methods, GUSL employs a feed-forward system that utilizes successive subspace learning (SSL).

To keep enough detailed information on a dataset with a large image size, we propose a cascading model in two stages, as shown in Figure 1: (1) segmentation on downsampled images and (2) segmentation on the cropped patches. GUSL consists of three main modules, as shown in Figure 2: representation learning, feature learning, and residual correction. All modules are applied at multiple levels with varying resolutions. We achieve fine-to-coarse unsupervised representation learning using cascaded VoxelHop units, as well as coarse-to-fine segmentation through feature learning and residual correction. GUSL performs competitively with deep learning baseline models while keeping a smaller model size and lower complexity, and it offers transparency for doctors.

By |May 4th, 2025|News|Comments Off on MCL Research on Prostate Segmentation|

MCL Research on Green Image Super-resolution

Single image super-resolution (SISR) is an intensively studied topic in image processing. It aims at recovering a high-resolution (HR) image from its low-resolution (LR) counterpart. SISR finds wide real-world applications such as remote sensing, medical imaging, and biometric identification. Besides, it attracts attention due to its connection with other tasks (e.g., image registration, compression, and synthesis). 

The main challenge of SISR is that it is an ill-posed problem. We have recently been developing a solution that provides reasonable performance at effectively reduced complexity. We propose a green U-shape method that progressively enhances the LR images from global structure to local details, with increasing spatial sizes and conditional residual estimation.

By |April 27th, 2025|News|Comments Off on MCL Research on Green Image Super-resolution|

MCL Research on Nuclei Segmentation

Nuclei Segmentation is a key step in understanding the distribution, size, and shape of nuclei in the underlying tissue. Traditionally, pathologists view histology slides under the microscope to analyze the nuclear structure. However, this process is time-consuming and is prone to inter-reader variability. An AI-based segmentation algorithm can aid pathologists in cancer detection and prognosis, and help speed up the cancer screening procedure. 

While there are several deep learning methods addressing this problem, we propose a Green Nuclei Segmentation algorithm that uses a simple, reliable, and modular approach to delineate nuclei in a histopathology slide. Green U-shaped Learning (GUSL) is a four-level pipeline that involves three main modules: representation learning using PixelHop, feature selection using RFT, and supervised learning using an XGBoost regressor. The different levels look at the histopathology image at multiple resolutions, so that the nuclei are segmented in a coarse-to-fine manner. At each level, we aim to correct the previous level’s predictions through residual correction. While this model gives good results, we can further improve the performance by refining the boundary regions to yield precise nuclei contours.

By |April 20th, 2025|News|Comments Off on MCL Research on Nuclei Segmentation|

MCL Research on Seismic Data Processing

Seismic waves are mechanical waves generated by earthquakes that travel through the Earth. Body waves consist of fast, compressional primary (P) waves and slower, shear secondary (S) waves. With large datasets of seismogram recordings, researchers train machine learning models to automatically pinpoint P‑ and S‑wave arrival times. This is essential for real‑time seismic monitoring and early warning systems.

Our Green Learning framework streamlines this process while boosting interpretability. We begin by slicing raw seismic recordings into overlapping three‑channel windows and assigning each a continuous pseudo‑label (ranging from 0 to 1) that reflects how closely it is aligned with a P‑ or S‑wave onset. Treating these windows as 3‑channel images, we extract multi‑scale features via multiple Saab transform layers and select the most powerful features at each scale using Relevant Feature Test (RFT) modules. An XGBoost regressor then produces a continuous output signal, from which P‑ and S‑wave arrivals are recovered by peak detection. Compared to the state-of-the-art deep learning model EQTransformer, this model uses far fewer parameters,

By |April 13th, 2025|News|Comments Off on MCL Research on Seismic Data Processing|

MCL Research on Image Denoising

Image denoising is a computer vision technique that removes noise from images while preserving essential structures and textures. It plays a critical role in applications such as photography enhancement, medical imaging, and remote sensing.

To address such problems, we have employed GUSL, a Green Learning-based pipeline tailored for image denoising. Noisy images are resized to multiple resolutions, and Green Learning techniques such as PixelHop, RFT, and LNT are applied at each level to extract features independently. Each level progressively refines the denoising result by correcting the residuals from the previous level. While this approach yields promising results, further refinement is needed to enhance performance in smooth and texture-rich regions.
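The level-by-level residual correction described above can be illustrated with a minimal sketch. It only shows how residual targets across resolutions fit together, under assumptions that are not from the original post: 2x2 average pooling for downsampling, nearest-neighbor upsampling, and three levels. The feature extraction (PixelHop, RFT, LNT) and the regressor that would predict each residual are not modeled, and the function names are hypothetical.

```python
import numpy as np

def downsample2(x):
    """Halve each spatial dimension by 2x2 average pooling (assumed resizing)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample2(x):
    """Double each spatial dimension by nearest-neighbor repetition (assumed)."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def residual_targets(clean, levels=3):
    """Return the coarsest image and the residual each finer level must correct."""
    pyramid = [clean]
    for _ in range(levels - 1):
        pyramid.append(downsample2(pyramid[-1]))
    pyramid = pyramid[::-1]  # coarsest first
    residuals = [fine - upsample2(coarse)
                 for coarse, fine in zip(pyramid[:-1], pyramid[1:])]
    return pyramid[0], residuals

def reconstruct(coarsest, residuals):
    """Refine coarse to fine: upsample the previous result, then add the residual."""
    out = coarsest
    for r in residuals:
        out = upsample2(out) + r
    return out

rng = np.random.default_rng(0)
clean = rng.random((8, 8))
coarsest, residuals = residual_targets(clean, levels=3)
restored = reconstruct(coarsest, residuals)
```

In this sketch the residuals are computed from the clean image, so the reconstruction is exact. In a trained pipeline each level would predict its residual from features of the noisy input, and the quality of the result would depend on those predictions.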

By |April 6th, 2025|News|Comments Off on MCL Research on Image Denoising|

MCL Research on Video-Text Retrieval

Image-text retrieval is a fundamental task in image understanding. This task aims to retrieve the most relevant information from another modality based on the given image or text. Recent approaches focus on training large neural networks to bridge the gap between visual and textual domains. However, these models are computationally expensive and not explainable regarding how the data from different modalities are aligned. End-to-end optimized models, such as large neural networks, can only output the final results, making it difficult for humans to understand the reasoning behind the model’s predictions.

Hence, we propose a green learning solution, Green Multi-Modal Alignment (GMA), for computational efficiency and mathematical transparency. We reduce the number of trainable parameters to 3% of that required to fine-tune the whole image and text encoders. The model is composed of three modules: (1) Clustering, (2) Feature Selection, and (3) Alignment. The clustering process divides the whole dataset into subsets by grouping similar image and text pairs, reducing the divergence among training samples. The second module, feature selection, reduces the feature dimension and mitigates the computational requirements. The importance of each feature can be interpreted as statistical evidence supporting our reasoning. The alignment is conducted by linear projection, which guarantees the inverse projection for retrieval in both directions, namely image-to-text and text-to-image retrieval.
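As a rough illustration of alignment by linear projection, the sketch below fits a least-squares map from image features to text features and ranks candidates by cosine similarity in both directions. This is an assumption-laden sketch, not the GMA implementation: the clustering and feature selection modules are omitted, the least-squares fit and cosine scoring are illustrative choices, and all function names are hypothetical.

```python
import numpy as np

def fit_linear_alignment(img_feats, txt_feats):
    """Least-squares W such that img_feats @ W approximates txt_feats."""
    W, *_ = np.linalg.lstsq(img_feats, txt_feats, rcond=None)
    return W

def cosine_scores(a, b):
    """Pairwise cosine similarity between the rows of a and the rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def retrieve(img_feats, txt_feats, W):
    """Best text index per image, and best image index per text."""
    scores = cosine_scores(img_feats @ W, txt_feats)  # rows: images, cols: texts
    return scores.argmax(axis=1), scores.argmax(axis=0)

# Synthetic paired features (hypothetical sizes): 20 pairs, 8-D image, 6-D text.
rng = np.random.default_rng(0)
img = rng.normal(size=(20, 8))
txt = img @ rng.normal(size=(8, 6))
W = fit_linear_alignment(img, txt)
image_to_text, text_to_image = retrieve(img, txt, W)
```

Because the synthetic text features are an exact linear function of the image features, every pair is matched correctly here. Real encoder features would only be approximately linearly related.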

Experimental results show that our model can outperform the SOTA retrieval models in text-to-image and image-to-text retrieval on the Flickr30k and MS-COCO datasets. Besides, our alignment process can incorporate visual and text encoder models trained separately and generalize well to unseen image-text pairs.

By |March 30th, 2025|News|Comments Off on MCL Research on Video-Text Retrieval|

MCL Research on Enhanced Object Detection

Enhancing image feature extraction to boost image classification accuracy has been a significant research focus at the MCL lab. Initially, PixelHop++ was developed to efficiently extract image features and perform accurate image classification. Subsequently, the Least-Squares Normal Transform (LNT) was introduced to further enhance image features, improving classification results with PixelHop++ on standard image databases such as MNIST and FMNIST. Despite achieving commendable performance, further refinements remain desirable to push accuracy limits even higher.

To address this, we propose a novel pipeline of four distinct experimental setups involving different pooling strategies—absolute maximum pooling and variance pooling—at hops 1 and 2. We extract LNT features specifically from hops 1, 3, and 4 for each experiment. At hop-1, pooling (either max or variance) generates 10 LNT features per channel, resulting in a total of 250 features. Hop-3 involves transforming the (N, 3, 3, Feature) tensor to produce 90 LNT features. From hop-4, 10 additional LNT features are acquired following a DFT-based feature selection. These 350 LNT features from hops 1, 3, and 4 are concatenated alongside selected hop-4 features. Finally, features aggregated from all four experimental setups are combined, and a 10-class classifier is trained on these comprehensive feature sets, demonstrating an improvement in classification performance.
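The two pooling strategies and the feature count above can be sketched as follows. Several details are assumptions rather than facts from the post: non-overlapping 2x2 windows, absolute maximum pooling that keeps the signed value of largest magnitude, and 25 hop-1 channels (inferred from 250 features at 10 per channel). The function names are hypothetical.

```python
import numpy as np

def _windows(x, k=2):
    """Split (H, W, C) into non-overlapping k x k windows -> (H//k, W//k, k*k, C)."""
    h, w, c = x.shape
    x = x[: h - h % k, : w - w % k]  # drop rows/cols that do not fill a window
    x = x.reshape(h // k, k, w // k, k, c).transpose(0, 2, 1, 3, 4)
    return x.reshape(h // k, w // k, k * k, c)

def abs_max_pool(x, k=2):
    """Per window and channel, keep the signed value with the largest magnitude."""
    win = _windows(x, k)
    idx = np.abs(win).argmax(axis=2)
    return np.take_along_axis(win, idx[:, :, None, :], axis=2)[:, :, 0, :]

def variance_pool(x, k=2):
    """Per window and channel, keep the variance of the values."""
    return _windows(x, k).var(axis=2)

def total_lnt_features(channels=25, per_channel=10, hop3=90, hop4=10):
    """Hop-1 features (channels x per_channel) plus hop-3 and hop-4 features."""
    return channels * per_channel + hop3 + hop4

x = np.array([[1.0, -5.0], [2.0, 3.0]]).reshape(2, 2, 1)
pooled_abs = abs_max_pool(x)
pooled_var = variance_pool(x)
n_features = total_lnt_features()
```

With the counts given in the text, 250 + 90 + 10 gives the 350 LNT features per experimental setup.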

By |March 23rd, 2025|News|Comments Off on MCL Research on Enhanced Object Detection|

MCL Research on Video Camouflaged Object Detection (VCOD)

Video Camouflaged Object Detection (VCOD) focuses on identifying and segmenting objects concealed within the background scenes. These camouflaged objects closely resemble their surroundings by mimicking similar color patterns and textures, which poses significant challenges compared to conventional detection tasks.

To address this problem, we have proposed a motion-enhanced approach that progressively refines the detection results with multi-resolution search and motion-guided boosting. Each video frame is first screened at the image level, and the inter-frame motions and background models are then corrected by considering the whole video sequence. This method provides stable performance on popular VCOD datasets.

By |March 16th, 2025|News|Comments Off on MCL Research on Video Camouflaged Object Detection (VCOD)|

MCL Research on Image Dehazing

Image dehazing plays a crucial role in digital imaging by removing atmospheric distortions such as haze and fog, thereby enhancing scene clarity for applications ranging from photography to autonomous driving. Traditionally, methods like the Dark Channel Prior (DCP) have been used to estimate haze effects, leveraging the observation that most non-hazy images contain very dark pixels in at least one color channel.

In a significant advancement, researchers have now introduced a novel approach that combines the strength of DCP with the efficiency of the GUSL pipeline. In this new method, DCP serves as the foundational technique to provide an initial estimate of the haze, while the GUSL pipeline is employed to predict and correct the residual errors left by the DCP. This two-step process refines the dehazing process by capturing subtle details that DCP alone might miss.

The GUSL pipeline utilizes unsupervised representation learning for robust feature extraction, followed by supervised feature learning to enhance computational efficiency and output quality. This approach not only improves the overall dehazing performance but also maintains a lightweight design suitable for real-time applications on resource-constrained devices.

By integrating DCP with residue prediction through GUSL, the new method delivers superior image clarity with reduced computational overhead, making it an attractive solution for modern imaging challenges in mobile and edge computing environments.
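For readers unfamiliar with the dark channel, a minimal sketch of its computation follows: a per-pixel minimum over the color channels, followed by a minimum over a local patch. The patch size, the edge padding, and the `apply_residual` helper are illustrative assumptions. In particular, how the GUSL residual is combined with the DCP result is not specified in the post, so a simple additive correction is assumed here.

```python
import numpy as np

def dark_channel(img, patch=3):
    """Dark channel of an (H, W, 3) image in [0, 1]; patch is assumed odd."""
    min_rgb = img.min(axis=2)              # minimum over color channels
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    h, w = min_rgb.shape
    out = np.full((h, w), np.inf)
    for dy in range(patch):                # minimum over the local patch
        for dx in range(patch):
            out = np.minimum(out, padded[dy:dy + h, dx:dx + w])
    return out

def apply_residual(dcp_estimate, predicted_residual):
    """Assumed additive correction of a DCP-based estimate, clipped to [0, 1]."""
    return np.clip(dcp_estimate + predicted_residual, 0.0, 1.0)

img = np.ones((5, 5, 3))
img[2, 2, 1] = 0.1                         # one dark value in the green channel
dark = dark_channel(img, patch=3)
corrected = apply_residual(np.full((2, 2), 0.9), np.full((2, 2), 0.2))
```

A single dark value lowers the dark channel across its whole 3x3 neighborhood, which is why the dark channel of a haze-free image tends to be close to zero.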

By |March 9th, 2025|News|Comments Off on MCL Research on Image Dehazing|