News

Congratulations to Mahtab Movahhedrad for Passing Her Qualifying Exam

Congratulations to Mahtab Movahhedrad for passing her qualifying exam! Her thesis proposal is titled “Explainable Machine Learning for Efficient Image Processing and Enhancement.” Her Qualifying Exam Committee members include Jay Kuo (Chair), Antonio Ortega, Bhaskar Krishnamachari, Justin Haldar, and Meisam Razaviyayn (Outside Member). Here is a summary of her thesis proposal:

Image Signal Processors (ISPs) are critical components of modern imaging systems, responsible for transforming raw sensor data into high-quality images through a series of processing stages. Key operations such as demosaicking and dehazing directly influence color fidelity, detail preservation, and visual clarity. While traditional methods rely on handcrafted models, deep learning has recently shown strong performance in these tasks, albeit at the expense of computational efficiency and energy consumption.

With the increasing demand for mobile photography, balancing image quality with resource efficiency has become essential, particularly for battery-powered devices. This work addresses the challenge by leveraging the principles of green learning (GL), which emphasizes compact model architectures and reduced complexity. The GL framework operates in three cascaded stages—unsupervised representation learning, semi-supervised feature learning, and supervised decision learning—allowing efficient, interpretable, and reusable solutions.

Building on this foundation, my work introduces three methods: Green Image Demosaicking (GID), Green U-Shaped Image Demosaicking (GUSID), and Green U-Shaped Learning Dehazing (GUSL-Dehaze). GID offers a modular, lightweight alternative to conventional deep neural networks, achieving competitive accuracy with minimal resource usage. GUSID extends this efficiency with a U-shaped encoder–decoder design that enhances reconstruction quality while further reducing complexity. Finally, GUSL-Dehaze combines physics-based modeling with green learning principles to restore contrast and natural colors in hazy conditions, rivaling deep learning approaches at a fraction of the cost.

Together, these contributions advance ISP design by delivering high-quality, interpretable, and energy-efficient imaging solutions suitable for mobile and embedded platforms.

By |September 7th, 2025|News|Comments Off on Congratulations to Mahtab Movahhedrad for Passing Her Qualifying Exam|

MCL Research on Green IR Drop Prediction

This work introduces Green IR Drop (GIRD), an energy-efficient and high-performance static IR-drop estimation method built on green learning principles. GIRD processes IC design inputs in three stages. First, the circuit netlist is transformed into multichannel maps, from which joint spatial–spectral representations are extracted using PixelHop. Next, discriminant features are identified through the Relevant Feature Test (RFT). Finally, these selected features are passed to an eXtreme Gradient Boosting (XGBoost) regressor. Both PixelHop and RFT belong to the family of green learning tools. Thanks to their lightweight design, GIRD achieves a low carbon footprint with significantly smaller model sizes and reduced computational complexity. Moreover, GIRD maintains strong performance even with limited training data. Experimental results on both synthetic and real-world circuits confirm its superior effectiveness. In terms of efficiency, GIRD’s model size and floating-point operation count (FLOPs) are only about 10⁻³ and 10⁻² of those required by deep learning methods, respectively.
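As a rough, dependency-free illustration of the three-stage flow (not the actual GIRD implementation), the sketch below stands in PCA for PixelHop, a correlation test for the Relevant Feature Test, and ridge regression for XGBoost, all on synthetic data:

```python
import numpy as np

# Illustrative sketch only: PCA stands in for PixelHop, a correlation test
# stands in for RFT, and ridge regression stands in for XGBoost. Data are
# synthetic; none of the numbers come from the paper.

rng = np.random.default_rng(0)

# Toy "multichannel maps" derived from a netlist: 200 designs, 8x8, 3 channels.
maps = rng.random((200, 8, 8, 3))
ir = maps.mean(axis=(1, 2, 3)) + 0.05 * rng.standard_normal(200)
y = ir - ir.mean()                          # center target (no intercept used)

# Stage 1: joint spatial-spectral representation (PCA over flattened maps).
X = maps.reshape(200, -1)
X = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
feats = X @ Vt[:32].T                       # keep 32 components

# Stage 2: discriminant feature selection (correlation with the target).
corr = np.abs([np.corrcoef(feats[:, j], y)[0, 1] for j in range(32)])
selected = np.argsort(corr)[-8:]            # keep the 8 most relevant features

# Stage 3: supervised regression on the selected features.
A = feats[:, selected]
w = np.linalg.solve(A.T @ A + 1e-3 * np.eye(8), A.T @ y)
rmse = np.sqrt(np.mean((A @ w - y) ** 2))
print(f"selected {len(selected)} features, RMSE = {rmse:.4f}")
```

The real pipeline differs in every component, but the cascade of unsupervised representation, feature selection, and supervised decision learning follows the same shape.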

By |August 24th, 2025|News|Comments Off on MCL Research on Green IR Drop Prediction|

Attendance at MIPR 2025 – San Jose

The 2025 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR) was held in San Jose from August 6 to 8. The event commenced with three keynote addresses: Prof. Prem Devanbu (University of California, Davis) discussed the reliability of large language models for code generation and offered guidance on when their results can be trusted. Prof. Edward Y. Chang (Stanford University) presented adaptive multi-modal learning as a way to address LLM limitations. Dr. Ed H. Chi (Google DeepMind) spoke on the future of AI-assisted discovery, highlighting systems that enhance rather than replace human expertise.

During the conference, Mahtab Movahhedrad, a member of the Media Communications Lab (MCL), presented the paper “GUSL-Dehaze: A Green U-Shaped Learning Approach to Image Dehazing.” This work introduced GUSL-Dehaze, a physics-based green learning framework for image dehazing that completely avoids deep neural networks. The method begins with a modified Dark Channel Prior for initial dehazing, followed by a U-shaped architecture enabling unsupervised representation learning. Feature-engineering techniques such as the Relevant Feature Test (RFT) and Least-Squares Normal Transform (LNT) were employed to keep the model compact and interpretable. The final dehazed image is produced through a transparent supervised learning stage, allowing the method to achieve performance comparable to deep learning approaches while maintaining a low parameter count and mathematical transparency.

The conference also included a panel session, “Learning Beyond Deep Learning (LB-DL) for Multimedia Processing,” chaired by Prof. Ling Guan (Toronto Metropolitan University/Ryerson University) and Prof. C.-C. Jay Kuo (University of Southern California, Director of the MCL Lab). Prof. Kuo discussed emerging paradigms that challenge the dominance of deep learning, emphasizing the growing importance of interpretability, efficiency, and sustainability in shaping the next generation of multimedia research.
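For readers unfamiliar with the prior that GUSL-Dehaze builds on, the classical (unmodified) dark channel computation can be sketched as follows; the patch size and toy image are illustrative, and this is the textbook prior rather than the paper's modified version:

```python
import numpy as np

def dark_channel(img, patch=15):
    """Per-pixel minimum over RGB, followed by a local minimum filter."""
    mins = img.min(axis=2)                  # channel-wise minimum
    pad = patch // 2
    padded = np.pad(mins, pad, mode="edge")
    out = np.empty_like(mins)
    h, w = mins.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

# A constant image has a constant dark channel.
hazy = np.full((8, 8, 3), 0.6)
print(dark_channel(hazy, patch=3).mean())   # prints 0.6
```

In haze-free regions the dark channel is near zero, so large dark-channel values signal haze; the prior uses this to estimate the transmission map.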

By |August 17th, 2025|News|Comments Off on Attendance at MIPR 2025 – San Jose|

MCL Research on Eosinophilic Esophagitis (EoE) Diagnosis

Eosinophils are a type of white blood cell that can both protect the body and cause disease. While they are essential for fighting certain parasitic infections, they are also a primary cause of many allergic conditions when they don’t function correctly. It’s important to understand these cells because their buildup and activation are the main reasons for tissue damage in diseases like asthma and Eosinophilic Esophagitis (EoE).

Our current work explores a dictionary learning pipeline to obtain unsupervised representations of whole-slide images for EoE diagnosis. Unlike deep learning methods that require back-propagation to optimize millions of parameters, our method represents the data in a self-organized way, with no back-propagation at all.
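A minimal sketch of the back-propagation-free idea is shown below; a few k-means steps over image patches stand in for the actual dictionary learning pipeline, and all sizes and names are illustrative:

```python
import numpy as np

# Stand-in for dictionary learning: patches from a toy tile are clustered
# with a few k-means iterations, and each patch is then represented by its
# nearest dictionary atom. No gradients or back-propagation are involved.

rng = np.random.default_rng(1)
image = rng.random((64, 64))               # toy grayscale slide tile

# Collect the 64 non-overlapping 8x8 patches as 64-dim vectors.
patches = image.reshape(8, 8, 8, 8).transpose(0, 2, 1, 3).reshape(64, 64)

# Initialize a 16-atom dictionary from random patches, refine with k-means.
atoms = patches[rng.choice(64, 16, replace=False)].copy()
for _ in range(10):
    d = ((patches[:, None, :] - atoms[None, :, :]) ** 2).sum(-1)
    assign = d.argmin(1)                   # nearest atom per patch
    for k in range(16):
        if (assign == k).any():
            atoms[k] = patches[assign == k].mean(0)

# The unsupervised "code" of each patch is its atom index.
codes = assign
print(codes.shape, atoms.shape)
```

The dictionary is learned entirely from patch statistics, which is what makes the representation self-organized rather than gradient-optimized.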

Compared to many other machine learning methods, our new pipeline provides a more transparent and explainable approach, especially for medical image analysis with smaller, specialized datasets.

By |August 10th, 2025|News|Comments Off on MCL Research on Eosinophilic esophagitis (EoE) Diagnosis|

MCL Research on MRI Prostate Image Quality Assessment

Magnetic Resonance Imaging (MRI) is a non-invasive, radiation-free scanning procedure generally used to obtain images of internal organs. This imaging modality is a popular screening technique for prostate cancer. A prostate MRI may be used to detect prostate cancer, determine the need for biopsy, guide needles during targeted MRI biopsy, or detect the spread of cancer to neighboring areas. Current standards of MRI acquisition still lead to errors, such as false detections, where patients are unnecessarily sent to biopsy, or missed detections, where existing tumors remain undetected. A good-quality MRI is essential to ensure that prostate cancer is diagnosed on time.

An MRI exam comprises several sequences, namely T2W, ADC, and DCE. For an MRI to be of diagnostic quality, at least two of the three sequences must independently be of diagnostic quality. Errors or artifacts may be present in some or all of these sequences; common examples include motion artifacts, rectal gas, and hip prostheses, all of which degrade MRI quality.

To assess the quality of an MRI, we train independent models for each of these sequences. While MRI images are volumetric, we treat each slice of the MRI independently. A 2D Haar wavelet transform is applied to extract features from the LL, LH, HL, and HH bands. These features are extracted at two different resolutions. The Discriminant Feature Test (DFT) is used to reduce the feature dimension by removing features with a high DFT loss. An XGBoost classifier is then trained on these selected features to predict whether each slice of the MRI is of satisfactory quality. The quality predictions from the three sequences are then combined to obtain the final MRI quality prediction. This approach is lightweight, efficient, and explainable, with [...]
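The per-slice feature path can be sketched as below; the one-level Haar transform is standard, while the threshold decision is only a stand-in for the trained DFT-plus-XGBoost model, and the toy slice and threshold are invented for illustration:

```python
import numpy as np

# Sketch of the per-slice feature path: a one-level 2D Haar transform
# yields LL, LH, HL, HH subbands, computed at two resolutions, whose
# statistics form the slice features. A simple threshold stands in for
# the DFT-selected features and the XGBoost classifier.

def haar2d(x):
    """One-level 2D Haar transform; x must have even height and width."""
    a = (x[0::2] + x[1::2]) / 2.0          # vertical average
    d = (x[0::2] - x[1::2]) / 2.0          # vertical difference
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

rng = np.random.default_rng(2)
slice_ = rng.random((32, 32))              # toy MRI slice

# Two resolutions: the original slice, then its LL band transformed again.
ll, lh, hl, hh = haar2d(slice_)
ll2, lh2, hl2, hh2 = haar2d(ll)
features = np.array([b.std() for b in (ll, lh, hl, hh, ll2, lh2, hl2, hh2)])

# Stand-in decision: flag a slice whose high-frequency energy is tiny
# (a heavily blurred slice would have weak LH/HL/HH bands).
is_satisfactory = features[1:4].mean() > 0.01
print(features.shape, is_satisfactory)
```

In the actual system, the DFT scores each wavelet feature on labeled slices and the XGBoost classifier replaces the hand-set threshold.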

By |August 3rd, 2025|News|Comments Off on MCL Research on MRI Prostate Image Quality Assessment|

MCL Research on Biomarker Prediction for Kidney Cancer

The tumor microenvironment includes many types of cells around the tumor. Doctors often assess it using biomarkers like PD-L1 or CD68. To measure how many cells express these markers—called positive cells—we use image analysis and machine learning models to identify positive cells and compute their count and ratio. Usually, immunohistochemistry (IHC) images are used as the input of the model because they are relatively low-cost, while multiplex immunofluorescence (IF) images are used as ground truth due to their high accuracy. A major research goal is to predict positive cell locations from IHC images.

Our current work explores the use of the Green U-shaped Learning (GUSL) pipeline to align input IHC images with the ground truth IF images. GUSL is well-suited for this task because it enables pixel-wise prediction from coarse to fine resolution. It can detect positive cells at a coarse level and progressively refine predictions. GUSL has also shown strong performance in related tasks like kidney segmentation.

Another approach we explore is using GUSL to generate predictions on other medical images, such as H&E-stained images, DAPI, and LAP2. By producing segmentation results from H&E, DAPI, LAP2, and IHC inputs, and then taking a weighted average, we aim to further improve prediction accuracy.

Many machine learning methods have been developed to solve this problem, but they often suffer from high computational cost (FLOPs) and large model size. In addition, the limited size of medical datasets presents another major challenge. Green learning offers a promising solution to these issues and contributes to noninvasive biomarker prediction in this research field, helping reduce the need for expensive and labor-intensive staining methods.
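The fusion step can be sketched in a few lines; the stain names follow the text, but the weights, maps, and threshold here are placeholders rather than values from the study:

```python
import numpy as np

# Illustrative fusion: per-pixel positive-cell probability maps from the
# different stain inputs are combined by a weighted average, then
# thresholded to get the final positive-cell mask and ratio.

rng = np.random.default_rng(3)
maps = {name: rng.random((4, 4)) for name in ("HE", "DAPI", "LAP2", "IHC")}
weights = {"HE": 0.2, "DAPI": 0.2, "LAP2": 0.2, "IHC": 0.4}  # placeholders

fused = sum(w * maps[n] for n, w in weights.items())
positive = fused > 0.5                     # final positive-cell mask
ratio = positive.mean()                    # positive-cell ratio
print(f"positive-cell ratio: {ratio:.2f}")
```

In practice the weights would be tuned on data where the IF ground truth is available.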

By |July 27th, 2025|News|Comments Off on MCL Research on Biomarker Prediction for Kidney Cancer|

MCL Research on Wavelet-Based Green Learning

Wavelet-based Green Learning (GreenWave) is a new image classification framework that combines the multi-scale power of wavelet transforms with the efficiency and interpretability of Green Learning principles. It avoids backpropagation and replaces traditional deep learning architectures with a transparent, feedforward pipeline.

At its core, GreenWave begins by applying a discrete wavelet transform (DWT)—typically using Haar wavelets—to each image, capturing both local and global spatial structures at multiple resolutions. It extracts features from wavelet subbands (like LL, LH, HL, HH) as well as local image patches from different regions (e.g., east, south, west, north). These features are used to construct class templates via averaging over training examples.

GreenWave operates in three rounds of classification:

1. Round-1 classifies easy samples using cosine similarity between the input and class templates (one-vs-rest). It uses confidence metrics (like entropy) to decide which samples are “easy” and can exit the pipeline early.
2. Round-2 focuses on semi-hard samples using updated templates and Discriminant Feature Test (DFT) masks to emphasize class-informative coefficients. It also introduces one-vs-one templates to resolve class confusion.
3. Round-3 targets the hardest samples using further refined templates and updated confusion masks, maximizing accuracy while maintaining interpretability.

Throughout all stages, GreenWave uses cosine similarity as its feature-matching metric and XGBoost classifiers for decision learning, completely bypassing gradient-based training.
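The Round-1 mechanics can be sketched as follows; the templates, features, and entropy threshold are toy values, not GreenWave's actual configuration:

```python
import numpy as np

# Sketch of Round-1: cosine similarity between a sample's feature vector
# and per-class templates, with an entropy-based confidence gate deciding
# whether the sample exits the pipeline early.

rng = np.random.default_rng(4)
n_classes, dim = 10, 64
templates = rng.random((n_classes, dim))                # class templates
templates /= np.linalg.norm(templates, axis=1, keepdims=True)

x = rng.random(dim)
x /= np.linalg.norm(x)

sims = templates @ x                                    # cosine similarities
probs = np.exp(sims) / np.exp(sims).sum()               # softmax over classes
entropy = -(probs * np.log(probs)).sum()

# Low entropy -> confident -> classify now; otherwise send to Round-2.
is_easy = entropy < 2.0                                 # toy threshold
pred = int(sims.argmax())
print(pred, round(float(entropy), 3), is_easy)
```

Rounds 2 and 3 reuse the same matching metric but sharpen the templates with DFT and confusion masks before comparing.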

Overall, GreenWave demonstrates that a non-backpropagation, wavelet-template-based system can achieve near state-of-the-art performance while being highly efficient, explainable, and modular. This makes it an ideal choice for low-resource or transparent AI applications.

By |July 20th, 2025|News|Comments Off on MCL Research on Wavelet-Based Green Learning|

MCL Research on Motion YOLO

Video object detection is a demanding computer vision topic that extends static image-based detection by introducing camera motion and temporal dynamics. This brings significant challenges, such as occlusion, scene blurriness, and dynamic shape changes caused by object and camera movement. Nevertheless, the temporal correlations between frames and the motion patterns of objects also provide rich and valuable information. Detecting and tracking objects over time enables machines to understand dynamic scenes and make informed decisions in complex environments. Nowadays, video object detection has become essential for many real-world applications, including autonomous driving, intelligent surveillance, human-computer interaction, and video content analysis.

Existing image-based detection models have achieved remarkable success, offering excellent accuracy and real-time detection capabilities in static scenarios. However, directly applying these models to video introduces several issues. Specifically, image-based models treat video frames independently, ignoring temporal relationships across frames, which often leads to unstable detection results in complex scenes and redundant computations for similar consecutive frames. Moreover, in real-world scenarios, videos are typically stored in compressed formats before being uploaded or transmitted. Fully decompressing these videos further increases the computational overhead.

We propose Motion-Assisted YOLO (MA-YOLO), an efficient video object detection strategy that leverages the motion information naturally embedded in compressed video streams while utilizing existing image-based detection models to address the aforementioned challenges. Specifically, we adopt YOLO-X variants as our base detector for static images. Rather than performing detection on every video frame, we detect objects only on selected keyframes and propagate the predictions to estimate detection results for intermediate frames. The proposed framework consists of three modules: (1) keyframe selection and sparse inference, (2) motion vector extraction and pixel-wise assignment, and (3) motion-guided decision propagation. By incorporating the keyframe-based detection and motion [...]
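The motion-guided propagation module can be sketched as below; real MA-YOLO reads motion vectors from the compressed stream and propagates full YOLO-X detections, whereas here the motion field and box are synthetic:

```python
import numpy as np

# Toy sketch of motion-guided propagation: boxes detected on a keyframe
# are shifted by the mean motion vector inside each box to estimate
# detections on an intermediate frame, avoiding a full detector pass.

h, w = 64, 64
mv = np.zeros((h, w, 2))                   # per-pixel (dx, dy) motion field
mv[10:30, 10:30] = (3.0, 1.0)              # an object moving right and down

keyframe_boxes = np.array([[10, 10, 30, 30]], dtype=float)  # x1, y1, x2, y2

propagated = []
for x1, y1, x2, y2 in keyframe_boxes:
    region = mv[int(y1):int(y2), int(x1):int(x2)]
    dx, dy = region.reshape(-1, 2).mean(axis=0)
    propagated.append([x1 + dx, y1 + dy, x2 + dx, y2 + dy])
propagated = np.array(propagated)
print(propagated)                          # box shifted by (3, 1)
```

Because only keyframes run the full detector, the per-frame cost of intermediate frames reduces to this lightweight box arithmetic.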

By |July 13th, 2025|News|Comments Off on MCL Research on Motion YOLO|

MCL Research on Multi-Stage XGBoost

XGBoost is a widely used gradient-boosting framework in machine learning. Due to its low resource consumption and interpretability for sequential learning, XGBoost has gained increasing attention in green learning, where computational efficiency and model transparency are essential. However, XGBoost is not yet optimized in the green learning literature, and it accounts for a large share of the model parameters in a green learning system. Reducing the size of XGBoost while maintaining its high performance is therefore desirable.

We recently proposed Multi-stage XGBoost (MXB), a modular framework that builds a sequence of shallow XGBoost models, each of which defines a stage. Unlike conventional XGBoost, which is trained on the full feature set, the model at each stage of MXB is trained on a distinct feature subset. The general idea is to place more discriminant features in earlier stages and less discriminant ones in later stages, motivated by the convergence curve of the XGBoost classifier and regressor. In addition, the transition from one XGBoost model to the next in MXB is governed by the behavior of the gradient and Hessian, the first- and second-order derivatives of the objective function, respectively. The gradient quantifies the direction and rate of change of the loss function, indicating how far the model is from optimality. The Hessian, on the other hand, measures the curvature of the loss, offering insight into the local stability of convergence.
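The staged idea can be illustrated with a dependency-free sketch; least squares stands in for the shallow XGBoost model at each stage, and the data, subset sizes, and ranking rule are invented for illustration:

```python
import numpy as np

# Minimal sketch of the staged idea: each stage fits the residual of the
# previous one using only its own feature subset, with more discriminant
# features assigned to earlier stages. Least squares stands in for the
# shallow XGBoost model at each stage.

rng = np.random.default_rng(6)
X = rng.standard_normal((300, 12))
y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.1 * rng.standard_normal(300)

# Rank features by |correlation| with the target, most discriminant first.
corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(12)])
order = np.argsort(corr)[::-1]
stages = [order[:4], order[4:8], order[8:]]   # distinct feature subsets

residual, models = y.copy(), []
for subset in stages:
    A = X[:, subset]
    w, *_ = np.linalg.lstsq(A, residual, rcond=None)
    models.append((subset, w))
    residual = residual - A @ w               # next stage fits what is left

pred = sum(X[:, s] @ w for s, w in models)
rmse = np.sqrt(np.mean((pred - y) ** 2))
print(f"RMSE after {len(models)} stages: {rmse:.3f}")
```

In MXB the handoff between stages is additionally steered by the gradient and Hessian of the boosting objective rather than by a fixed schedule.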

We evaluate MXB on two datasets: MNIST and Fashion-MNIST, and compare it with standard XGBoost models of equivalent depth and model size. The training and testing curves show that MXB has smaller train–validation gaps and more stable testing loss as compared to [...]

By |July 6th, 2025|News|Comments Off on MCL Research on Multi-Stage XGBoost|

Welcome New MCL Member James Zhan

We are very happy to welcome a new MCL member, James Zhan. Here is a quick interview with James:

1. Could you briefly introduce yourself and your research interests?

My name is Yunkai Zhan, and people always call me James. I’m currently a rising junior undergraduate student majoring in computer science. My research interests include computer vision, machine learning, and computational social science. Outside of academia, I’m a basketball and football (soccer, but it should be called football) fan and player. I also play poker and love watching movies.

2. What is your impression of MCL and USC?

I’m very grateful to join MCL and work with Dr. Kuo and all the students. Dr. Kuo is always supportive of his students, providing meaningful advice and inspiring ideas. The lab has a very vibrant vibe. I enjoy talking to everyone during the pizza time before the seminar. If I haven’t talked to you yet, feel free to reach out to me next time.

3. What are your future expectations and plans at MCL?

I’m looking forward to contributing to projects starting this fall, learning through the process of self-study, and collaborating with experienced researchers. I’m really excited to work on transparent and interpretable Green Learning.

By |June 29th, 2025|News|Comments Off on Welcome New MCL Member James Zhan|