USC Media Communications Lab

Merry Christmas!

As 2025 comes to a close, we look back with gratitude on a year marked by growth, achievement, and abundant blessings at MCL. This year, we bid a heartfelt farewell to our esteemed graduates, whose impactful research has left a lasting legacy as they embark on exciting new journeys. At the same time, we were delighted to welcome new members into the MCL family—bringing fresh perspectives, energy, and enthusiasm for innovative research.

Through dedication, collaboration, and resilience, our community reached significant milestones, including the publication of outstanding work in leading journals and conferences. These accomplishments reflect the collective passion, perseverance, and excellence of every member of MCL.

From all of us at MCL, we wish you a Merry Christmas filled with love, laughter, and cheer. May this festive season bring peace, happiness, and many moments to cherish.

By Catherine Aurelia Christie Alexander|December 21st, 2025|News|Comments Off|

MCL Research on Video Quality Assessment

Blind video quality assessment (BVQA) has become a key part of video streaming pipelines, especially with the rise of short-form user-generated content (UGC). Without a reference video, BVQA estimates the quality of the underlying video content using human-guided metrics such as mean opinion scores (MOS). Many state-of-the-art methods employ large CLIP modules, leading to increasingly large deep-learning-based pipelines. Given the rapid growth of UGC, lightweight models may save streaming platforms massive amounts of compute and power.

We are developing a green learning alternative that incorporates raw features from specific sub-domains. Our research has found that fusing raw features that capture global and local detail with semantic information appears to provide sufficient information to predict MOS, even with trivial temporal schemes such as mean pooling. Currently, we generate raw features from natural scene statistics models, including BRISQUE and V-BLIINDS, while local and semantic information is captured by a pre-trained Swin-T model. We plan to further reduce our model size using alternative feature extractors such as EfficientNet, MobileNet, or the discrete wavelet transform.
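The mean-pooling and fusion step described above can be illustrated with a minimal sketch. It is not the lab's implementation: the function names `mean_pool` and `fuse` and the toy feature values are assumptions made for illustration, standing in for per-frame outputs of the statistical and semantic extractors.

```python
from statistics import fmean


def mean_pool(frame_features):
    """Average per-frame feature vectors into one video-level vector."""
    if not frame_features:
        raise ValueError("need at least one frame")
    # zip(*...) groups the values of each feature dimension across frames.
    return [fmean(column) for column in zip(*frame_features)]


def fuse(*feature_vectors):
    """Concatenate video-level vectors coming from different extractors."""
    fused = []
    for vector in feature_vectors:
        fused.extend(vector)
    return fused


# Hypothetical per-frame features for a 3-frame clip (values are made up).
nss_frames = [[0.2, 0.4], [0.4, 0.6], [0.6, 0.8]]
semantic_frames = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0], [2.0, 0.0, 2.0]]

# One fixed-length vector per video, ready for any lightweight MOS regressor.
video_vector = fuse(mean_pool(nss_frames), mean_pool(semantic_frames))
```

Because pooling removes the time axis, the fused vector has the same length for every video regardless of duration, which keeps the downstream regressor small.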

By Catherine Aurelia Christie Alexander|December 14th, 2025|News|Comments Off|

MCL Research on Segmentation of Mice Brain Images

Mouse brains are often used as model systems in medical studies, as they share many similarities with the human brain in function and structure. By observing biomarkers in brain tissue, researchers can gain insight into how neurons and blood vessels interact to support processes such as learning, memory, and sensory perception (especially in response to triggers of content/stress and so on).

Our studies on 3D microscopic images of mouse brains involve two biomarkers: lectin and cFOS. Lectin is a protein that binds specifically to certain sugar molecules on the surface of cells lining blood vessels, and can therefore be stained with a fluorescent dye to mark the network of blood vessels. cFOS labelling, on the other hand, marks neurons that were recently active, serving as an “activity map” of brain regions responding to external stimuli or behavioural tasks. Detailed information on both biomarkers is obtained by dissecting the mouse (cFOS in particular must be observed around 2 hours after stimulation), which means that it cannot be obtained by imaging the human brain.

These datasets are typically large, high-dimensional, and require pixel-level interpretation. A single brain can produce gigabytes to terabytes of 3D imaging data containing complex vessel networks and cellular activation patterns. Manual labelling of such data is time-consuming and can vary with the experience of the person labelling. Because cFOS appears as very small points in the image, labelling it accurately is also difficult in practice.

Green learning offers an efficient method for automatically segmenting and analysing these complex images. Deep learning often demands extensive labelled data and computational resources, and such data is often hard to find when it comes to medical [...]

By Zijing Chen|December 7th, 2025|News|Comments Off|

MCL Thanksgiving Luncheon

For more than two decades, the Thanksgiving Luncheon has been a defining tradition of MCL, strengthening our sense of community and offering a meaningful moment to celebrate gratitude and connection. On November 27, 2025, the tradition continued as the MCL family gathered at Shiki Seafood Buffet to share a warm and memorable meal together.

Beyond the delicious food, the luncheon provided a welcome break from daily routines—a chance to reconnect with friends and unwind. The lively conversations and warm atmosphere were a reminder of the strong bonds that define the MCL community.

This year’s event was made possible through the unwavering support of Professor Kuo, whose commitment keeps this tradition thriving, as well as the hard work of the students who ensured that everything flowed seamlessly.

As we close another chapter of this cherished tradition, the MCL family looks forward to many more years of celebration, fellowship, and gratitude. Happy Thanksgiving!

By Catherine Aurelia Christie Alexander|November 30th, 2025|News|Comments Off|

MCL Research on Video-Text Alignment

This work presents an interpretable and computationally efficient Video-Text Alignment (VTA) framework for cross-modal retrieval. Unlike end-to-end multimodal models that rely on large, opaque latent spaces and scale poorly with video length, VTA decomposes the retrieval process into transparent, modular components. The system first performs keyframe selection to reduce redundant temporal information and conducts object detection to extract salient visual concepts. In parallel, captions are analyzed through part-of-speech tagging to identify meaningful nouns and verbs. By modeling co-occurrence statistics between detected objects and textual keywords, VTA prunes irrelevant candidates early, dramatically reducing the search space and the number of trainable parameters—only 3% of those required for CLIP-based fine-tuning—while keeping encoders frozen to avoid overfitting to limited video-text datasets.
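As a rough illustration of pruning with co-occurrence statistics, the sketch below estimates P(keyword | object) from paired training data and discards candidate videos whose detected objects do not support any query keyword. The function names, the `threshold` parameter, and the toy data are hypothetical; the actual scoring used in VTA is not specified in this post.

```python
from collections import Counter, defaultdict


def fit_cooccurrence(training_pairs):
    """Estimate P(keyword | object) from (detected_objects, caption_keywords) pairs."""
    object_counts = Counter()
    joint_counts = defaultdict(Counter)
    for objects, keywords in training_pairs:
        for obj in set(objects):
            object_counts[obj] += 1
            for word in set(keywords):
                joint_counts[obj][word] += 1
    return {
        obj: {word: n / object_counts[obj] for word, n in words.items()}
        for obj, words in joint_counts.items()
    }


def prune_candidates(query_keywords, candidates, cond_prob, threshold=0.5):
    """Keep videos with at least one (object, keyword) pair scoring >= threshold."""
    kept = []
    for video_id, objects in candidates.items():
        score = max(
            (cond_prob.get(obj, {}).get(word, 0.0)
             for obj in objects for word in query_keywords),
            default=0.0,
        )
        if score >= threshold:
            kept.append(video_id)
    return kept


# Toy training data: objects detected in keyframes, nouns/verbs from captions.
training_pairs = [
    (["dog", "ball"], ["dog", "play"]),
    (["dog"], ["dog", "run"]),
    (["car"], ["car", "drive"]),
]
cond_prob = fit_cooccurrence(training_pairs)

candidates = {"v1": ["dog"], "v2": ["car"], "v3": ["tree"]}
survivors = prune_candidates(["drive"], candidates, cond_prob)
```

Every surviving candidate comes with the object and keyword pair that justified it, which is the kind of explicit intermediate decision the framework emphasizes.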

After filtering, VTA further improves retrieval through genre-based clustering and lightweight contrastive learning modules specialized to semantically coherent subsets of the data. These modules employ simple linear projections on top of frozen encoders, yet achieve competitive accuracy by focusing on more homogeneous sample groups. The entire pipeline is interpretable, as it provides explicit intermediate decisions, such as detected objects, POS-tagged keywords, genre predictions, and conditional probability estimates that link the two modalities. Experiments on the MSR-VTT benchmark demonstrate that VTA matches or surpasses state-of-the-art non-LLM baselines in both video-to-text and text-to-video retrieval, while offering constant-time inference and clear insight into its decision-making process.

By Catherine Aurelia Christie Alexander|November 23rd, 2025|News|Comments Off|

MCL Research on Kidney Segmentation

Kidney cancer is one of the most common malignancies of the urinary system. To detect kidney cancer, we must identify kidney tumors, which can vary significantly in size, shape, and biological behavior, ranging from benign lesions to aggressive malignant tumors that require timely diagnosis and treatment. Accurate identification and segmentation of kidney tumors on CT or MRI scans are essential for clinical decision-making, surgical planning, and prognosis assessment.

We have recently been developing a two-stage framework called Green U-shaped Learning (GUSL) for kidney and tumor segmentation. In Stage 1, we segment the kidney organ; in Stage 2, we crop the kidney region as the region of interest (ROI) for tumor segmentation. We apply GUSL in both stages, where it can achieve fine-to-coarse feature extraction and coarse-to-fine residual correction.
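The hand-off between the two stages, cropping the ROI from the Stage 1 kidney mask, can be sketched as follows. This is only an illustration on nested Python lists indexed as `[z][y][x]`; the helper names `bounding_box` and `crop` and the `margin` parameter are assumptions, not part of GUSL.

```python
def bounding_box(mask, margin=0):
    """Return half-open (low, high) bounds per axis around the nonzero voxels."""
    depth, height, width = len(mask), len(mask[0]), len(mask[0][0])
    coords = [
        (z, y, x)
        for z in range(depth)
        for y in range(height)
        for x in range(width)
        if mask[z][y][x]
    ]
    if not coords:
        raise ValueError("mask is empty; no region of interest to crop")
    bounds = []
    for axis, size in enumerate((depth, height, width)):
        values = [c[axis] for c in coords]
        low = max(min(values) - margin, 0)          # clamp to the volume
        high = min(max(values) + 1 + margin, size)  # half-open upper bound
        bounds.append((low, high))
    return tuple(bounds)


def crop(volume, bounds):
    """Extract the sub-volume described by bounding_box() output."""
    (z0, z1), (y0, y1), (x0, x1) = bounds
    return [[row[x0:x1] for row in plane[y0:y1]] for plane in volume[z0:z1]]


# Toy 2 x 3 x 4 scan whose voxel value encodes its own position.
volume = [[[z * 100 + y * 10 + x for x in range(4)] for y in range(3)] for z in range(2)]

# Hypothetical Stage 1 output: a binary kidney mask with two foreground voxels.
kidney_mask = [[[0] * 4 for _ in range(3)] for _ in range(2)]
kidney_mask[0][1][1] = 1
kidney_mask[1][1][2] = 1

roi_bounds = bounding_box(kidney_mask)
roi = crop(volume, roi_bounds)  # Stage 2 would segment tumors inside this ROI
```

Restricting Stage 2 to the cropped region reduces the number of voxels it has to process and removes most irrelevant background.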

By Jiaxin Yang|November 16th, 2025|News|Comments Off|

MCL Research on Seismic Data Processing

Earthquakes generate seismic waves that travel through the Earth’s interior, carrying critical information about the Earth’s structure and the source event. These waves are mainly divided into two types: P-waves and S-waves. P-waves, or primary waves, travel fastest and arrive first at seismic stations, while S-waves, or secondary waves, follow at a slower speed but with higher amplitude, often causing greater damage. Accurately identifying the arrival times of these waves, a process known as phase picking, is fundamental for earthquake localization, magnitude estimation, and early warning systems. However, seismic recordings are often contaminated by complex background noise, overlapping signals, and variable station conditions, which make manual picking both time-consuming and subjective. Automated phase picking therefore plays an increasingly vital role in modern seismology, enabling real-time earthquake detection and large-scale waveform analysis. Despite significant progress, achieving high precision and robustness across diverse seismic environments remains a major challenge for automated systems.

GreenPhase is a multi-resolution Green Learning framework for seismic phase picking. It aims to achieve high accuracy while maintaining interpretability and computational efficiency. The model operates across three resolution levels — from coarse to fine — progressively narrowing down candidate regions for P and S arrivals. At each level, GreenPhase extracts spectral–temporal features using the Saab transform and refines them through supervised feature selection and XGBoost regression. A pseudo-label generation and balanced sampling strategy further enhance training stability. Compared with deep networks such as PhaseNet and EQTransformer, GreenPhase requires far fewer training samples while achieving comparable performance. It is trained in a fully feedforward manner, balancing performance, efficiency, and interpretability. For the detection task, GreenPhase achieves an F1 score of 1.00; for P-phase picking, 0.98; and for S-phase picking, 0.96. The differences from EQTransformer, [...]

By Yixing Wu|November 9th, 2025|News|Comments Off|

MCL Research on Mice Navigation Pattern Strategies

Understanding how animals navigate and learn from their environments has long been a central question in neuroscience. The Morris Water Maze (MWM) is one of the most widely used paradigms for studying spatial learning and memory in rodents. Traditionally, researchers have assessed performance using simple metrics such as escape latency and total path length. While these measures quantify task efficiency, they fail to capture the diversity of navigation strategies that rodents employ within a single trial. Recent work has revealed that mice often transition dynamically between different behavioral modes, such as thigmotaxis, scanning target, and direct search, suggesting that whole-trajectory analyses may overlook important within-trial strategy shifts.

To address this, we develop a Green Learning (GL) based framework to classify sub-trajectories into expert-defined behavioral strategies in an energy-efficient and interpretable manner. Unlike deep learning methods that depend on large datasets, backpropagation, and heavy computation, GL employs a feedforward-designed architecture emphasizing energy efficiency, logical transparency, and interpretability.

Our framework integrates trajectory segmentation with the three modules of GL: representation learning, feature learning, and supervised decision making. The segmentation stage divides each swimming path into overlapping sub-trajectories, allowing fine-grained behavioral classification rather than treating the entire trial as a single unit. We then extract geometric and temporal features, which serve as inputs for representation learning. Subsequently, Discriminant Feature Test (DFT) and Least-squares Normal Transform (LNT) modules identify the most informative and interpretable features for distinguishing strategies. Finally, Subspace Learning Machines (SLM) and ensemble classifiers perform supervised decision learning with minimal computational cost.
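The segmentation stage can be pictured as a sliding window over the tracked swim path. The sketch below is a minimal illustration under assumed settings: the `window` and `stride` values, the function names, and the single path-length feature are hypothetical and are not taken from the framework itself.

```python
from math import dist


def split_into_subtrajectories(points, window, stride):
    """Cut a path into fixed-length segments; they overlap when stride < window."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(points) <= window:
        return [list(points)] if points else []
    return [
        list(points[start:start + window])
        for start in range(0, len(points) - window + 1, stride)
    ]


def path_length(segment):
    """Total distance travelled along a segment (one simple geometric feature)."""
    return sum(dist(a, b) for a, b in zip(segment, segment[1:]))


# Toy (x, y) positions sampled along one swim path.
trajectory = [(0, 0), (3, 4), (6, 8), (6, 8), (9, 12)]

segments = split_into_subtrajectories(trajectory, window=3, stride=2)
lengths = [path_length(segment) for segment in segments]
```

Each segment can then receive its own strategy label, so a trial that starts with wall-hugging and ends with a direct swim is no longer forced into a single class.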

Through this interpretable and sustainable approach, we aim to uncover subtle behavioral differences between experimental and control groups while reducing energy consumption and improving transparency in AI-driven behavioral analysis. The proposed framework not only advances rodent behavioral [...]

By Claire Wang|November 2nd, 2025|News|Comments Off|

MCL Research on 3D Whole-Brain Image Analysis in Mice

By Zijing Chen|October 26th, 2025|News|Comments Off|

MCL Research on EEG Analysis

Our study focuses on EEG-based analysis of Alzheimer’s disease (AD) and related disorders using a Green-Computing AI framework. The method relies on coherence matrices to quantify functional connectivity between 19 scalp electrodes across five standard frequency bands (delta, theta, alpha, beta, gamma). Each coherence matrix reflects the degree of synchronization between pairs of brain regions, providing a compact representation of neural interaction patterns.

After computing the coherence matrices, we apply the Discriminant Feature Test (DFT) to identify the most informative features for distinguishing disease groups. DFT ranks all coherence features according to their discriminative power, measured by entropy-based separability across classes. The top-ranked K_n features from the upper triangle of the matrix are retained as raw discriminative features.
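To make the ranking idea concrete, here is a simplified sketch of an entropy-based feature test: each feature is scored by the lowest weighted class entropy achievable with a single threshold, and features are ranked by that score. This is an illustrative approximation written for this post, not the lab's DFT code; `num_candidates`, the function names, and the toy data are assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_split_entropy(values, labels, num_candidates=16):
    """Lowest weighted entropy over evenly spaced thresholds on one feature."""
    low, high = min(values), max(values)
    best = entropy(labels)
    if low == high:
        return best  # a constant feature cannot separate anything
    for step in range(1, num_candidates):
        threshold = low + (high - low) * step / num_candidates
        left = [lab for val, lab in zip(values, labels) if val <= threshold]
        right = [lab for val, lab in zip(values, labels) if val > threshold]
        if not left or not right:
            continue
        weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        best = min(best, weighted)
    return best


def rank_features(samples, labels, top_k):
    """Return indices of the top_k features with the lowest split entropy."""
    num_features = len(samples[0])
    scores = [
        best_split_entropy([sample[j] for sample in samples], labels)
        for j in range(num_features)
    ]
    return sorted(range(num_features), key=lambda j: scores[j])[:top_k]


# Toy data: feature 0 separates the classes cleanly, feature 1 does not.
samples = [[0.1, 0.5], [0.2, 0.9], [0.8, 0.6], [0.9, 0.8]]
labels = ["AD", "AD", "CN", "CN"]

selected = rank_features(samples, labels, top_k=1)
```

Since each feature is scored independently with a one-dimensional search, the ranking needs no backpropagation and its cost grows linearly with the number of coherence features.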

For each electrode pair with at least m (1 < m ≤ 5) selected band features, we further derive two complementary representations:

Least-squares Normal Transform (LNT) features, which map the selected coherence values into a linearly separable subspace; and

Support Vector Machine (SVM) features, which capture nonlinear decision boundaries between classes.

The final feature vector combines the selected raw, LNT, and SVM features. Three binary classifiers—AD vs CN, AD vs FTD, and FTD vs CN—are trained using a leave-one-out cross-validation scheme. For each test subject, the outputs from the relevant binary classifiers are averaged to form the final multi-class prediction.
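One plausible reading of the final averaging step is sketched below: each class receives the mean of the scores from the two binary classifiers in which it takes part. The post does not spell out the exact rule, so the `combine_pairwise` function and the probabilities used here are assumptions for illustration only.

```python
def combine_pairwise(pair_probs):
    """Average pairwise outputs into one score per class.

    pair_probs maps (a, b) to the estimated probability that the subject
    belongs to class a rather than class b.
    """
    votes = {}
    for (class_a, class_b), prob_a in pair_probs.items():
        votes.setdefault(class_a, []).append(prob_a)
        votes.setdefault(class_b, []).append(1.0 - prob_a)
    scores = {label: sum(v) / len(v) for label, v in votes.items()}
    predicted = max(scores, key=scores.get)
    return predicted, scores


# Made-up outputs of the three binary classifiers for one held-out subject.
pair_probs = {
    ("AD", "CN"): 0.9,
    ("AD", "FTD"): 0.7,
    ("FTD", "CN"): 0.4,
}

predicted, scores = combine_pairwise(pair_probs)
```

In a leave-one-out scheme, this combination would be repeated once per subject, with the three binary classifiers retrained each time on all remaining subjects.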

By Qi Cao|October 19th, 2025|News|Comments Off|
