USC Media Communications Lab

Professor Kuo met MCL Alumni in Taiwan

Professor C.-C. Jay Kuo, Director of MCL, attended the Picture Coding Symposium (PCS) in Taichung, Taiwan, from June 12 to 14, 2024. Professor Kuo was invited to serve as a panelist on the panel “Learned Image and Video Coding: Hype or Hope?” (see https://2024.picturecodingsymposium.org/panel/). All six panelists were optimistic, for various reasons, about researching and developing learned image and video coding technologies. Professor Kuo emphasized the differences between the classical and the modern learned coding methodologies. The former considers intra-content redundancy removal, while the latter examines inter-content redundancy removal. The former has been researched for four decades and has reached maturity, so it is difficult to push further. The latter offers more opportunities. On the other hand, Professor Kuo was concerned about the high complexity and black-box nature of neural network codecs. He suggested an alternative, non-neural-network-based approach to implementing learned image and video codecs.

After PCS, Professor Kuo went to Taipei for a reunion luncheon with MCL alums on June 15 (Saturday). Professor Kuo said, “It was a relaxing time during a busy week. Seeing our alums doing well in their careers and families was great.”

By Mahtab Movahhedrad | June 16th, 2024 | News

MCL Research on Nuclei Segmentation for Histological Images

Nuclei segmentation is a fundamental task required to analyze the underlying nuclei structure of an organ of interest. Cancer starts from the cells, and understanding the nuclei shapes, sizes and distribution can provide cues on whether or not a patient has cancer. Further analysis can also help in cancer grading and prognosis. However, studying whole slide images of biopsied tissues requires a large amount of time and effort.

Such monotonous and laborious tasks can be simplified with AI and ML, which may also improve detection accuracy. Challenges in the nuclei segmentation task include inherent staining variations in whole slide images (WSIs), a wide variety of nuclei shapes and sizes, and irregular boundaries that make it difficult to trace the actual contours. Most current research in this area involves deep-learning-based architectures such as U-Net, R-CNN, and even the Vision Transformer. These methods require a large number of training samples and high model complexity to generalize across the variations inherent in nuclei from different organs.

We propose a lightweight, interpretable, and simple Green Learning-based approach to nuclei segmentation. Prior work on highly effective Unsupervised Nuclei Instance Segmentation (HUNIS) [1] forms the first stage of our current approach. To further improve the HUNIS results, we now focus on the regions where HUNIS requires the help of labels. We divide the current task into two stages: (i) identify the areas where help is needed, and (ii) correct those areas towards their actual class. With the help of the Saab transform, our main task now is feature engineering: identifying the ideal features to implement these two stages.

References:

[1] V. Magoulianitis, Y. Yang, and C.-C. J. [...]

By Mahtab Movahhedrad | June 9th, 2024 | News

MCL Research on Seismic Data Processing

Seismic data processing involves detecting earthquake signals and picking seismic phases from the diverse types of signals recorded by seismographs. During a seismic event, energy radiates from the focus (or hypocenter) as waves that travel in all directions. These waves are categorized into body waves and surface waves. Understanding and accurately detecting these waves is crucial for rapid response and seismic hazard assessment.

Types of Seismic Waves

Body Waves: These waves travel through the Earth’s interior and are divided into two types:

P Waves (Primary Waves): The fastest waves, P waves are the first to be recorded on seismographs. They can travel through both solid and liquid media, causing the ground to move forward and backward.

S Waves (Secondary Waves): Following the P waves, S waves travel more slowly and shear the ground perpendicular to their direction of travel, for example up and down or side to side. S waves travel only through solid materials.

Surface Waves: Generated when body waves reach the Earth’s surface, surface waves spread out over the Earth’s surface. They are typically more destructive and damaging than body waves due to their larger amplitude and longer duration.
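Because P waves outrun S waves, the delay between the two arrivals at a station grows with distance from the source. The sketch below works through that arithmetic under simplifying assumptions: straight ray paths and constant wave speeds. The function name and the default speeds (6.0 and 3.5 km/s) are illustrative choices, not values taken from this post.

```python
def distance_from_sp_delay(sp_delay_s, vp_km_s=6.0, vs_km_s=3.5):
    """Estimate source distance (km) from the S-minus-P arrival delay.

    Assumes straight ray paths and constant velocities; the default speeds
    are illustrative crustal values, not figures from the post.
    """
    if vp_km_s <= vs_km_s:
        raise ValueError("P-wave speed must exceed S-wave speed")
    if sp_delay_s < 0:
        raise ValueError("S waves cannot arrive before P waves")
    # d / vs - d / vp = delay  =>  d = delay * vp * vs / (vp - vs)
    return sp_delay_s * vp_km_s * vs_km_s / (vp_km_s - vs_km_s)
```

With the default speeds, a 10-second delay corresponds to roughly 84 km. Real locators use layered velocity models and several stations, so this is only a first-order estimate.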

Importance of Seismic Phase Picking

Modern seismic networks continuously generate vast amounts of data. Manual analysis of this data is impractical due to the need for rapid response. Additionally, seismic data often contain significant noise and ambiguous signals, complicating interpretation. Accurate and efficient detection and phase picking are vital for reliable seismic event characterization, crucial for understanding seismic hazards and responding to potentially damaging earthquakes.
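For orientation, the sketch below shows a classical, non-learning baseline for automatic detection: the STA/LTA trigger, which compares a short-term average of signal energy to a long-term average and flags a sample when the ratio jumps. It is not one of the deep learning models discussed in this post and not MCL's method; the function names, window lengths, and threshold are illustrative assumptions.

```python
def sta_lta_ratio(samples, sta_len, lta_len):
    """Return the STA/LTA ratio of signal energy at each sample.

    Both windows end at the current sample. Entries before a full
    long-term window is available are set to 0.0.
    """
    if not 0 < sta_len < lta_len:
        raise ValueError("need 0 < sta_len < lta_len")
    energy = [x * x for x in samples]
    # Prefix sums make every window average an O(1) lookup.
    prefix = [0.0]
    for e in energy:
        prefix.append(prefix[-1] + e)
    ratios = [0.0] * len(samples)
    for i in range(lta_len - 1, len(samples)):
        sta = (prefix[i + 1] - prefix[i + 1 - sta_len]) / sta_len
        lta = (prefix[i + 1] - prefix[i + 1 - lta_len]) / lta_len
        ratios[i] = sta / lta if lta > 0 else 0.0
    return ratios


def first_trigger(samples, sta_len, lta_len, threshold):
    """Return the first sample index whose ratio exceeds threshold, or None."""
    for i, ratio in enumerate(sta_lta_ratio(samples, sta_len, lta_len)):
        if ratio > threshold:
            return i
    return None
```

A fixed threshold of this kind is sensitive to noise level and tends to struggle with emergent S arrivals, which is part of the motivation for learned pickers.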

Deep Learning Approaches

Several deep learning (DL) models have been developed for seismic phase picking, including:

Generalized Phase Detector [1]

PhaseNet [2]

Earthquake Transformer [3]

These models achieve high accuracy in P-phase picking and can accurately identify the arrival of S waves, even when overlapped with the coda of [...]

By Mahtab Movahhedrad | June 2nd, 2024 | News

MCL Research on Point Cloud Surface Reconstruction

Surface reconstruction from point cloud scans plays a pivotal role in 3D vision and graphics, with diverse applications in areas such as AR/VR games, heritage preservation, and indoor/outdoor scene reconstruction. This task is inherently challenging due to the ill-posed nature of reconstructing continuous surfaces from discrete points. Furthermore, real-world point cloud scans introduce several obstacles, such as varying densities, sensor noise, and missing parts. These properties make the problem a long-standing one, continually driving researchers to seek more effective solutions.

Early research focused on constructing watertight objects using combinatorial methods [1][2], which infer the connectivity between points directly. Mainstream surface reconstruction adopts an implicit surface approach [3][4], in which the surface is represented as an unknown continuous function obtained by solving associated partial differential equations (PDEs). Although these methods offer good quality, they are constrained to predicting watertight objects and cannot handle highly distorted LiDAR scenes. Recently, indoor/outdoor scene reconstruction has gained more attention, with deep learning (DL) models [5] demonstrating success on this problem within a supervised learning framework.

Despite their high reconstruction quality, DL models face challenges in generalizability and complexity. In scenarios such as point cloud compression, quality assessment, and dynamic point cloud processing, there is a growing need for low-complexity, low-latency surface reconstruction methods. However, existing DL-based methods often sacrifice simplicity for high reconstruction quality, leaving a gap for low-complexity solutions. We aim to develop a low-complexity, unsupervised few-shot learning method for scene reconstruction.

Building on our previous unsupervised framework (GPSR), we propose an enhanced version to handle non-watertight indoor/outdoor scenes, named Green Point Cloud Surface Reconstruction++ (GPSR++). The main idea involves building an unsigned distance field (UDF) through approximated heat diffusion and optimizing the surface using [...]
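To make the term concrete: an unsigned distance field assigns to every query location its distance to the nearest surface sample, with no inside/outside sign, which is why it suits non-watertight scenes. The sketch below computes one by brute-force nearest-neighbor search on a regular grid. This is a stand-in for illustration only, not the approximated heat diffusion used in GPSR++; the function names and grid layout are assumptions.

```python
import itertools
import math


def unsigned_distance(query, points):
    """Distance from `query` to the nearest point in `points` (never negative)."""
    return min(math.dist(query, p) for p in points)


def udf_grid(points, lo, hi, steps):
    """Evaluate the UDF on a regular steps^3 grid spanning the cube [lo, hi]^3.

    Returns a dict mapping grid indices (i, j, k) to distances. Cost is
    O(steps^3 * len(points)), so this is for illustration only.
    """
    if steps < 2:
        raise ValueError("steps must be at least 2")
    spacing = (hi - lo) / (steps - 1)
    field = {}
    for i, j, k in itertools.product(range(steps), repeat=3):
        query = (lo + i * spacing, lo + j * spacing, lo + k * spacing)
        field[(i, j, k)] = unsigned_distance(query, points)
    return field
```

The surface then lies near the zero level of the field; extracting it from an unsigned field is harder than from a signed one because there is no sign change to locate.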

By Mahtab Movahhedrad | May 26th, 2024 | News

Congratulations to Chengwei Wei for Passing His Defense

Congratulations to Chengwei Wei for passing his defense. Chengwei’s thesis is titled “Syntax-aware natural language processing techniques and their applications.” His Dissertation Committee includes Jay Kuo (Chair), Antonio Ortega, and Swabha Swayamdipta (Outside Member). The Committee members were impressed by the wide range of topics covered in Chengwei’s Ph.D. research. Many thanks to our lab members for participating in his rehearsal and providing valuable feedback. The MCL News team invited Chengwei for a short talk on his thesis and Ph.D. experience. Here is the summary. We thank Chengwei for kindly sharing and wish him all the best on his next journey. A high-level abstract of Chengwei’s thesis is given below:

“Syntax in language processing controls the structure of textual data, playing a crucial role in textual data understanding and generation. For example, syntax in natural language sentences governs the relationships between words, which is crucial for grasping the sentence’s overall meaning. In this thesis, we focus on two primary objectives: 1) Develop efficient methods for constructing syntactic structures. 2) Investigate the significance of syntax and integrate syntax-aware techniques into various Natural Language Processing (NLP) applications, spanning from word-level, sentence-level, document-level, and structured-data-level tasks.”

Chengwei shared his Ph.D. experience at MCL as follows:

I would first like to express my gratitude to Prof. Kuo for his guidance, patience, and unwavering support throughout this journey. His passion for research has been truly inspiring, leading me to join the lab in the summer of 2019 and driving me to pursue excellence in my academic endeavors. I am also thankful for his visionary insights into future research directions, exemplified by our collaborative effort on a survey paper on language models in the summer of 2022. This collaboration proved prescient, [...]

By Mahtab Movahhedrad | May 19th, 2024 | News

Congratulations to MCL Members Attending Ph.D. Hooding Ceremony

Three MCL members attended the Viterbi PhD hooding ceremony on Wednesday, May 8, 2024, in the Bovard Auditorium. They were Zhanxuan Mei, Chengwei Wei and Wei Wang. Congratulations to them for their accomplishments in completing their PhD program at USC!

Zhanxuan Mei received his bachelor’s degree in Electrical Engineering from Beijing Institute of Technology, China, in June 2018. He joined the Media Communications Lab in the summer of 2020. His research interests include image processing and video processing.

Chengwei Wei received his bachelor’s degree from Central South University, China, in June 2018. He joined the Media Communications Lab in the summer of 2019. His research interests include signal processing, natural language processing, and machine learning. His thesis is titled “Syntax-aware Natural Language Processing Techniques and Their Applications”.

Wei Wang received her bachelor’s degree in Applied Physics from Northeastern University (CN) and her MS degree in Materials Engineering from the University of Southern California in 2014 and 2016, respectively. Her research interests include deep learning and image processing.

Congratulations to them all! Let us wish them all the best in the future!

By Mahtab Movahhedrad | May 12th, 2024 | News

Welcome New MCL Member Dingyi Nie

We are happy to welcome a new MCL member, Dingyi Nie, who is joining MCL this semester. Here is a quick interview with Dingyi:

1. Could you briefly introduce yourself and your research interests?

My name is Dingyi Nie. I am a current Master of Science student in Computer Science at USC. I am joining MCL as a research intern starting from April 2024. My research interests mainly include digital signal processing and machine learning, particularly real-world AI. In my spare time, I enjoy music and sports. I’m a keyboard and drum player, and I play soccer and volleyball.

2. What is your impression of MCL and USC?

I got to know Professor Jay Kuo and his MCL from my professor who teaches multimedia systems. I personally have a long-standing interest in multimedia, DSP and machine learning. I find MCL’s focus on Green Learning particularly fascinating because it seamlessly integrates these areas. I am very excited to explore its potential as an emerging tool. I love LA as a city of culture and diversity, and I feel that USC is a good reflection of it. I am excited to interact with all the creative people here.

3. What are your future expectations and plans in MCL?

Starting from the fall semester, I will be working with Yixing Wu on a project exploring Green Learning solutions for irregularly sampled time series modeling. I look forward to building connections with all the members in the lab.

By Mahtab Movahhedrad | May 5th, 2024 | News

MCL Research on Parsing Tree Construction

Syntactic parsing is a natural language processing technique used to analyze the grammatical structure of a sentence. There are two main types of syntactic parsing: dependency parsing and constituency parsing. Fig. 1 shows the parse trees corresponding to dependency parsing and constituency parsing, respectively. Dependency parsing identifies the dependency relationships between the words in a sentence and creates a directed graph representing them. Each word in the sentence is represented as a node in the graph, and the dependency relationships between the words are represented as edges. The edges are labeled with the type of dependency relationship, such as subject, object, or modifier. The resulting graph is called a dependency tree or a dependency graph.

Constituency parsing is the process of analyzing a sentence to identify its syntactic structure and hierarchical organization based on the grammatical rules of a language. In constituency parsing, a sentence is divided into a hierarchy of phrases, each of which has a specific grammatical structure and serves a particular function within the sentence. These phrases are called constituents, and they include noun phrases, verb phrases, adjective phrases, prepositional phrases, and other phrase types.
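The two representations can be sketched as small data structures. The example sentence, class names, and labels below are illustrative assumptions (they are not the sentence in Fig. 1): the dependency graph stores labeled head-to-dependent edges between words, while the constituency tree nests phrases until it reaches individual words.

```python
from dataclasses import dataclass, field


@dataclass
class DependencyGraph:
    """Words are nodes; each labeled edge points from a head to a dependent."""
    words: list
    edges: list = field(default_factory=list)  # (head_idx, dependent_idx, label)

    def add_edge(self, head, dependent, label):
        self.edges.append((head, dependent, label))

    def dependents(self, head):
        return [(self.words[d], label) for h, d, label in self.edges if h == head]


@dataclass
class Constituent:
    """A phrase node; leaves carry a word, internal nodes carry children."""
    label: str
    children: list = field(default_factory=list)
    word: str = None

    def leaves(self):
        if self.word is not None:
            return [self.word]
        return [w for child in self.children for w in child.leaves()]


# Illustrative sentence: "She reads books"
words = ["She", "reads", "books"]
dep = DependencyGraph(words)
dep.add_edge(1, 0, "subject")  # reads -> She
dep.add_edge(1, 2, "object")   # reads -> books

tree = Constituent("S", [
    Constituent("NP", word="She"),
    Constituent("VP", [
        Constituent("V", word="reads"),
        Constituent("NP", word="books"),
    ]),
])
```

Reading the leaves of the constituency tree from left to right recovers the sentence, whereas the dependency graph has exactly one node per word and no phrase nodes.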

In this project, we aim to propose a simple but effective method for constructing constituency parse trees. The constituency parse tree is first converted to a binary tree; an example is shown in Fig. 2. The core idea behind the method is that once we know the interval height between adjacent words, the binarized constituency parse tree can be constructed [1]. Directly predicting the height would require a complex model for precise prediction, so we instead train a binary classifier that compares the heights of the intervals pairwise. Then the height of an [...]
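The construction step can be sketched as a recursive split: the gap with the greatest height inside a span is where that span divides into its left and right subtrees. The function name, the nested-tuple output, and the leftmost tie-break below are assumptions made for illustration; the heights would come from the model rather than being typed in by hand.

```python
def build_binary_tree(words, heights):
    """Rebuild a binarized parse tree from heights of the gaps between words.

    heights[i] is the height of the interval between words[i] and
    words[i + 1]; the highest gap in a span is where that span splits.
    Ties go to the leftmost gap, which is an arbitrary choice here.
    """
    if len(heights) != len(words) - 1:
        raise ValueError("need exactly one height per adjacent word pair")
    if len(words) == 1:
        return words[0]
    split = max(range(len(heights)), key=heights.__getitem__)
    left = build_binary_tree(words[:split + 1], heights[:split])
    right = build_binary_tree(words[split + 1:], heights[split + 1:])
    return (left, right)
```

Only the relative order of the heights affects the result, so any set of values with the same ranking yields the same tree. This is consistent with comparing intervals pairwise rather than regressing exact heights.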

By Mahtab Movahhedrad | April 28th, 2024 | News

MCL Research on Video Camouflaged Object

Camouflaged object detection (COD) aims to identify targets “seamlessly” concealed within their surrounding environment, which makes it more challenging than traditional object detection [1]. In video camouflaged object detection (VCOD), the high intrinsic variation and increased complexity of the scenes pose new obstacles. We previously proposed a green method, termed GreenCOD, that leverages gradient boosting and deep features extracted from pre-trained deep neural networks (DNNs) to detect camouflaged objects efficiently without back-propagation. This quarter, building on the GreenCOD model, we explore how to detect objects in camouflage videos in a lightweight, explainable way.

Inspired by the GreenCOD pipeline, our architecture integrates the EfficientNetB4 backbone in the feature extraction module for each frame. The input frames are first resized to a standard size of 672×672×3 and then processed through the 8-block EfficientNetB4 backbone to extract features at different receptive field sizes. To account for information across these receptive field sizes, all the features are resized and concatenated to form a rich feature set. A hierarchical architecture is implemented in the decision learning module.

XGBoost is then trained on the initial prediction maps and temporal information. Short-term temporal information is captured by extracting the motion among consecutive frames. The motion flow maps are extracted at a higher resolution, followed by neighborhood reconstruction. This approach ensures that each prediction location takes into account the information from a corresponding 4×4 window in the motion map. The initial results on VCOD problems are satisfactory.
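One way to read the 4×4 neighborhood step is sketched below. It assumes the motion map has four times the resolution of the prediction map along each axis, so that every prediction location owns one non-overlapping 4×4 block whose 16 values become extra features. That layout, the function name, and the use of plain nested lists are assumptions for illustration; the post does not specify them.

```python
def window_features(motion_map, window=4):
    """Flatten each non-overlapping window x window block into one feature row.

    motion_map is a 2D list whose height and width are multiples of `window`.
    Returns a 2D grid (one cell per prediction location) of flat feature lists.
    """
    rows, cols = len(motion_map), len(motion_map[0])
    if rows % window or cols % window:
        raise ValueError("map size must be a multiple of the window size")
    grid = []
    for r in range(0, rows, window):
        grid_row = []
        for c in range(0, cols, window):
            # Row-major scan of the block that belongs to this location.
            block = [motion_map[r + dr][c + dc]
                     for dr in range(window) for dc in range(window)]
            grid_row.append(block)
        grid.append(grid_row)
    return grid
```

Each 16-value list could then be appended to the feature vector of its prediction location before the boosted-tree stage.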

[1] Fan, Deng-Ping, et al. “Camouflaged object detection.” Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. [...]

By Mahtab Movahhedrad | April 21st, 2024 | News

MCL Research on Green Learning for Electronic Design Automation (EDA)

Recently, machine learning and AI have been applied to several electronic design automation (EDA) tasks [1], such as performance prediction, decision-making for designs, and automated design. Data-driven optimization provides alternative, approximate solutions to NP-complete problems in EDA. However, there are still several challenges in applying machine learning algorithms to EDA problems. First, because intellectual property (IP) is protected, it is difficult to obtain large public datasets for training. Deep learning frameworks rely on pre-trained models and fine-tuning on small datasets, but this approach demands high computational resources and large model sizes. Second, end-to-end optimization in deep learning is viewed as a black box that lacks interpretability for making decisions in hardware designs. As a result, we aim to develop a green learning algorithm that reduces the demand for large amounts of training data and provides explainable results for EDA problems.

Currently, we propose a green learning architecture to address the IR-drop prediction problem. We parse netlist files into a 2D format and automatically extract features with our green learning framework. Then, we select discriminative features as the input to XGBoost to regress the IR-drop value. We aim to estimate the IR-drop value accurately while keeping the model size and the number of FLOPs small for energy efficiency.

Reference

[1] Huang, Guyue, et al. “Machine learning for electronic design automation: A survey.” ACM Transactions on Design Automation of Electronic Systems (TODAES) 26.5 (2021): 1-46.

By Mahtab Movahhedrad | April 14th, 2024 | News
