Let us hear what she has to say about her defense, along with an abstract of her thesis.
Deep learning techniques use networks with multiple cascaded layers to map inputs to desired outputs. For this mapping to succeed, useful information must be extracted through the layers. Feature extraction and prediction are performed jointly during the mapping, so we do not have direct control over feature extraction. Consequently, some useful information, especially local information, is discarded in the process.
In this thesis, we study local-aware deep learning techniques from four different aspects: 1) local-aware network architecture, 2) local-aware proposal generation, 3) local-aware region analysis, and 4) local-aware supervision.
Specifically, we design a multi-modal attention mechanism for a generative visual dialogue system, which simultaneously attends to multi-modal inputs and uses the extracted local information to generate dialogue responses. We propose a proposal network for a fast face detection system on mobile devices, which detects salient facial parts and uses them as local cues for detecting entire faces. We extract representative fashion features by analyzing local regions, which contain the local fashion details that interest people. We develop a fashion outfit compatibility learning method, which models each outfit as a graph and learns outfit compatibility using both global and local supervision on the graphs.
I would like to thank Prof. Kuo and all the lab members for their help. I have learned a lot through my PhD journey, and I want to share some feelings and experiences. One essential part of doctoral training is mental training, through which I have become more persistent, self-disciplined, and motivated. As this journey may take several years, maintaining a balanced life is very important. I wish the best to all the lab members and hope everyone enjoys life here at MCL.